date:20131223

Re: An unexplained 10% speed-up with gcc-4.8

2013-12-23 Thread Michael Veksler


On 23/12/13 12:40, Michael Veksler wrote:

Hello All,

When I started using gcc-4.8.1 I was glad to observe a substantial
speed-up of about 10% in my code, as compared with gcc-4.7.3.

Usually, switching to newer compilers has a relatively minor effect
and definitely not a 10% speed-up. Was there anything significant
in gcc-4.8.1 which may explain this dramatic improvement?

My code is C++98 which is compiled with profile-driven optimizations,
with -O2. My target is generic 32 bit Intel architecture. The result is run
on Intel Xeon. The application is CPU intensive.


After some more testing, I found out that there is about 12% improvement
even when comparing two executables compiled without profile-driven
optimization.

Unfortunately, the vast improvement is observed only for x86, not for
x86-64. The speed-up on x86-64 is only 2-3%.

Michael.

error in converting macro IS_EXPR_CODE_CLASS() to function

2013-12-23 Thread Prathamesh Kulkarni

IS_EXPR_CODE_CLASS() is called at 18 places within gcc subdirectory,
and except for expr_check(), tree_block(), tree_set_block() all the
other callers pass argument of type enum tree_code_class to
IS_EXPR_CODE_CLASS().

These four callers (expr_check is overloaded) assign value of
TREE_CODE_CLASS() to variable of type char const, and then pass it as
argument to IS_EXPR_CODE_CLASS()

For example: tree_block():
tree
tree_block (tree t)
{
  char const c = TREE_CODE_CLASS (TREE_CODE (t));

  if (IS_EXPR_CODE_CLASS (c))
    return LOCATION_BLOCK (t->exp.locus);
  gcc_unreachable ();
  return NULL;
}

Should type of c be changed to const enum tree_code_class instead
(similarly in other callers) ? Also, TREE_CODE_CLASS()'s value is of
type enum tree_code_class.

This gave a compile-error: invalid conversion from ‘char’ to ‘tree_code_class’
when I changed the macro IS_EXPR_CODE_CLASS() to the following function:

static inline bool
IS_EXPR_CODE_CLASS(enum tree_code_class code_class)
{
  return (code_class >= tcc_reference) && (code_class <= tcc_expression);
}

Thanks and Regards,
Prathamesh
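
The conversion error can be reproduced outside GCC. The following is a minimal sketch, not GCC code: the enum is a stand-in whose enumerator order is assumed to match enum tree_code_class, and is_expr_code_class, caller_with_enum and caller_with_char are hypothetical names. It suggests that keeping the class value in the enum type compiles cleanly in C++, while a caller that holds a char needs an explicit cast.

```cpp
#include <cassert>

// Stand-in for GCC's enum tree_code_class. The enumerator order is an
// assumption made for this sketch.
enum tree_code_class {
  tcc_exceptional,
  tcc_constant,
  tcc_type,
  tcc_declaration,
  tcc_reference,
  tcc_comparison,
  tcc_unary,
  tcc_binary,
  tcc_statement,
  tcc_vl_exp,
  tcc_expression
};

// Function form of the macro: true for classes in the closed range
// [tcc_reference, tcc_expression].
static inline bool
is_expr_code_class (tree_code_class code_class)
{
  return code_class >= tcc_reference && code_class <= tcc_expression;
}

// Hypothetical caller that keeps the value in the enum type, so no
// char -> enum conversion is needed.
static inline bool
caller_with_enum (tree_code_class from_tree_code)
{
  const tree_code_class c = from_tree_code;
  return is_expr_code_class (c);
}

// Hypothetical caller that holds a char. C++ has no implicit char -> enum
// conversion, so an explicit cast is required; it is only meaningful when
// the char really holds an enumerator value.
static inline bool
caller_with_char (char c)
{
  return is_expr_code_class (static_cast<tree_code_class> (c));
}
```

Passing the char directly to is_expr_code_class is what triggers the "invalid conversion" diagnostic; changing the local variable's type, as in caller_with_enum, avoids the cast altogether.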

Re: Remove spam in GCC mailing list

2013-12-23 Thread Tae Wong

Someone on Launchpad has suspended ~seotaewong40 as a spammer.

Please enable the account ~seotaewong40.

Log off in Launchpad and email information will be removed.
Log on to Launchpad and email information will be added.

Launchpad has a facility that replaces all email addresses with email
address hidden.

-- 
Tae-Wong Seo
Korea, Republic of

[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294

2013-12-23 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug target/59573] aarch64: commit 07ca5686e64 broken glibc-2.17

2013-12-23 Thread yvan.roux at linaro dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59573

Yvan Roux yvan.roux at linaro dot org changed:

   What|Removed |Added

 CC||yvan.roux at linaro dot org

--- Comment #5 from Yvan Roux yvan.roux at linaro dot org ---

> Yes, I've tried with foundation_v8, and not only is it extremely slow, but it
> also fails here. Compiling gcc in qemu takes 5 hours, but takes one week
> (someone told me) in the foundation model.
>
> For another simulator, do you have any suggestion?

Is the foundation model failing for the same reason here (i.e. not recognizing
the cmeq instruction)?

[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294

2013-12-23 Thread bmei at broadcom dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #8 from Bingfeng Mei bmei at broadcom dot com ---
Sorry for the regression. The assertion happens if storing a constant value
with negative step. Doing permutation of constant is not the best optimization
here. So the easy way to fix is to skip vectorizing this statement in the same
way as before the patch. Or maybe better way is to form a constant vector to
store.

[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294

2013-12-23 Thread bmei at broadcom dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #9 from Bingfeng Mei bmei at broadcom dot com ---
Seems simple patch is to just bypass permutation for constant operand as
vec_oprnd is a constant vector with identical elements.

Index: tree-vect-stmts.c
===================================================================
--- tree-vect-stmts.c   (revision 206176)
+++ tree-vect-stmts.c   (working copy)
@@ -5353,7 +5353,8 @@ vectorizable_store (gimple stmt, gimple_
                set_ptr_info_alignment (get_ptr_info (dataref_ptr), align,
                                        misalign);
 
-         if (negative)
+         if (negative
+             && !CONSTANT_CLASS_P (gimple_assign_rhs1 (stmt)))
            {
              tree perm_mask = perm_mask_for_reverse (vectype);
              tree perm_dest

[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294

2013-12-23 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org ---
(In reply to Bingfeng Mei from comment #9)
> Seems simple patch is to just bypass permutation for constant operand as
> vec_oprnd is a constant vector with identical elements.
>
> Index: tree-vect-stmts.c
> ===================================================================
> --- tree-vect-stmts.c   (revision 206176)
> +++ tree-vect-stmts.c   (working copy)
> @@ -5353,7 +5353,8 @@ vectorizable_store (gimple stmt, gimple_
>                 set_ptr_info_alignment (get_ptr_info (dataref_ptr), align,
>                                         misalign);
>
> -         if (negative)
> +         if (negative
> +             && !CONSTANT_CLASS_P (gimple_assign_rhs1 (stmt)))
>             {
>               tree perm_mask = perm_mask_for_reverse (vectype);
>               tree perm_dest

I think checking dt == vect_constant_def || dt == vect_external_def would be
more appropriate.  But, IMNSHO you don't need to check at the analysis phase
!perm_mask_for_reverse (vectype)
either.
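
The reasoning behind skipping the permutation can be illustrated with a small model. This is only a sketch under stated assumptions, not vectorizer code: def_kind, reverse_lanes and vector_to_store are hypothetical names, and the model assumes that a constant or loop-external operand is vectorized as a vector whose lanes are all copies of one value. Reversing such a vector leaves it unchanged, so the reverse permutation needed for a negative step can be bypassed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the kinds of operand definition discussed in the
// comment (constant, loop-external, or computed inside the loop).
enum def_kind { constant_def, external_def, internal_def };

// Reverse the lanes, as a reverse permutation would for a negative step.
static std::vector<int>
reverse_lanes (std::vector<int> v)
{
  std::reverse (v.begin (), v.end ());
  return v;
}

// Build the vector to store. The permutation is skipped when the operand is
// assumed to be a splat (constant or external definition), because reversing
// identical lanes is a no-op.
static std::vector<int>
vector_to_store (const std::vector<int> &v, def_kind dt, bool negative_step)
{
  if (negative_step && dt != constant_def && dt != external_def)
    return reverse_lanes (v);
  return v;
}
```

For a splat such as {7, 7, 7, 7} the skipped and the permuted results are identical, which is why bypassing the permutation does not change what is stored.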

[Bug bootstrap/59583] New: --enable-targets=all --with-cpu=broadwell isn't allowed to configure i686-linux

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59583

Bug ID: 59583
   Summary: --enable-targets=all --with-cpu=broadwell isn't
allowed to configure i686-linux
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com

I got

# /export/gnu/import/git/gcc/configure --enable-clocale=gnu --with-system-zlib
--with-demangler-in-ld  --enable-shared i686-linux --prefix=/usr/gcc-4.9.0
--with-local-prefix=/usr/local --enable-targets=all --with-cpu=broadwell
--with-fpmath=sse
...
# make bootstrap
...
Unsupported CPU used in --with-cpu=broadwell, supported values:
generic intel atom slm core2 corei7 corei7-avx nocona x86-64 bdver4 bdver3
bdver2 bdver1 btver2 btver1 amdfam10 barcelona k8 opteron athlon64 athlon-fx
athlon64-sse3 k8-sse3 opteron-sse3
make[3]: *** [configure-stage1-gcc] Error 1
make[3]: Leaving directory `/export/build/gnu/gcc-test-32bit/build-i686-linux'
make[2]: *** [stage1-bubble] Error 2
make[2]: Leaving directory `/export/build/gnu/gcc-test-32bit/build-i686-linux'
make[1]: *** [bootstrap] Error 2

[Bug c++/59111] [4.9 Regression] [c++11] ICE on invalid usage of auto in return type

2013-12-23 Thread mpolacek at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59111

--- Comment #3 from Marek Polacek mpolacek at gcc dot gnu.org ---
Author: mpolacek
Date: Mon Dec 23 12:14:56 2013
New Revision: 206177

URL: http://gcc.gnu.org/viewcvs?rev=206177&root=gcc&view=rev
Log:
PR c++/59111
cp/
* search.c (lookup_conversions): Return NULL_TREE if !CLASS_TYPE_P.
testsuite/
* g++.dg/cpp0x/pr59111.C: New test.
* g++.dg/cpp1y/pr59110.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/pr59111.C
trunk/gcc/testsuite/g++.dg/cpp1y/pr59110.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/search.c
trunk/gcc/testsuite/ChangeLog

[Bug lto/59582] LTO discards symbol that defined as weak elsewhere

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59582

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2013-12-23
 Ever confirmed|0   |1

--- Comment #1 from H.J. Lu hjl.tools at gmail dot com ---
Please try binutils 2.24.

[Bug rtl-optimization/57422] [4.9 Regression] ICE: SIGSEGV in dominated_by_p with custom flags

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57422

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #7 from H.J. Lu hjl.tools at gmail dot com ---
Please add the testcase.

[Bug rtl-optimization/57422] [4.9 Regression] ICE: SIGSEGV in dominated_by_p with custom flags

2013-12-23 Thread abel at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57422

Andrey Belevantsev abel at gcc dot gnu.org changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Andrey Belevantsev abel at gcc dot gnu.org ---
See the thread in gcc-patches: the test does not make sense as it is very
sensitive to the scheduler decisions -- even now I had to use the exact
reported revision to get the failure.  I have added extra asserts in a
separate commit instead.

[Bug c++/59111] [4.9 Regression] [c++11] ICE on invalid usage of auto in return type

2013-12-23 Thread mpolacek at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59111

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot gnu.org

--- Comment #4 from Marek Polacek mpolacek at gcc dot gnu.org ---
Fixed.

[Bug middle-end/59584] New: [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

Bug ID: 59584
   Summary: [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by
Don't reject TER unnecessarily
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hp at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Host: x86_64-unknown-linux-gnu
Target: cris-axis-elf

This test previously passed, now it fails.
A patch in the revision range (last_known_working:first_known_failing)
206008:206011 exposed or caused this regression.  Since then it fails as
follows:

Running /tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/dg.exp ...
...
FAIL: gcc.dg/pr50251.c (internal compiler error)
FAIL: gcc.dg/pr50251.c (test for excess errors)


In gcc.log:
Executing on host: /tmp/hpautotest-gcc1/cris-elf/gccobj/gcc/xgcc
-B/tmp/hpautotest-gcc1/cris-elf/gccobj/gcc/
/tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/pr50251.c 
-fno-diagnostics-show-caret -fdiagnostics-color=never   -O2 -S   -isystem
/tmp/hpautotest-gcc1/cris-elf/gccobj/cris-elf/./newlib/targ-include -isystem
/tmp/hpautotest-gcc1/gcc/newlib/libc/include  -o pr50251.s(timeout = 300)
/tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/pr50251.c: In function 'main':
/tmp/hpautotest-gcc1/gcc/gcc/testsuite/gcc.dg/pr50251.c:18:1: internal compiler
error: in fixup_args_size_notes, at expr.c:3978
0x698221 fixup_args_size_notes(rtx_def*, rtx_def*, int)
/tmp/hpautotest-gcc1/gcc/gcc/expr.c:3978
0x67aef9 try_split(rtx_def*, rtx_def*, int)
/tmp/hpautotest-gcc1/gcc/gcc/emit-rtl.c:3602
0x886e61 split_insn
/tmp/hpautotest-gcc1/gcc/gcc/recog.c:2850
0x887104 split_all_insns()
/tmp/hpautotest-gcc1/gcc/gcc/recog.c:2940
0x8871d2 rest_of_handle_split_after_reload
/tmp/hpautotest-gcc1/gcc/gcc/recog.c:3889
0x8871d2 execute
/tmp/hpautotest-gcc1/gcc/gcc/recog.c:3918
Please submit a full bug report,
with preprocessed source if appropriate.

(as the test-case is without preprocessing directives no such action necessary)

A few more hints from gdb show that gcc ties itself in a knot when splitting:
 (set (reg/f:SI 14 sp) (mem/f/c:SI (symbol_ref:SI ("p")))
into:
(gdb) call debug_rtx_range (seq, 0)
(insn 33 0 34 (set (reg/f:SI 14 sp)
(symbol_ref:SI ("p") <var_decl 0x77eb2000 p>)) -1
 (nil))

(insn 34 33 0 (set (reg/f:SI 14 sp)
(mem/f/c:SI (reg/f:SI 14 sp) [2 p+0 S4 A8])) -1
 (expr_list:REG_ARGS_SIZE (const_int 0 [0])
(nil)))

(nil)

While this define_split has a bug (by matching sp, allowing to set the stack
temporarily in an inconsistent state by using sp as a temporary for the
symbol), I doubt that's the actual bug causing internal inconsistency within
gcc.  Anyway:

(gdb) r -fpreprocessed pr50251.i -melf -quiet -dumpbase pr50251.c
-auxbase-strip pr50251.s -O2 -version -fno-diagnostics-show-caret
-fdiagnostics-color=never -o pr50251.s
GNU C (GCC) version 4.9.0 20131223 (experimental) [trunk revision 206176]
(cris-elf)
compiled by GNU C version 4.4.4 20100630 (Red Hat 4.4.4-10), GMP
version 4.3.0, MPFR version 2.4.1, MPC version 0.8
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C (GCC) version 4.9.0 20131223 (experimental) [trunk revision 206176]
(cris-elf)
compiled by GNU C version 4.4.4 20100630 (Red Hat 4.4.4-10), GMP
version 4.3.0, MPFR version 2.4.1, MPC version 0.8
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: cc4b37aa04284e09676146c2c3d35a20

Breakpoint 1, fancy_abort (file=0xd4f878 "/tmp/hpautotest-gcc1/gcc/gcc/expr.c",
line=3978, 
function=0xd50d50 "fixup_args_size_notes") at
/tmp/hpautotest-gcc1/gcc/gcc/diagnostic.c:1182
1182    {
Missing separate debuginfos, use: debuginfo-install glibc-2.11.1-1.x86_64
libgcc-4.4.4-10.fc12.x86_64 libstdc++-4.4.4-10.fc12.x86_64
(gdb) up
#1  0x00698222 in fixup_args_size_notes (prev=0x0, last=<value
optimized out>, 
end_args_size=<value optimized out>) at
/tmp/hpautotest-gcc1/gcc/gcc/expr.c:3978
3978      gcc_assert (!saw_unknown);
(gdb) p prev
$1 = (rtx_def *) 0x0
(gdb) p last
$2 = <value optimized out>
(gdb) up
#2  0x0067aefa in try_split (pat=<value optimized out>,
trial=0x77ea47e0, last=1)
at /tmp/hpautotest-gcc1/gcc/gcc/emit-rtl.c:3602
3602      fixup_args_size_notes (NULL_RTX, insn_last, INTVAL (XEXP
(note, 0)));
(gdb) p insn_last
$3 = (rtx_def *) 0x77ea4c60
(gdb) p note
$4 = (rtx_def *) 0x77ea2df8
(gdb) pr
(expr_list:REG_ARGS_SIZE (const_int 0 [0])
(nil))
(gdb) call debug_rtx_range ($3, 0)
(insn 34 33 0 (set (reg/f:SI 14 sp)
(mem/f/c:SI (reg/f:SI 14 sp) [2 p+0 S4 A8])) -1
 (expr_list:REG_ARGS_SIZE (const_int 0 [0])
(nil)))

(nil)

(gdb) bt
#0  fancy_abort

[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
Are you sure it didn't fail before r205026 as well, because what my patch did
was essentially restore the old behavior unless strictly necessary (then it
would keep the r205026+ behavior).

[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #2 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #1)
> Are you sure it didn't fail before r205026 as well, because what my patch
> did was essentially restore the old behavior unless strictly necessary (then
> it would keep the r205026+ behavior).

Sounds like you have a good grip on the circumstances. :)
There was no reason to check for earlier failure ranges, but it certainly
failed before and with r205023, started passing with r205046 up until as noted.
So, I guess this will be a low-priority PR, particularly as it uses an odd
builtin-construct very unlikely to be seen in user code - not to mention it
will also be hidden behind a target-specific fix.

[Bug target/59573] aarch64: commit 07ca5686e64 broken glibc-2.17

2013-12-23 Thread dennis.yxun at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59573

--- Comment #6 from Dennis Lan (dlan) dennis.yxun at gmail dot com ---
> (In reply to Yvan Roux from comment #5)
> is the foundation model failing for the same reason here (i.e. not
> recognizing the cmeq instruction) ?

Not exactly, the foundation_v8 aborted while compiling gcc,
and yes, it does recognize the cmeq instruction.

To clarify, the former gcc build log[1] I provided was generated in qemu, which
has *no* cmeq support.

I do have a patch[2] for qemu which implements cmeq support (I tested it and
it passed); it would be good if anyone could review those patches[3].

With the qemu which implements cmeq, the glibc compilation and install
succeed, but with the new glibc, gcc fails to build an executable
image[4].


[1] http://gcc.gnu.org/bugzilla/attachment.cgi?id=31498
[2]
https://github.com/dlanx/qemu/commit/1a9b3a40917c416125f10accba9e531ed91677d4
[3] git://github.com/dlanx/qemu (branch aarch64-1.6, top four patches)

[4] following output from qemu with cmeq implemented
(202940) insn # gcc -v
Using built-in specs.
COLLECT_GCC=/usr/aarch64-unknown-linux-gnu/gcc-bin/4.9.0-pre/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-4.9.0_pre/work/gcc-4.9.0-9999/configure
--prefix=/usr
--bindir=/usr/aarch64-unknown-linux-gnu/gcc-bin/4.9.0-pre
--includedir=/usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/include
--datadir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre
--mandir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre/man
--infodir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre/info
--with-gxx-include-dir=/usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/include/g++-v4
--host=aarch64-unknown-linux-gnu --build=aarch64-unknown-linux-gnu
--disable-altivec --disable-fixed-point --without-cloog --disable-lto
--enable-nls --without-included-gettext --with-system-zlib --enable-obsolete
--disable-werror --enable-secureplt --disable-multilib --disable-libmudflap
--disable-libssp --enable-libgomp
--with-python-dir=/share/gcc-data/aarch64-unknown-linux-gnu/4.9.0-pre/python
--enable-checking=release --disable-libgcj --enable-libstdcxx-time
--enable-languages=c,c++,fortran --enable-shared --enable-threads=posix
--enable-__cxa_atexit --$nable-clocale=gnu
--with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.9.0_pre'
Thread model: posix
gcc version 4.9.0-pre 20130926 (experimental) commit
07ca5686e64d32f7df4ccf4205d0b914f120da5e (Gentoo 4.9.0_pre)

(202940) insn # cat cmeq_test.c
#include <stdio.h>
#include <stdlib.h>

long long fn(long long val)
{
    asm volatile(
        "fmov   d0, x0\n\t"
        "cmeq   d0, d0, #0\n\t"
        "fmov   x0, d0\n\t"
    );
}

int main(int argc, char *argv[])
{
    long long v = strtoul(argv[1], NULL, 0);
    printf("result: 0x%lx, 0x%lx\n", v, fn(v));
    return 0;
}
(202940) insn # ./cmeq_test 1
result: 0x1, 0x0
(202940) insn # ./cmeq_test 0
result: 0x0, 0x
(202940) insn # ./cmeq_test 0x00
result: 0x00, 0x0
(202940) insn # gcc -o mytest_v4 mytest_v4.c

/usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0-pre/../../../../aarch64-unknown-linux-gnu/bin/ld:
error: Cannot change output format whilst linking AArch64 binaries.
collect2: error: ld returned 1 exit status

(the above cmeq_test was built with sane gcc - with 07ca5686e64 reverted)
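
For readers without an AArch64 target at hand, the expected behaviour of the test can be modelled portably. This is a sketch, not part of the report: cmeq_zero_ref is a hypothetical name, and the model assumes the scalar compare-against-zero form of cmeq, which produces an all-ones 64-bit result when the input is zero and zero otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Portable reference model of the scalar "cmeq d0, d0, #0" used in
// cmeq_test.c: all ones when the input equals zero, zero otherwise.
static inline std::uint64_t
cmeq_zero_ref (std::uint64_t value)
{
  return value == 0 ? ~std::uint64_t (0) : std::uint64_t (0);
}
```

Under this model, an input of 1 gives 0, which agrees with the first run shown above, and an input of 0 gives a value with every bit set.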

[Bug sanitizer/59585] Tests failing due to trailing newline

2013-12-23 Thread y.gribov at samsung dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59585

--- Comment #1 from Yury Gribov y.gribov at samsung dot com ---
Created attachment 31503
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31503&action=edit
Draft patch

[Bug sanitizer/59585] New: Tests failing due to trailing newline

2013-12-23 Thread y.gribov at samsung dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59585

Bug ID: 59585
   Summary: Tests failing due to trailing newline
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: y.gribov at samsung dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org,
mpolacek at gcc dot gnu.org, tetra2005 at gmail dot com,
v.garbuzov at samsung dot com
  Host: x86_64-unknown-linux-gnu
Target: arm-v7a15-linux-gnueabi

Created attachment 31502
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31502&action=edit
Log file

Hi folks,

I've tested ubsan in cross-gcc on an ARM platform and got a series of similar
errors:

FAIL: c-c++-common/ubsan/div-by-zero-1.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  \
output pattern test, is
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:11:5:
runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:12:5:
runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:13:5:
runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:14:5:
runtime error: division by zero
/home/ygribov/gcc/gcc-master/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c:15:5:
runtime error: division by zero, should match division by zero(
|^M
|^M)[^
^M]*division by zero(
|^M
|^M)[^
^M]*division by zero(
|^M
|^M)[^
^M]*division by zero(
|^M
|^M)[^
^M]*division by zero(
|^M
|^M)

Extract from log file attached.
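
One plausible reading of the failure, offered here as an assumption rather than as the content of the draft patch, is that the expected pattern requires a line ending after every message while the captured output has none after the last one. The sketch below uses std::regex to show that mismatch; matches_with_newline and matches_with_optional_newline are hypothetical helpers, and the message text is assumed to contain no regex metacharacters.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: does OUTPUT contain MESSAGE followed by a line ending
// (LF, CRLF or CR), as the expected pattern in the log requires?
static bool
matches_with_newline (const std::string &output, const std::string &message)
{
  const std::regex pattern (message + "(\n|\r\n|\r)");
  return std::regex_search (output, pattern);
}

// Same check, but also accepting the end of the output in place of the
// final line ending.
static bool
matches_with_optional_newline (const std::string &output,
                               const std::string &message)
{
  const std::regex pattern (message + "(\n|\r\n|\r|$)");
  return std::regex_search (output, pattern);
}
```

With the strict pattern, an output that ends in "division by zero" without a trailing newline does not match, while the relaxed pattern accepts it. Whether the test patterns or the output capture should change is a separate question that this sketch does not settle.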

[Bug middle-end/59569] [4.9 Regression] r206148 causes internal compiler error: in vect_create_destination_var, at tree-vect-data-refs.c:4294

2013-12-23 Thread meibf at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569

--- Comment #11 from meibf at gcc dot gnu.org ---
Author: meibf
Date: Mon Dec 23 15:07:58 2013
New Revision: 206179

URL: http://gcc.gnu.org/viewcvs?rev=206179&root=gcc&view=rev
Log:
2013-12-23  Bingfeng Mei  b...@broadcom.com

PR middle-end/59569
* tree-vect-stmts.c (vectorizable_store): Skip permutation for
constant operand, and add a few missing \n.

* gcc.c-torture/compile/pr59569-1.c: New test.
* gcc.c-torture/compile/pr59569-2.c: Ditto.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr59569-1.c
trunk/gcc/testsuite/gcc.c-torture/compile/pr59569-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-stmts.c

[Bug fortran/59577] OpenMP: ICE with type(c_ptr) in private()

2013-12-23 Thread 06needhamt at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59577

--- Comment #1 from Thomas Needham 06needhamt at gmail dot com ---
Also occurs in version 4.8.2

[Bug fortran/59577] OpenMP: ICE with type(c_ptr) in private()

2013-12-23 Thread dominiq at lps dot ens.fr

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59577

Dominique d'Humieres dominiq at lps dot ens.fr changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-23
 Ever confirmed|0   |1

--- Comment #2 from Dominique d'Humieres dominiq at lps dot ens.fr ---
> Also occurs in version 4.8.2

And all versions I have tested down to 4.3.1.

[Bug c++/41090] [4.7/4.8/4.9 Regression] Using static label reference in c++ class constructor produces wrong code

2013-12-23 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41090

--- Comment #19 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Mon Dec 23 17:49:47 2013
New Revision: 206182

URL: http://gcc.gnu.org/viewcvs?rev=206182&root=gcc&view=rev
Log:
PR c++/41090
Add -fdeclone-ctor-dtor.
gcc/cp/
* optimize.c (can_alias_cdtor, populate_clone_array): Split out
from maybe_clone_body.
(maybe_thunk_body): New function.
(maybe_clone_body): Call it.
* mangle.c (write_mangled_name): Remove code to suppress
writing of mangled name for cloned constructor or destructor.
(write_special_name_constructor): Handle decloned constructor.
(write_special_name_destructor): Handle decloned destructor.
* method.c (trivial_fn_p): Handle decloning.
* semantics.c (expand_or_defer_fn_1): Clone after setting linkage.
gcc/c-family/
* c.opt: Add -fdeclone-ctor-dtor.
* c-opts.c (c_common_post_options): Default to on iff -Os.
gcc/
* cgraph.h (struct cgraph_node): Add calls_comdat_local.
(symtab_comdat_local_p, symtab_in_same_comdat_p): New.
* cif-code.def: Add USES_COMDAT_LOCAL.
* symtab.c (verify_symtab_base): Make sure we don't refer to a
comdat-local symbol from outside its comdat.
* cgraph.c (verify_cgraph_node): Likewise.
* cgraphunit.c (mark_functions_to_output): Don't mark comdat-locals.
* ipa.c (symtab_remove_unreachable_nodes): Likewise.
(function_and_variable_visibility): Handle comdat-local fns.
* ipa-cp.c (determine_versionability): Don't clone comdat-locals.
* ipa-inline-analysis.c (compute_inline_parameters): Update
calls_comdat_local.
* ipa-inline-transform.c (inline_call): Likewise.
(save_inline_function_body): Don't clear DECL_COMDAT_GROUP.
* ipa-inline.c (can_inline_edge_p): Check calls_comdat_local.
* lto-cgraph.c (input_overwrite_node): Read calls_comdat_local.
(lto_output_node): Write it.
* symtab.c (symtab_dissolve_same_comdat_group_list): Clear
DECL_COMDAT_GROUP for comdat-locals.
include/
* demangle.h (enum gnu_v3_ctor_kinds):
Added literal gnu_v3_unified_ctor.
(enum gnu_v3_ctor_kinds):
Added literal gnu_v3_unified_dtor.
libiberty/
* cp-demangle.c (cplus_demangle_fill_ctor,cplus_demangle_fill_dtor):
Handle unified ctor/dtor.
(d_ctor_dtor_name): Handle unified ctor/dtor.

Added:
trunk/gcc/testsuite/g++.dg/ext/label13a.C
trunk/gcc/testsuite/g++.dg/opt/declone1.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-opts.c
trunk/gcc/c-family/c.opt
trunk/gcc/cgraph.c
trunk/gcc/cgraph.h
trunk/gcc/cgraphunit.c
trunk/gcc/cif-code.def
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/decl.c
trunk/gcc/cp/mangle.c
trunk/gcc/cp/method.c
trunk/gcc/cp/optimize.c
trunk/gcc/cp/semantics.c
trunk/gcc/doc/invoke.texi
trunk/gcc/ipa-cp.c
trunk/gcc/ipa-inline-analysis.c
trunk/gcc/ipa-inline-transform.c
trunk/gcc/ipa-inline.c
trunk/gcc/ipa.c
trunk/gcc/lto-cgraph.c
trunk/gcc/symtab.c
trunk/include/ChangeLog
trunk/include/demangle.h
trunk/libiberty/ChangeLog
trunk/libiberty/cp-demangle.c

[Bug c++/41090] [4.7/4.8/4.9 Regression] Using static label reference in c++ class constructor produces wrong code

2013-12-23 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41090

Jason Merrill jason at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|4.8.3   |4.9.0

--- Comment #20 from Jason Merrill jason at gcc dot gnu.org ---
Fixed for 4.9.

[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org ---
So is this actually a regression then (I mean, has it worked in 4.8 or 4.7
etc.)?

[Bug c++/59349] [4.9 Regression] ICE on invalid: Segmentation fault toplev.c:336

2013-12-23 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59349

Jason Merrill jason at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||jason at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug tree-optimization/59586] New: Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops

2013-12-23 Thread chaosgate at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586

Bug ID: 59586
   Summary: Segmentation fault with -Ofast -floop-parallelize-all
-ftree-parallelize-loops
   Product: gcc
   Version: 4.8.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: chaosgate at gmail dot com

Created attachment 31504
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31504&action=edit
Test case

This code results in the following segfault when compiled with
gfortran -o /dev/null -c -Ofast -floop-parallelize-all
-ftree-parallelize-loops=1 -fopenmp t3.f


t3.f: In function ‘subsm’:
t3.f:1:0: internal compiler error: Segmentation fault
   subroutine subsm ( n, x, xp, xx)
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.



gcc -v:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/paludis/build/sys-devel-gcc-4.8.2/work/gcc-4.8.2/configure
--prefix=/usr --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu
--mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share
--sysconfdir=/etc --localstatedir=/var/lib --disable-silent-rules
--enable-fast-install --libdir=/usr/lib64 --cache-file=config.cache
--libdir=/usr/lib64 --with-pkgversion='exherbo gcc-4.8.2' --program-suffix=-4.8
--disable-bootstrap --enable-clocale=gnu --enable-languages=c,c++,fortran,java
--enable-lto --disable-multilib --enable-nls --enable-serial-configure
--enable-libquadmath --enable-libquadmath-support --with-cloog --enable-libgomp
--disable-libobjc --disable-libssp --with-as=x86_64-pc-linux-gnu-as
--with-ld=x86_64-pc-linux-gnu-ld --with-system-zlib
Thread model: posix
gcc version 4.8.2 (exherbo gcc-4.8.2)

[Bug tree-optimization/59586] Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops

2013-12-23 Thread dominiq at lps dot ens.fr

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586

Dominique d'Humieres dominiq at lps dot ens.fr changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-12-23
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres dominiq at lps dot ens.fr ---
Confirmed for 4.8.2 and trunk with -O3 -ffast-math -floop-parallelize-all, but
no ICE for 4.5.4, 4.6.4, and 4.7.3. Likely a 4.8/4.9 regression. I'll try to
bisect when I find some time.

[Bug target/59587] New: cpu_names in i386.c is accessed with wrong index

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59587

Bug ID: 59587
   Summary: cpu_names in i386.c is accessed with wrong index
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: ubizjak at gmail dot com

i386.c has

static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
{
  "generic",
...
  "btver2"
};
...
  if (!opts->x_ix86_tune_string)
    {
      opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
      ix86_tune_defaulted = 1;
    }
...
  fprintf (file, "%*sarch = %d (%s)\n",
           indent, "",
           ptr->arch,
           ((ptr->arch < TARGET_CPU_DEFAULT_max)
            ? cpu_names[ptr->arch]
            : "<unknown>"));

  fprintf (file, "%*stune = %d (%s)\n",
           indent, "",
           ptr->tune,
           ((ptr->tune < TARGET_CPU_DEFAULT_max)
            ? cpu_names[ptr->tune]
            : "<unknown>"));

But ptr->arch and ptr->tune are set by

  ptr->arch = ix86_arch;
  ptr->schedule = ix86_schedule;
  ptr->tune = ix86_tune;

ix86_arch is set by

/* Which instruction set architecture to use.  */
enum processor_type ix86_arch;

ix86_arch = processor_alias_table[i].processor;

and ix86_tune is set by

/* Which cpu are we optimizing for.  */
enum processor_type ix86_tune;

ix86_tune = processor_alias_table[i].processor;

We are using enum processor_type as index to access array of
enum target_cpu_default

enum target_cpu_default
{
  TARGET_CPU_DEFAULT_generic = 0, 
...
  TARGET_CPU_DEFAULT_max
};

x86 backend only uses TARGET_CPU_DEFAULT_generic to set up the
default tuning:

#ifndef TARGET_CPU_DEFAULT
#define TARGET_CPU_DEFAULT TARGET_CPU_DEFAULT_generic
#endif
...
  if (!opts->x_ix86_tune_string)
    {
      opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
      ix86_tune_defaulted = 1;
    }

We never define a different TARGET_CPU_DEFAULT.  When GCC is
configured with --with-arch=/--with-cpu=, we have

[hjl@gnu-6 build-x86_64-linux]$ cat gcc/configargs.h 
/* Generated automatically. */
static const char configuration_arguments[] =
"/export/gnu/import/git/gcc/configure --enable-languages=c,c++,fortran
--disable-bootstrap --prefix=/usr/gcc-4.9.0 --with-local-prefix=/usr/local
--enable-gnu-indirect-function --with-fpmath=sse";
static const char thread_model[] = "posix";

static const struct {
  const char *name, *value;
} configure_default_options[] = { { "cpu", "generic" }, { "arch", "x86-64" } };
[hjl@gnu-6 build-x86_64-linux]$ 

which passes -march=/-mtune= to toplev.c.
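
To make the mismatch concrete, here is a minimal sketch in C. It is not GCC
code: every name ending in _sketch is invented for illustration, and both
enums are shortened to a few entries. It only shows what can happen when a
name table ordered by one enum is indexed with values of a differently
ordered enum, as described above.

```c
#include <stddef.h>

/* Hypothetical stand-in for enum target_cpu_default (shortened).  */
enum target_cpu_default_sketch
{
  CPU_DEFAULT_SKETCH_generic = 0,
  CPU_DEFAULT_SKETCH_core2,
  CPU_DEFAULT_SKETCH_btver2,
  CPU_DEFAULT_SKETCH_max
};

/* Hypothetical stand-in for enum processor_type (shortened).  Note the
   extra i386 entry, which shifts every later value by one.  */
enum processor_type_sketch
{
  PROC_SKETCH_generic = 0,
  PROC_SKETCH_i386,
  PROC_SKETCH_core2,
  PROC_SKETCH_btver2,
  PROC_SKETCH_max
};

/* Table ordered by target_cpu_default_sketch.  */
static const char *const cpu_default_names_sketch[CPU_DEFAULT_SKETCH_max] =
{
  "generic", "core2", "btver2"
};

/* Table ordered by processor_type_sketch.  */
static const char *const processor_names_sketch[PROC_SKETCH_max] =
{
  "generic", "i386", "core2", "btver2"
};

/* The pattern from the report: a processor_type value indexes the table
   that belongs to the other enum.  */
const char *
name_via_cpu_default_table (enum processor_type_sketch p)
{
  return ((int) p < (int) CPU_DEFAULT_SKETCH_max
          ? cpu_default_names_sketch[p]
          : "<unknown>");
}

/* One possible remedy, assumed here: index a table ordered by the same
   enum as the value.  */
const char *
name_via_processor_table (enum processor_type_sketch p)
{
  return ((int) p < (int) PROC_SKETCH_max
          ? processor_names_sketch[p]
          : "<unknown>");
}
```

In this sketch name_via_cpu_default_table (PROC_SKETCH_core2) yields the
name of a different processor, and PROC_SKETCH_btver2 falls through to the
fallback string even though it is a valid value.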

[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #4 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #3)
> So is this actually a regression then (I mean, has it worked in 4.8 or 4.7
> etc.)?

That's not the definition.  At one point it worked on trunk (4.9), thus it's a
regression.

[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #5 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
The actual bug causing the ICE is that the combination of
expr.c:find_args_size_adjust and expr.c:fixup_args_size_notes
can't handle a define_split matching for the stack-adjustment assignment
instruction emitted by __builtin_stack_restore.

I'm going to mark my commit for the CRIS port with this PR number (since it
fixes the regression per se), but it will just remove the define_split part
happening for the CRIS port; the bug is still there so the PR should not be
closed.  Though, I'll change the title.

[Bug target/59587] cpu_names in i386.c is accessed with wrong index

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59587

--- Comment #1 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 31505
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31505&action=edit
A patch

I am testing this patch.

[Bug middle-end/59584] [4.9 Regression]: gcc.dg/pr50251.c ICE exposed by Don't reject TER unnecessarily

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

--- Comment #6 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
Author: hp
Date: Mon Dec 23 22:33:52 2013
New Revision: 206187

URL: http://gcc.gnu.org/viewcvs?rev=206187&root=gcc&view=rev
Log:
PR middle-end/59584
* config/cris/predicates.md (cris_nonsp_register_operand):
New define_predicate.
* config/cris/cris.md: Replace register_operand with
cris_nonsp_register_operand for destinations in all
define_splits where a register is set more than once.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/cris/cris.md
trunk/gcc/config/cris/predicates.md

[Bug middle-end/59584] [4.9 Regression]: cannot handle define_split for insn emitted for __builtin_stack_restore

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59584

Hans-Peter Nilsson hp at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P3  |P5
Summary|[4.9 Regression]:   |[4.9 Regression]: cannot
   |gcc.dg/pr50251.c ICE|handle define_split for
   |exposed by Don't reject|insn emitted for
   |TER unnecessarily  |__builtin_stack_restore

[Bug target/59588] New: Odd codes in ix86_option_override_internal

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59588

Bug ID: 59588
   Summary: Odd codes in ix86_option_override_internal
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: ubizjak at gmail dot com

ix86_option_override_internal has

  if (opts->x_ix86_arch_string)
    opts->x_ix86_tune_string = opts->x_ix86_arch_string;
  if (!opts->x_ix86_tune_string)
    {
      opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
      ix86_tune_defaulted = 1;
    }

  /* opts->x_ix86_tune_string is set to opts->x_ix86_arch_string
     or defaulted.  We need to use a sensible tune option.  */
  if (!strcmp (opts->x_ix86_tune_string, "generic")
      || !strcmp (opts->x_ix86_tune_string, "x86-64")
      || !strcmp (opts->x_ix86_tune_string, "i686"))
    {
      opts->x_ix86_tune_string = "generic";
    }

Why is opts->x_ix86_tune_string changed to "generic"?  If
opts->x_ix86_tune_string is already "generic", there is no need to
change it to "generic".  If an option is valid for -march=,
it should also be valid for -mtune=.
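
The effect of the quoted logic can be traced with a small standalone sketch.
This is not GCC code: resolve_tune_sketch and its parameters are invented
names, and the function only mirrors the three steps of the excerpt above
(copy the arch string, fall back to a default, then map three names to
"generic").

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the excerpt.  ARCH and TUNE may be NULL;
   DFLT stands in for cpu_names[TARGET_CPU_DEFAULT].  If DEFAULTED is not
   NULL it receives 1 when the default was used, else 0.  */
const char *
resolve_tune_sketch (const char *arch, const char *tune,
                     const char *dflt, int *defaulted)
{
  int used_default = 0;

  /* Step 1: the arch string, when present, becomes the tune string.  */
  if (arch)
    tune = arch;

  /* Step 2: otherwise fall back to the configured default.  */
  if (!tune)
    {
      tune = dflt;
      used_default = 1;
    }

  /* Step 3: three names are mapped to "generic".  Only the "x86-64"
     and "i686" cases change anything; the "generic" case is a no-op,
     which is the oddity the report points out.  */
  if (!strcmp (tune, "generic")
      || !strcmp (tune, "x86-64")
      || !strcmp (tune, "i686"))
    tune = "generic";

  if (defaulted)
    *defaulted = used_default;
  return tune;
}
```

Under these assumptions an arch of "i686" or "x86-64" ends up tuned as
"generic", while an arch such as "core2" is passed through unchanged.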

[Bug target/59203] config/cris/cris.c:2491: possible typo ?

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59203

--- Comment #3 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
Author: hp
Date: Mon Dec 23 23:12:09 2013
New Revision: 206188

URL: http://gcc.gnu.org/viewcvs?rev=206188&root=gcc&view=rev
Log:
PR target/59203
* config/cris/cris.c (cris_pic_symbol_type_of): Fix typo,
checking t1 twice instead of t1 and t2 respectively.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/cris/cris.c

[Bug tree-optimization/59586] [4.8/4.9 Regression] [graphite] Segmentation fault with -Ofast -floop-parallelize-all -ftree-parallelize-loops

2013-12-23 Thread dominiq at lps dot ens.fr

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59586

Dominique d'Humieres dominiq at lps dot ens.fr changed:

   What|Removed |Added

 CC||spop at gcc dot gnu.org
Summary|Segmentation fault with |[4.8/4.9 Regression]
   |-Ofast  |[graphite] Segmentation
   |-floop-parallelize-all  |fault with -Ofast
   |-ftree-parallelize-loops|-floop-parallelize-all
   ||-ftree-parallelize-loops

--- Comment #2 from Dominique d'Humieres dominiq at lps dot ens.fr ---
Revision r188914 (2012-06-24) is OK, r189336 (2012-07-06) gives the ICE.

(gdb) bt
#0  0x00010053910b in compute_deps (scop=0x1418bcc00, pbbs=...,
must_raw=0xde650, may_raw=0xfc080, must_raw_no_source=0x141907f50, 
may_raw_no_source=0x141920330, 
must_war=error reading variable: Could not find the frame base for
compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
may_war=error reading variable: Could not find the frame base for
compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
must_war_no_source=error reading variable: Could not find the frame base
for compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
may_war_no_source=error reading variable: Could not find the frame base
for compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
must_waw=error reading variable: Could not find the frame base for
compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
may_waw=error reading variable: Could not find the frame base for
compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
must_waw_no_source=error reading variable: Could not find the frame base
for compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**)., 
may_waw_no_source=error reading variable: Could not find the frame base
for compute_deps(scop*, vecpoly_bb*, va_heap, vl_ptr, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**, isl_union_map**,
isl_union_map**, isl_union_map**, isl_union_map**).) at
../../_clean/gcc/graphite-dependences.c:430
#1  0x00010053965e in loop_is_parallel_p (loop=optimized out,
bb_pbb_mapping=..., depth=optimized out)
at ../../_clean/gcc/graphite-dependences.c:566
#2  0x000100537597 in translate_clast (context_loop=0x1418bcc00,
stmt=0x1419084c0, next_e=0xde650, bb_pbb_mapping=..., level=1099988816, 
ip=0x141920330) at ../../_clean/gcc/graphite-clast-to-gimple.c:1200
#3  0x00010053793c in gloog (scop=optimized out, bb_pbb_mapping=...) at
../../_clean/gcc/graphite-clast-to-gimple.c:1705
#4  0x00010053200f in graphite_transform_loops () at
../../_clean/gcc/graphite.c:304
#5  0x00010053251a in pass_graphite_transforms::execute (this=optimized
out) at ../../_clean/gcc/graphite.c:332
#6  0x00010067c4b9 in execute_one_pass (pass=optimized out) at
../../_clean/gcc/passes.c:2213
#7  0x00010067c74e in execute_pass_list (pass=optimized out) at
../../_clean/gcc/passes.c:2266
#8  0x00010067c760 in execute_pass_list (pass=optimized out) at
../../_clean/gcc/passes.c:2267
#9  0x00010067c760 in execute_pass_list (pass=optimized out) at
../../_clean/gcc/passes.c:2267
#10 0x00010067c760 in execute_pass_list (pass=optimized out) at
../../_clean/gcc/passes.c:2267
#11 0x0001003c66cf in expand_function (node=optimized out) at
../../_clean/gcc/cgraphunit.c:1763

[Bug target/59203] config/cris/cris.c:2491: possible typo ?

2013-12-23 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59203

Hans-Peter Nilsson hp at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Hans-Peter Nilsson hp at gcc dot gnu.org ---
done

[Bug target/59587] cpu_names in i386.c is accessed with wrong index

2013-12-23 Thread hjl.tools at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59587

H.J. Lu hjl.tools at gmail dot com changed:

   What|Removed |Added

  Attachment #31505|0   |1
is obsolete||

--- Comment #2 from H.J. Lu hjl.tools at gmail dot com ---
Created attachment 31506
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31506&action=edit
An updated patch

Test this updated patch.

[Bug fortran/59589] New: Memory leak when deallocating polymorphic

2013-12-23 Thread townsend at astro dot wisc.edu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

Bug ID: 59589
   Summary: Memory leak when deallocating polymorphic
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: townsend at astro dot wisc.edu

Created attachment 31507
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31507&action=edit
Test code demonstrating leak

The attached code leaks memory, as indicated by the 'ps' call.

[Bug fortran/58007] [4.7/4.9 Regression] [OOP] ICE in free_pi_tree(): Unresolved fixup - resolve_fixups does not fixup component of __class_bsr_Bsr_matrix

2013-12-23 Thread townsend at astro dot wisc.edu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58007

--- Comment #11 from Rich Townsend townsend at astro dot wisc.edu ---
#6 fails with 4.9.0 (svn rev. 206179), on both OS X and Linux.

[Bug fortran/59589] Memory leak when deallocating polymorphic

2013-12-23 Thread townsend at astro dot wisc.edu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

--- Comment #1 from Rich Townsend townsend at astro dot wisc.edu ---
Oops, missed out details. This is with rev. 206179, on both OS X and Linux.

[Bug fortran/59589] Memory leak when deallocating polymorphic

2013-12-23 Thread dominiq at lps dot ens.fr

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

Dominique d'Humieres dominiq at lps dot ens.fr changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2013-12-24
 Ever confirmed|0   |1

--- Comment #2 from Dominique d'Humieres dominiq at lps dot ens.fr ---
Works for me on OS X for 4.8.2 or trunk. What command are you using?

[Bug c/59590] New: gcc produces an infinite loop on O2 optimization

2013-12-23 Thread cottrell at wfu dot edu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59590

Bug ID: 59590
   Summary: gcc produces an infinite loop on O2 optimization
   Product: gcc
   Version: 4.8.2
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cottrell at wfu dot edu

Created attachment 31508
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31508&action=edit
minimal test case

I'm getting an infinite loop with -O2, though the code is compiled
correctly with just -O.

I'm attaching a minimal test case -- but please see also the real 
function that exposes the problem: the following is the real
counterpart to fake_gradient() in the minimal case:

static int richardson_gradient (double *b, double *g, int n,
                                BFGS_CRIT_FUNC func, void *data)
{
    double df[RSTEPS];
    double eps = 1.0e-4;
    double d = 0.0001;
    double v = 2.0;
    double h, p4m;
    double bi0, f1, f2;
    int r = RSTEPS;
    int i, k, m;
    int err = 0;

    for (i=0; i<n; i++) {
        bi0 = b[i];
        h = d * b[i] + eps * (b[i] == 0.0);
        for (k=0; k<r; k++) {
            b[i] = bi0 - h;
            f1 = func(b, data);
            b[i] = bi0 + h;
            f2 = func(b, data);
            if (na(f1) || na(f2)) {
                b[i] = bi0;
                return 1;
            }
            df[k] = (f2 - f1) / (2.0 * h);
            h /= v;
        }
        b[i] = bi0;
        p4m = 4.0;
        for (m=0; m<r-1; m++) {
            for (k=0; k<r-m; k++) {
                df[k] = (df[k+1] * p4m - df[k]) / (p4m - 1.0);
                // if (k == r-m-1) break;
            }
            p4m *= 4.0;
        }
        g[i] = df[0];
    }

    return err;
}

[Bug fortran/59589] Memory leak when deallocating polymorphic

2013-12-23 Thread townsend at astro dot wisc.edu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59589

--- Comment #3 from Rich Townsend townsend at astro dot wisc.edu ---
(In reply to Dominique d'Humieres from comment #2)
> Works for me on OS X for 4.8.2 or trunk. What command are you using?

townsend@talos ~ $ gfortran -v
Using built-in specs.
COLLECT_GCC=/Applications/madsdk/bin/gfortran.exec
COLLECT_LTO_WRAPPER=/Applications/madsdk/libexec/gcc/x86_64-apple-darwin11.4.2/4.9.0/lto-wrapper
Target: x86_64-apple-darwin11.4.2
Configured with: ./configure CC='gcc -D_FORTIFY_SOURCE=0'
--build=x86_64-apple-darwin11.4.2 --prefix=/Applications/madsdk
--with-gmp=/Applications/madsdk --with-mpfr=/Applications/madsdk
--with-mpc=/Applications/madsdk --enable-languages=c,c++,fortran
--disable-multilib --disable-nls --disable-libsanitizer
Thread model: posix
gcc version 4.9.0 20131223 (experimental) (GCC) 

townsend@talos ~ $ gfortran -o test_leak test_leak.f90 

townsend@talos ~ $ ./test_leak 
./test_leak  39688
./test_leak  78764
./test_leak 117828
./test_leak 156908
./test_leak 195972
./test_leak 235036
./test_leak 274100
./test_leak 313164
./test_leak 352228
./test_leak 391292

...so, the memory usage grows on each iteration of the loop; this suggests a
leak.

[Bug c/59590] gcc produces an infinite loop on O2 optimization

2013-12-23 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59590

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
df[k+1] reads past the bounds of df: k ranges over 0...RSTEPS-1, so k+1 ranges
over 1...RSTEPS, but the valid indices of df are 0...RSTEPS-1.
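
For illustration only, the extrapolation step can be written so that df[k+1]
always stays in bounds. This is a sketch, not the reporter's eventual fix:
richardson_extrapolate is a made-up name, and the tightened inner bound
(k < r-m-1) is one plausible correction. Since reading df[RSTEPS] is
undefined behavior, the compiler is allowed to assume it never happens, which
is how an optimized build can end up looping forever.

```c
#include <assert.h>

/* Hypothetical helper: in-place Richardson extrapolation over the R
   estimates in DF, where each estimate was computed with half the step
   of the previous one.  Returns the final value, left in df[0].  */
double
richardson_extrapolate (double *df, int r)
{
  double p4m = 4.0;
  int k, m;

  assert (r >= 1);

  for (m = 0; m < r - 1; m++)
    {
      /* Pass m combines r-m estimates into r-m-1 values, so the highest
         index read is k+1 == r-m-1, which is within the array.  */
      for (k = 0; k < r - m - 1; k++)
        df[k] = (df[k + 1] * p4m - df[k]) / (p4m - 1.0);
      p4m *= 4.0;
    }

  return df[0];
}
```

As a quick check, central-difference estimates of the slope of x^3 at x = 1
with steps 1, 1/2, 1/4, 1/8 are 4, 3.25, 3.0625 and 3.015625; feeding those
four values in returns the exact slope 3.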

[Bug target/59573] aarch64: commit 07ca5686e64 broken glibc-2.17

2013-12-23 Thread dennis.yxun at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59573

--- Comment #7 from Dennis Lan (dlan) dennis.yxun at gmail dot com ---
OK, it's a qemu problem, not a gcc one.

I've built a rootfs in qemu (with the cmeq insn implemented), then deployed the
rootfs into the foundation_v8 emulator. I tested compiling code with gcc there,
and it works fine, without the linker error.

[Bug c/59590] gcc produces an infinite loop on O2 optimization

2013-12-23 Thread cottrell at wfu dot edu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59590

--- Comment #2 from Allin Cottrell cottrell at wfu dot edu ---
OK, you're right, there's an off-by-one bug in the second
k-loop.

But it's not very nice that gcc takes that as a license to
produce an infinite loop. However, I guess that makes this
report a duplicate of some others that have made the same 
observation.

[Bug c++/59349] [4.9 Regression] ICE on invalid: Segmentation fault toplev.c:336

2013-12-23 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59349

--- Comment #3 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Dec 24 04:22:15 2013
New Revision: 206192

URL: http://gcc.gnu.org/viewcvs?rev=206192&root=gcc&view=rev
Log:
PR c++/59349
* parser.c (cp_parser_lambda_introducer): Handle empty init.

Added:
trunk/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/parser.c

[Bug c++/59271] [4.9 Regression] a.C:16:21: internal compiler error: in strip_typedefs, at cp/tree.c:1315

2013-12-23 Thread jason at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59271

--- Comment #4 from Jason Merrill jason at gcc dot gnu.org ---
Author: jason
Date: Tue Dec 24 04:22:23 2013
New Revision: 206193

URL: http://gcc.gnu.org/viewcvs?rev=206193&root=gcc&view=rev
Log:
PR c++/59271
* lambda.c (build_capture_proxy): Use build_cplus_array_type.

Added:
trunk/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/lambda.c

[Bug lto/59582] LTO discards symbol that defined as weak elsewhere

2013-12-23 Thread joey.ye at arm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59582

--- Comment #2 from Joey Ye joey.ye at arm dot com ---
The latest binutils trunk still has this issue. I assume 2.24 is the same.

Re: [PATCH, committed] Fix PR 57422

2013-12-23 Thread Jakub Jelinek

On Mon, Dec 23, 2013 at 10:52:25AM +0400, Andrey Belevantsev wrote:
> As described in the PR, the ICE reason was the typo made when
> introducing calls to add_hard_reg_set.  Fixed by the first attached
> patch, bootstrapped and tested on both ia64 and x86_64, committed as
> obvious.
>
> The test case is very sensitive to the scheduler decisions (e.g. it
> didn't fail on trunk but only on the revision reported for me), so
> instead of adding the test I have put in the code two asserts
> checking that we can always schedule the fence instruction as is.
> This hunk was tested together with the first but committed
> separately.
>
> The first patch can be safely committed to 4.8, the second can stay
> on trunk only.  Jakub, will it be fine with you?

Yes.

Jakub

[C++ Patch ping] Re: [C++ Patch] PR 59165 (aka Core/1442)

2013-12-23 Thread Paolo Carlini


Hi,

assuming I didn't miss anything (I'm still catching up with my emails), 
I'd like to ping the below. Thanks!


Paolo.

///

On 12/10/2013 01:54 PM, Paolo Carlini wrote:

Hi,

as far as I can see, this bug asks for the implementation of 
Core/1442, thus don't do a special Koenig lookup including namespace 
std in cp_parser_perform_range_for_lookup. Tested x86_64-linux.


Thanks,
Paolo.

/

Re: [PATCH, committed] Fix PR 57422

2013-12-23 Thread H.J. Lu

On Sun, Dec 22, 2013 at 10:52 PM, Andrey Belevantsev a...@ispras.ru wrote:
> Hello,
>
> As described in the PR, the ICE reason was the typo made when introducing
> calls to add_hard_reg_set.  Fixed by the first attached patch, bootstrapped
> and tested on both ia64 and x86_64, committed as obvious.
>
> The test case is very sensitive to the scheduler decisions (e.g. it didn't
> fail on trunk but only on the revision reported for me), so instead of
> adding the test I have put in the code two asserts checking that we can
> always schedule the fence instruction as is.  This hunk was tested together
> with the first but committed separately.


The testcase is very small. Why not add it?


-- 
H.J.

Re: [PATCH] Don't reject TER unnecessarily (PRs middle-end/58956, middle-end/59470)

2013-12-23 Thread Hans-Peter Nilsson

On Sat, 14 Dec 2013, Jakub Jelinek wrote:
> 2013-12-14  Jakub Jelinek  ja...@redhat.com
>
>   PR middle-end/58956
>   PR middle-end/59470
>   * gimple-walk.h (walk_stmt_load_store_addr_fn): New typedef.
>   (walk_stmt_load_store_addr_ops, walk_stmt_load_store_ops): Use it
>   for callback params.
>   * gimple-walk.c (walk_stmt_load_store_ops): Likewise.
>   (walk_stmt_load_store_addr_ops): Likewise.  Adjust all callback
>   calls to supply the gimple operand containing the base tree
>   as an extra argument.
>   * tree-ssa-ter.c: Include gimple-walk.h.
>   (find_ssaname, find_ssaname_in_store): New helper functions.
>   (find_replaceable_in_bb): For calls or GIMPLE_ASM, only set
>   same_root_var if USE is used somewhere in the stores of the stmt.
>   * ipa-prop.c (visit_ref_for_mod_analysis): Remove name of the stmt
>   argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
>   * ipa-pure-const.c (check_load, check_store, check_ipa_load,
>   check_ipa_store): Likewise.
>   * gimple.c (gimple_ior_addresses_taken_1, check_loadstore): Likewise.
>   * ipa-split.c (test_nonssa_use, mark_nonssa_use): Likewise.
>   (verify_non_ssa_vars, visit_bb): Adjust their callers.
>   * cfgexpand.c (add_scope_conflicts_1): Use
>   walk_stmt_load_store_addr_fn type for visit variable.
>   (visit_op, visit_conflict): Remove name of the stmt
>   argument and ATTRIBUTE_UNUSED, add another unnamed tree argument.
>   * tree-sra.c (asm_visit_addr): Likewise.  Remove name of the data
>   argument and ATTRIBUTE_UNUSED.
>   * cgraphbuild.c (mark_address, mark_load, mark_store): Add another
>   unnamed tree argument.
>   * gimple-ssa-isolate-paths.c (check_loadstore): Likewise.  Remove
>   ATTRIBUTE_UNUSED from stmt parameter.

Caused PR59584, an ICE.

(I'm going to fix the define_split bug that this exposed, but I
don't think that bug - allowing SP as a temporary - is the cause
of the hiccup.)

brgds, H-P

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-23 Thread H.J. Lu

On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote:
> On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
> > > Sorry, I must have been looking at an older version, but as I said I
> > > already did enable it in the latest patch. (see
> > > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )
> >
> > Sorry for causing another revision but we would like to stick with btver1
> > and btver2 rather than BOBCAT or JAGUAR. Therefore the changes would
> > be like
> >
> I will need to make an updated patch to move the new ISAs to the end of the
> list anyway. I will send it in a few days to give AMD or Intel developers
> time to comment on the current version.
 

I renamed Intel processor names. Please update your patch.  Here is my
patch to add more Intel processor support.  You can add it to your
patch.

Thanks.


H.J.
---
From 2ef9b6959a4625d89cab6f06aec6bb2b37095264 Mon Sep 17 00:00:00 2001
From: H.J. Lu hjl.to...@gmail.com
Date: Mon, 23 Dec 2013 05:26:01 -0800
Subject: [PATCH 1/2] Handle haswell and silvermont

---
 ChangeLog.arch   | 18 ++
 gcc/config/i386/i386.c   | 14 ++
 libgcc/config/i386/cpuinfo.c | 15 +++
 3 files changed, 47 insertions(+)
 create mode 100644 ChangeLog.arch

diff --git a/ChangeLog.arch b/ChangeLog.arch
new file mode 100644
index 000..2030a76
--- /dev/null
+++ b/ChangeLog.arch
@@ -0,0 +1,18 @@
+gcc/
+
+2013-12-23   H.J. Lu  hongjiu...@intel.com
+
+   * config/i386/i386.c (get_builtin_code_for_version): Handle
+   PROCESSOR_HASWELL and PROCESSOR_SILVERMONT.
+   (processor_model): Add M_INTEL_COREI7_IVYBRIDGE and
+   M_INTEL_COREI7_HASWELL.
+   (arch_names_table): Add ivybridge, haswell, bonnell,
+   silvermont.
+
+libgcc/
+
+2013-12-23   H.J. Lu  hongjiu...@intel.com
+
+   * config/i386/cpuinfo.c (processor_subtypes): Add
+   INTEL_COREI7_IVYBRIDGE and INTEL_COREI7_HASWELL.
+   (get_intel_cpu): Check Ivy Bridge and Haswell processors.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2d480b3..d854b5b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -30058,10 +30058,18 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
               arg_str = "sandybridge";
               priority = P_PROC_SSE4_2;
               break;
+            case PROCESSOR_HASWELL:
+              arg_str = "haswell";
+              priority = P_PROC_SSE4_2;
+              break;
             case PROCESSOR_BONNELL:
               arg_str = "bonnell";
               priority = P_PROC_SSSE3;
               break;
+            case PROCESSOR_SILVERMONT:
+              arg_str = "silvermont";
+              priority = P_PROC_SSE4_2;
+              break;
             case PROCESSOR_AMDFAM10:
               arg_str = "amdfam10h";
               priority = P_PROC_SSE4_a;
@@ -30959,6 +30967,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
     M_INTEL_COREI7_NEHALEM,
     M_INTEL_COREI7_WESTMERE,
     M_INTEL_COREI7_SANDYBRIDGE,
+    M_INTEL_COREI7_IVYBRIDGE,
+    M_INTEL_COREI7_HASWELL,
     M_AMDFAM10H_BARCELONA,
     M_AMDFAM10H_SHANGHAI,
     M_AMDFAM10H_ISTANBUL,
@@ -30984,6 +30994,10 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"nehalem", M_INTEL_COREI7_NEHALEM},
       {"westmere", M_INTEL_COREI7_WESTMERE},
       {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
+      {"ivybridge", M_INTEL_COREI7_IVYBRIDGE},
+      {"haswell", M_INTEL_COREI7_HASWELL},
+      {"bonnell", M_INTEL_BONNELL},
+      {"silvermont", M_INTEL_SILVERMONT},
       {"amdfam10h", M_AMDFAM10H},
       {"barcelona", M_AMDFAM10H_BARCELONA},
       {"shanghai", M_AMDFAM10H_SHANGHAI},
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 4b0c189..577881b 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -70,6 +70,8 @@ enum processor_subtypes
   INTEL_COREI7_NEHALEM = 1,
   INTEL_COREI7_WESTMERE,
   INTEL_COREI7_SANDYBRIDGE,
+  INTEL_COREI7_IVYBRIDGE,
+  INTEL_COREI7_HASWELL,
   AMDFAM10H_BARCELONA,
   AMDFAM10H_SHANGHAI,
   AMDFAM10H_ISTANBUL,
@@ -196,6 +198,19 @@ get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id)
  __cpu_model.__cpu_type = INTEL_COREI7;
  __cpu_model.__cpu_subtype = INTEL_COREI7_SANDYBRIDGE;
  break;
+   case 0x3a:
+   case 0x3e:
+ /* Ivy Bridge.  */
+ __cpu_model.__cpu_type = INTEL_COREI7;
+ __cpu_model.__cpu_subtype = INTEL_COREI7_IVYBRIDGE;
+ break;
+   case 0x3c:
+   case 0x45:
+   case 0x46:
+ /* Haswell.  */
+ __cpu_model.__cpu_type = INTEL_COREI7;
+ __cpu_model.__cpu_subtype = INTEL_COREI7_HASWELL;
+ break;
case 0x17:
case 0x1d:
  /* Penryn.  */
-- 
1.8.4.2
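
The model-number mapping added to get_intel_cpu above can be summarized as a
small lookup. The sketch below is not the libgcc code: intel_subtype_sketch
is an invented name, it returns strings instead of setting __cpu_model, and
it covers only the model numbers visible in the hunk. The caller is assumed
to have already established that the CPU is an Intel part of the family this
switch applies to.

```c
/* Hypothetical helper: map a CPUID model number to the subtype name
   used in the patch above.  Anything not listed in the hunk is
   reported as "other".  */
const char *
intel_subtype_sketch (unsigned int model)
{
  switch (model)
    {
    case 0x3a:
    case 0x3e:
      /* Ivy Bridge.  */
      return "ivybridge";
    case 0x3c:
    case 0x45:
    case 0x46:
      /* Haswell.  */
      return "haswell";
    default:
      return "other";
    }
}
```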

PATCH: PRs bootstrap/59580/59583: Improve x86 --with-arch/--with-cpu= configure handling

2013-12-23 Thread H.J. Lu

On Sun, Dec 22, 2013 at 11:11:12PM +0100, Uros Bizjak wrote:

> Please get someone to review config.gcc changes. They are OK as far as
> x86 rename is concerned, but I can't review functional changes.

Hi Paolo,

Can you review this config.gcc change?

 
> > @@ -588,6 +588,22 @@ esac
> >  # Common C libraries.
> >  tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3"
> >
> > +# 32-bit x86 processors supported by --with-arch=.  Each processor
> > +# MUST be separated by exactly one space.
> > +x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
> > +athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
> > +i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
> > +pentium4 pentium4m pentiumpro prescott"
>
> Missing native.
 
x86_archs contains 32-bit x86 processors.  native is allowed for
64-bit targets and is included in x86_64_archs.  64-bit processors
can be used in --with-arch/--with-cpu= for 32-bit targets.

Here is a patch to improve x86 --with-arch/--with-cpu= configure
handling.  This patch defines 3 variables:

1. x86_archs: It contains 32-bit x86 processors supported by
--with-arch=, which aren't allowed for 64-bit targets.
2. x86_64_archs: It contains 64-bit x86 processors supported by
--with-arch=, which are allowed for both 32-bit and 64-bit targets.
3. x86_cpus.  It contains x86 processors supported by --with-cpu=,
which are allowed for both 32-bit and 64-bit targets.

The processors in each of those 3 variables are separated by exactly one space.

Instead of checking if a value of --with-arch/--with-cpu= is valid in many
different places with

case ${val} in
valid pattern list)
  OK
  ;;
*)
  error
  exit 1
  ;;
esac

and updating all pattern lists when adding a new processor, this patch
uses

case " valid processor list separated by exactly one space " in
*" ${val} "*)
  OK
  ;;
*)
  error
  exit 1
  ;;
esac

"valid processor list separated by exactly one space" is a combination
of the 3 processor variables.  It only needs a separate check for an empty
value with

if test x${val} != x; then
  $val isn't empty
else
  $val is empty
fi

With this approach, we only need to add new 32-bit processors to x86_archs
and new 64-bit processors to x86_64_archs.  They will be supported by
--with-arch/--with-cpu= automatically.  OK to install?
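
The membership test described above relies on padding both the list and the
value with spaces, so that a value only matches a whole word. The same idea
can be sketched in C rather than shell; this is an illustration, not part of
the patch. The function name and buffer sizes are invented, and the rejection
of values that contain a space is an extra guard added here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if VAL is one of the words in LIST,
   where the words in LIST are separated by exactly one space.  */
int
in_space_separated_list (const char *list, const char *val)
{
  char padded_list[2048];
  char padded_val[128];
  int n;

  /* An empty value means the option was not given; the caller handles
     that case separately, as the mail describes.  A value containing a
     space could match across two words, so reject it here.  */
  if (val[0] == '\0' || strchr (val, ' ') != NULL)
    return 0;

  /* Build " list " and " val " so that only whole words can match.  */
  n = snprintf (padded_list, sizeof padded_list, " %s ", list);
  if (n < 0 || (size_t) n >= sizeof padded_list)
    return 0;
  n = snprintf (padded_val, sizeof padded_val, " %s ", val);
  if (n < 0 || (size_t) n >= sizeof padded_val)
    return 0;

  return strstr (padded_list, padded_val) != NULL;
}
```

With the list "i386 i486 i586 i686", the value "i486" matches while the
prefix "i48" does not, which is the reason for the exactly-one-space
requirement.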

Thanks.


H.J.
---
2013-12-23   H.J. Lu  hongjiu...@intel.com

PR bootstrap/59580
PR bootstrap/59583
* config.gcc (x86_archs): New variable.
(x86_64_archs): Likewise.
(x86_cpus): Likewise.
Use $x86_archs, $x86_64_archs and $x86_cpus to check valid
--with-arch/--with-cpu= options.
Support --with-arch=/--with-cpu={nehalem,westmere,
sandybridge,ivybridge,haswell,broadwell,bonnell,silvermont}.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 24dbaf9..51eb2b1 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -588,6 +588,22 @@ esac
 # Common C libraries.
 tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3"
 
+# 32-bit x86 processors supported by --with-arch=.  Each processor
+# MUST be separated by exactly one space.
+x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
+athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
+i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
+pentium4 pentium4m pentiumpro prescott"
+# 64-bit x86 processors supported by --with-arch=.  Each processor
+# MUST be separated by exactly one space.
+x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
+bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
+core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
+sandybridge ivybridge haswell broadwell bonnell silvermont x86-64 native"
+# Additional x86 processors supported by --with-cpu=.  Each processor
+# MUST be separated by exactly one space.
+x86_cpus="generic intel"
+
+
 # Common parts for widely ported systems.
 case ${target} in
 *-*-darwin*)
@@ -1392,20 +1408,21 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
                done
                TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
                need_64bit_isa=yes
-               case X"${with_cpu}" in
-               Xgeneric|Xintel|Xatom|Xslm|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver4|Xbdver3|Xbdver2|Xbdver1|Xbtver2|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3)
-                       ;;
-               X)
+               if test x$with_cpu = x; then
                        if test x$with_cpu_64 = x; then
                                with_cpu_64=generic
                        fi
-                       ;;
-               *)
-                       echo "Unsupported CPU used in --with-cpu=$with_cpu, supported values:" 1>&2
-

[PATCH] Fix PR59569

2013-12-23 Thread Bingfeng Mei

Hi, Jakub,

Thanks for suggestion. Please find attached patch. GCC is bootstrapped and 
passes testsuite on x86-64. Let me know if it is OK to commit. (Sorry if you 
received this mail twice as I forgot to set to text format).

Thanks,
Bingfeng Mei


patch_pr59569
Description: patch_pr59569

Re: [PATCH] Fix PR59569

2013-12-23 Thread H.J. Lu

On Mon, Dec 23, 2013 at 6:25 AM, Bingfeng Mei b...@broadcom.com wrote:
 Hi, Jakub,

 Thanks for suggestion. Please find attached patch. GCC is bootstrapped and 
 passes testsuite on x86-64. Let me know if it is OK to commit. (Sorry if you 
 received this mail twice as I forgot to set to text format).


Please test on 3 testcases in the PR and include
some testcases in your patch.

Thanks.

-- 
H.J.

RE: [PATCH] Fix PR59569

2013-12-23 Thread Bingfeng Mei

All the 3 tests are tested and the first two are included in my patch. Didn't
include the third one as it is not reduced.

Bingfeng

-----Original Message-----
From: H.J. Lu [mailto:hjl.to...@gmail.com] 
Sent: 23 December 2013 14:28
To: Bingfeng Mei
Cc: gcc-patches@gcc.gnu.org; Jakub Jelinek (ja...@redhat.com)
Subject: Re: [PATCH] Fix PR59569

On Mon, Dec 23, 2013 at 6:25 AM, Bingfeng Mei b...@broadcom.com wrote:
 Hi, Jakub,

 Thanks for suggestion. Please find attached patch. GCC is bootstrapped and 
 passes testsuite on x86-64. Let me know if it is OK to commit. (Sorry if you 
 received this mail twice as I forgot to set to text format).

Please test on 3 testcases in the PR and include
some testcases in your patch.

Thanks.

-- 
H.J.

Re: [PATCH] Fix PR59569

2013-12-23 Thread Jakub Jelinek

On Mon, Dec 23, 2013 at 02:23:49PM +, Bingfeng Mei wrote:
 Thanks for suggestion. Please find attached patch. GCC is bootstrapped and
 passes testsuite on x86-64.  Let me know if it is OK to commit.

Ok, thanks.

Would be nice to add runtime testcases for both cases (test whether
vectorization with negative step storing a constant worked properly, and
similarly for external def (e.g. function parameter with
__attribute__((noinline, noclone)) on the function), but that can be done as
a follow-up patch.

Jakub
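A minimal sketch of the second kind of runtime test suggested above (the "external def" case), assuming a GCC-compatible compiler. The names `store_param_backwards` and `check_store_param` are made up for illustration and are not taken from the committed testsuite files.

```c
#include <assert.h>

#define N 128

/* noinline/noclone keep `val` an external definition as seen from the
   loop, so the stored value cannot be folded into a constant.  */
__attribute__((noinline, noclone))
void store_param_backwards (short *x, short val)
{
  int i;
  /* Store with a negative step: highest index first.  */
  for (i = N - 1; i >= 0; i--)
    x[i] = val;
}

/* Returns 1 when every element holds `val` after the call, else 0.  */
int check_store_param (short val)
{
  short x[N];
  int i;

  store_param_backwards (x, val);
  for (i = 0; i < N; i++)
    if (x[i] != val)
      return 0;
  return 1;
}
```

In a real testsuite file the check would call abort () on mismatch and carry the usual dg- directives; those are left out here.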

[COMMITTED]RE: [PATCH] Fix PR59569

2013-12-23 Thread Bingfeng Mei

Committed. I will prepare some new tests as you suggested.

Thanks,
Bingfeng

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com] 
Sent: 23 December 2013 14:53
To: Bingfeng Mei
Cc: gcc-patches@gcc.gnu.org; H.J. Lu (hjl.to...@gmail.com)
Subject: Re: [PATCH] Fix PR59569

On Mon, Dec 23, 2013 at 02:23:49PM +, Bingfeng Mei wrote:
 Thanks for suggestion. Please find attached patch. GCC is bootstrapped and
 passes testsuite on x86-64.  Let me know if it is OK to commit.

Ok, thanks.

Would be nice to add runtime testcases for both cases (test whether
vectorization with negative step storing a constant worked properly, and
similarly for external def (e.g. function parameter with
__attribute__((noinline, noclone)) on the function), but that can be done as
a follow-up patch.

Jakub

[PATCH] Fix for PR59585

2013-12-23 Thread Yury Gribov


Hi folks,

This patch fixes problem with UBSan tests failing on remote target 
platforms (ARM via SSH). The error is caused by DejaGNU harness 
stripping trailing newline from test output (and thus causing pattern 
matching failures).


Link to PR: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59585

-Y
diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c
index 4e2a2b9..ec391e4 100644
--- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c
+++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-1.c
@@ -21,4 +21,4 @@ main (void)
 /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
-/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*division by zero } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c
index ee96738..c8820fa 100644
--- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c
+++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-2.c
@@ -20,4 +20,4 @@ main (void)
 /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
-/* { dg-output \[^\n\r]*division by zero(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*division by zero } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c
index f3ee23b..399071e 100644
--- a/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c
+++ b/gcc/testsuite/c-c++-common/ubsan/div-by-zero-3.c
@@ -18,4 +18,4 @@ main (void)
 
 /* { dg-output division of -2147483648 by -1 cannot be represented in type 'int'(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*division of -2147483648 by -1 cannot be represented in type 'int'(\n|\r\n|\r) } */
-/* { dg-output \[^\n\r]*division of -2147483648 by -1 cannot be represented in type 'int'(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*division of -2147483648 by -1 cannot be represented in type 'int' } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c b/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c
index db346cb..96f7984 100644
--- a/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c
+++ b/gcc/testsuite/c-c++-common/ubsan/load-bool-enum.c
@@ -10,8 +10,8 @@ bool b;
 __attribute__((noinline, noclone)) enum A
 foo (bool *p)
 {
-  *p = b;   /* { dg-output load-bool-enum.c:13:\[^\n\r]*runtime error: load of value 4, which is not a valid value for type '(_B|b)ool'(\n|\r\n|\r) } */
-  return a; /* { dg-output \[^\n\r]*load-bool-enum.c:14:\[^\n\r]*runtime error: load of value 9, which is not a valid value for type 'A'(\n|\r\n|\r) { target c++ } } */
+  *p = b;   /* { dg-output load-bool-enum.c:13:\[^\n\r]*runtime error: load of value 4, which is not a valid value for type '(_B|b)ool'(\n|\r\n|\r)* } */
+  return a; /* { dg-output \[^\n\r]*load-bool-enum.c:14:\[^\n\r]*runtime error: load of value 9, which is not a valid value for type 'A'(\n|\r\n|\r)* { target c++ } } */
 }
 
 int
diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c b/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c
index de2cd2d..f8af828 100644
--- a/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c
+++ b/gcc/testsuite/c-c++-common/ubsan/overflow-add-2.c
@@ -58,4 +58,4 @@ main (void)
 /* { dg-output \[^\n\r]*signed integer overflow: \[^\n\r]* \\+ 1024 cannot be represented in type 'long int'(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*signed integer overflow: -\[^\n\r]* \\+ -1 cannot be represented in type 'long int'(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*signed integer overflow: -1 \\+ -\[^\n\r]* cannot be represented in type 'long int'(\n|\r\n|\r) } */
-/* { dg-output \[^\n\r]*signed integer overflow: -\[^\n\r]* \\+ -1024 cannot be represented in type 'long int'(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*signed integer overflow: -\[^\n\r]* \\+ -1024 cannot be represented in type 'long int' } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c b/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c
index adcbfe1..ddfbb2e 100644
--- a/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c
+++ b/gcc/testsuite/c-c++-common/ubsan/overflow-mul-2.c
@@ -24,4 +24,4 @@ main (void)
 /* { dg-output signed integer overflow: 2147483647 \\* 2 cannot be represented in type 'int'(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*signed integer overflow: 2 \\* 2147483647 cannot be represented in type 'int'(\n|\r\n|\r) } */
 /* { dg-output \[^\n\r]*signed integer overflow: \[^\n\r]* \\* 2 cannot be represented in type 'long int'(\n|\r\n|\r) } */
-/* { dg-output \[^\n\r]*signed integer overflow: 2 \\* \[^\n\r]* cannot be represented in type 'long int'(\n|\r\n|\r) } */
+/* { dg-output \[^\n\r]*signed integer overflow: 2 \\* \[^\n\r]* cannot be represented in type 'long int' } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-mul-4.c

Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.

2013-12-23 Thread Uros Bizjak

On Wed, Dec 18, 2013 at 2:00 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,
 On 02 Dec 16:09, Kirill Yukhin wrote:
 Hello,
 On 19 Nov 12:08, Kirill Yukhin wrote:
  Hello,
  On 15 Nov 20:06, Kirill Yukhin wrote:
   Ping.
  Ping.
 Ping.
 Ping.

 Rebased patch in the bottom.

At the end of the day, the patch looks fairly mechanical, adding
extensions to insn templates in a consistent way. The approach with
define_subst is already approved and used throughout the .md files.

I have reviewed the patch, and didn't find any obvious mistakes - and
there is a huge testsuite to find non-obvious ones, so I'm confident
enough to approve the patch.

So, OK for mainline, but I would kindly ask you to please wait a
couple of days for possible Richard's comments

Thanks,
Uros.

Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.

2013-12-23 Thread Uros Bizjak

On Mon, Dec 23, 2013 at 5:11 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Wed, Dec 18, 2013 at 2:00 PM, Kirill Yukhin kirill.yuk...@gmail.com 
 wrote:
 Hello,
 On 02 Dec 16:09, Kirill Yukhin wrote:
 Hello,
 On 19 Nov 12:08, Kirill Yukhin wrote:
  Hello,
  On 15 Nov 20:06, Kirill Yukhin wrote:
   Ping.
  Ping.
 Ping.
 Ping.

 Rebased patch in the bottom.

 At the end of the day, the patch looks fairly mechanical, adding
 extensions to insn templates in a consistent way. The approach with
 define_subst is already approved and used throughout the .md files.

 I have reviewed the patch, and didn't find any obvious mistakes - and
 there is a huge testsuite to find non-obvious ones, so I'm confident
 enough to approve the patch.

 So, OK for mainline, but I would kindly ask you to please wait a
 couple of days for possible Richard's comments

There is one issue:

+(define_subst_attr "round_constraint" "round" "vm" "v")
+(define_subst_attr "round_constraint2" "round" "m" "v")
+(define_subst_attr "round_constraint3" "round" "rm" "r")

When substituting constraints, please also substitute corresponding
operand predicate:

nonimmediate_operand -> register_operand in 1st and 3rd case
memory_operand -> register_operand in 2nd case.

When you allow e.g. nonimmediate_operand in predicate, but only
register in operand constraint, reload will resolve it, however -
memory load will remain in the loop even if it is invariant. There is
no pass to hoist invariant loads after reload.

Uros.
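The cost described above can be pictured at the source level. This is only an analogy written in C, not what reload emits: `volatile` is used here purely to force one load per iteration, standing in for a reload-inserted load that no later pass hoists. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The load of *scale stays inside the loop: one memory read per
   iteration, even though the value never changes.  */
long sum_scaled_load_in_loop (const int *v, size_t n,
                              const volatile int *scale)
{
  long acc = 0;
  size_t i;
  for (i = 0; i < n; i++)
    acc += (long) v[i] * *scale;
  return acc;
}

/* The invariant load is hoisted: one memory read in total, and the
   loop body only uses a register value.  */
long sum_scaled_hoisted (const int *v, size_t n, const int *scale)
{
  const int s = *scale;
  long acc = 0;
  size_t i;
  for (i = 0; i < n; i++)
    acc += (long) v[i] * s;
  return acc;
}
```

Both functions compute the same sum; the difference is only where the load happens, which is what matching the predicate to the constraint lets the pre-reload passes get right.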

Re: [PATCH] Fix for PR59585

2013-12-23 Thread Jakub Jelinek

On Mon, Dec 23, 2013 at 07:59:47PM +0400, Yury Gribov wrote:
 Hi folks,
 
 This patch fixes problem with UBSan tests failing on remote target
 platforms (ARM via SSH). The error is caused by DejaGNU harness
 stripping trailing newline from test output (and thus causing
 pattern matching failures).

Sounds like a bug in whatever is stripping the newlines, how else can you
test that the messages aren't on the same line?
Or is it stripping just the final newline at the end of output?
Still sounds like a bug elsewhere to me.

Jakub
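The failure mode under discussion can be reproduced outside the testsuite. The sketch below uses POSIX regcomp/regexec as a stand-in for DejaGnu's Tcl regular expressions (an assumption; the two engines are not identical), and the helper names `matches` and `strip_final_newline` are invented for the example.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Returns 1 if `text` matches the POSIX extended regex `pattern`.  */
int matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Copies `in` into `out` (capacity `cap`, at least 1) and drops one
   final '\n', imitating a harness that strips the last newline.  */
void strip_final_newline (const char *in, char *out, size_t cap)
{
  size_t len = strlen (in);

  if (len >= cap)
    len = cap - 1;
  memcpy (out, in, len);
  out[len] = '\0';
  if (len > 0 && out[len - 1] == '\n')
    out[len - 1] = '\0';
}
```

With the pattern "division by zero(\n|\r\n|\r)", output that still ends in a newline matches, while the stripped copy does not. Dropping the newline group from the last directive makes the stripped output match again, at the price of no longer checking that the last message ends its line.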

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-23 Thread Allan Sandfeld Jensen

On Monday 23 December 2013, H.J. Lu wrote:
 On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote:
  On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
Sorry, I must have been looking at an older version, but as I said I
already did enable it in the latest patch. (see
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )
   
   Sorry for causing another revision but we would like to stick with
   btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the
   changes would be like
  
  I will need to make an updated patch to move the new ISAs to the end of
  the list anyway. I will send it in a few days to give AMD or Intel
  developers time to comment on the current version.
 
 I renamed Intel processor names. Please update your patch.  Here is my
 patch to add more Intel processor support.  You can add it to your
 patch.
 
Updated patch attached. Rebased, fixed coding style, moved new ISA enums to 
the end and applied H.J.Lu's patch.

`Allan
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 206179)
+++ gcc/config/i386/i386.c	(working copy)
@@ -29970,16 +29970,21 @@
 P_SSE3,
 P_SSSE3,
 P_PROC_SSSE3,
-P_SSE4_a,
-P_PROC_SSE4_a,
+P_SSE4_A,
+P_PROC_SSE4_A,
 P_SSE4_1,
 P_SSE4_2,
 P_PROC_SSE4_2,
 P_POPCNT,
 P_AVX,
+P_PROC_AVX,
+P_FMA4,
+P_XOP,
+P_PROC_XOP,
+P_FMA,
+P_PROC_FMA,
 P_AVX2,
-P_FMA,
-P_PROC_FMA
+P_PROC_AVX2
   };
 
  enum feature_priority priority = P_ZERO;
@@ -29998,11 +30003,15 @@
   {sse, P_SSE},
   {sse2, P_SSE2},
   {sse3, P_SSE3},
+  {sse4a, P_SSE4_A},
   {ssse3, P_SSSE3},
   {sse4.1, P_SSE4_1},
   {sse4.2, P_SSE4_2},
   {popcnt, P_POPCNT},
   {avx, P_AVX},
+  {fma4, P_FMA4},
+  {xop, P_XOP},
+  {fma, P_FMA},
   {avx2, P_AVX2}
 };
 
@@ -30054,26 +30063,50 @@
 	  arg_str = nehalem;
 	  priority = P_PROC_SSE4_2;
 	  break;
-case PROCESSOR_SANDYBRIDGE:
-  arg_str = sandybridge;
-  priority = P_PROC_SSE4_2;
-  break;
+	case PROCESSOR_SANDYBRIDGE:
+	  arg_str = sandybridge;
+	  priority = P_PROC_AVX;
+	  break;
+	case PROCESSOR_HASWELL:
+	  arg_str = haswell;
+	  priority = P_PROC_SSE4_2;
+	  break;
 	case PROCESSOR_BONNELL:
 	  arg_str = bonnell;
 	  priority = P_PROC_SSSE3;
 	  break;
+	case PROCESSOR_SILVERMONT:
+	  arg_str = silvermont;
+	  priority = P_PROC_SSE4_2;
+	  break;
 	case PROCESSOR_AMDFAM10:
 	  arg_str = amdfam10h;
-	  priority = P_PROC_SSE4_a;
+	  priority = P_PROC_SSE4_A;
 	  break;
+	case PROCESSOR_BTVER1:
+	  arg_str = bobcat;
+	  priority = P_PROC_SSE4_A;
+	  break;
+	case PROCESSOR_BTVER2:
+	  arg_str = jaguar;
+	  priority = P_PROC_AVX;
+	  break;
 	case PROCESSOR_BDVER1:
 	  arg_str = bdver1;
-	  priority = P_PROC_FMA;
+	  priority = P_PROC_XOP;
 	  break;
 	case PROCESSOR_BDVER2:
 	  arg_str = bdver2;
 	  priority = P_PROC_FMA;
 	  break;
+	case PROCESSOR_BDVER3:
+	  arg_str = bdver3;
+	  priority = P_PROC_FMA;
+	  break;
+	case PROCESSOR_BDVER4:
+	  arg_str = bdver4;
+	  priority = P_PROC_AVX2;
+	  break;
 	}  
 	}
 
@@ -30938,6 +30971,10 @@
 F_SSE4_2,
 F_AVX,
 F_AVX2,
+F_SSE4_A,
+F_FMA4,
+F_XOP,
+F_FMA,
 F_MAX
   };
 
@@ -30955,6 +30992,10 @@
 M_AMDFAM10H,
 M_AMDFAM15H,
 M_INTEL_SILVERMONT,
+M_INTEL_COREI7_AVX,
+M_INTEL_CORE_AVX2,
+M_AMD_BOBCAT,
+M_AMD_JAGUAR,
 M_CPU_SUBTYPE_START,
 M_INTEL_COREI7_NEHALEM,
 M_INTEL_COREI7_WESTMERE,
@@ -30965,7 +31006,9 @@
 M_AMDFAM15H_BDVER1,
 M_AMDFAM15H_BDVER2,
 M_AMDFAM15H_BDVER3,
-M_AMDFAM15H_BDVER4
+M_AMDFAM15H_BDVER4,
+M_INTEL_COREI7_IVYBRIDGE,
+M_INTEL_CORE_HASWELL
   };
 
   static struct _arch_names_table
@@ -30983,16 +31026,24 @@
   {corei7, M_INTEL_COREI7},
   {nehalem, M_INTEL_COREI7_NEHALEM},
   {westmere, M_INTEL_COREI7_WESTMERE},
+  {corei7-avx, M_INTEL_COREI7_AVX},  
   {sandybridge, M_INTEL_COREI7_SANDYBRIDGE},
+  {ivybridge, M_INTEL_COREI7_IVYBRIDGE},
+  {core-avx2, M_INTEL_CORE_AVX2},
+  {haswell, M_INTEL_CORE_HASWELL},
+  {bonnell, M_INTEL_BONNELL},
+  {silvermont, M_INTEL_SILVERMONT},
   {amdfam10h, M_AMDFAM10H},
   {barcelona, M_AMDFAM10H_BARCELONA},
   {shanghai, M_AMDFAM10H_SHANGHAI},
   {istanbul, M_AMDFAM10H_ISTANBUL},
+  {bobcat, M_AMD_BOBCAT},  
   {amdfam15h, M_AMDFAM15H},
   {bdver1, M_AMDFAM15H_BDVER1},
   {bdver2, M_AMDFAM15H_BDVER2},
   {bdver3, M_AMDFAM15H_BDVER3},
   {bdver4, M_AMDFAM15H_BDVER4},
+  {jaguar, M_AMD_JAGUAR},  
 };
 
   static struct _isa_names_table
@@ -31009,9

Re: [PATCH i386 4/8] [AVX512] [6/8] Add substed patterns: `sae' subst.

2013-12-23 Thread Uros Bizjak

On Wed, Dec 18, 2013 at 2:02 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,

 On 02 Dec 16:10, Kirill Yukhin wrote:
 Hello,
 On 19 Nov 12:11, Kirill Yukhin wrote:
  Hello,
  On 15 Nov 20:07, Kirill Yukhin wrote:
Is it ok for trunk?
   Ping.
  Ping.
 Ping.
 Ping.

 Rebased patch in the bottom.

+(define_subst_attr "round_saeonly_constraint" "round_saeonly" "vm" "v")
+(define_subst_attr "round_saeonly_constraint2" "round_saeonly" "m" "v")

The same comment as in previous patch. Please introduce corresponding
predicate substitution that will follow constraint changes.

+(define_subst_attr "round_saeonly_mode512bit_condition" "round_saeonly" "1" "(GET_MODE (operands[0]) == V16SFmode || GET_MODE (operands[0]) == V8DFmode)")
+(define_subst_attr "round_saeonly_mode512bit_condition_op1" "round_saeonly" "1" "(GET_MODE (operands[1]) == V16SFmode || GET_MODE (operands[1]) == V8DFmode)")

Use <MODE>mode == ... static checks in above conditions.

The patch is OK for mainline with these changes.

Thanks,
Uros.

[PATCH] [followup to PR59569] new vect tests for store with negative step

2013-12-23 Thread Bingfeng Mei

Hi, 
Here are two vectorization tests for store with negative step. This is 
follow-up to PR59569 fix, which contains two tests for ICE. These tests are for 
vectorization tests and executable. OK to commit? 

Thanks,
Bingfeng


patch_vect_tests
Description: patch_vect_tests

Re: [PATCH i386 4/8] [AVX512] [7/8] Add substed patterns: `round for expand' subst.

2013-12-23 Thread Uros Bizjak

On Wed, Dec 18, 2013 at 2:04 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Hello,

 On 02 Dec 16:11, Kirill Yukhin wrote:
 Hello,
 On 19 Nov 12:12, Kirill Yukhin wrote:
  Hello,
  On 15 Nov 20:08, Kirill Yukhin wrote:
Is it ok for trunk?
   Ping.
  Ping.
 Ping.
 Ping.

 Rebased patch in the bottom.

This round_expand_predicate is the predicate substitution I was
referring to in the review of 5/8. Please use it also in insn patterns,
perhaps renamed as round_predicate, as it is not exclusive to
expanders. As mentioned, predicates should mirror constraints as closely
as possible.

OK with these changes,
Uros.

Re: [PATCH] [followup to PR59569] new vect tests for store with negative step

2013-12-23 Thread Jakub Jelinek

On Mon, Dec 23, 2013 at 04:43:17PM +, Bingfeng Mei wrote:
 Here are two vectorization tests for store with negative step. This is
 follow-up to PR59569 fix, which contains two tests for ICE.  These tests
 are for vectorization tests and executable.  OK to commit?

--- testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0)
+++ testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0)
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target vect_int } */
+#include <stdlib.h>
+
+__attribute__((noinline, noclone))
+void test1(short x[128])
+{
+int i;
+for (i=127; i>=0; i--) {
+   x[i] = 1234;
+}
+}
+
+int main (void)
+{
+  short x[128];
+  int i;
+  test1 (x);
+  
+  for (i = 0; i < 128; i++)
+   if (x[i] != 1234)
+ abort ();

Can you please change both tests so that the x array is say 128+32 elements
long instead of 128, you store some other pattern to the first 16 and last
16 elements in the array before calling test1 (do it say with asm ("");
inside of the loop to avoid vectorization), call test1 on x + 16 and
afterwards verify that test1 didn't write anything before or after the
buffer?  Ok with that change.

Jakub
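The requested change could look roughly like the following. This is a sketch under assumptions, not the committed test: the guard value `CANARY`, the macro names and the helper `check_guarded_store` are invented here, and the whole buffer is pre-filled rather than only the two 16-element bands.

```c
#include <assert.h>

#define N      128
#define GUARD  16
#define FILL   1234
#define CANARY 0x5a5a

__attribute__((noinline, noclone))
void test1 (short *x)
{
  int i;
  for (i = N - 1; i >= 0; i--)
    x[i] = FILL;
}

/* Returns 1 if test1 wrote exactly x[GUARD .. GUARD+N-1] and left
   both guard bands untouched, else 0.  */
int check_guarded_store (void)
{
  short x[N + 2 * GUARD];
  int i;

  for (i = 0; i < N + 2 * GUARD; i++)
    {
      x[i] = CANARY;
      /* Empty asm (GNU extension) keeps this setup loop from being
         vectorized, so only test1 exercises the vectorizer.  */
      __asm__ volatile ("");
    }

  test1 (x + GUARD);

  for (i = 0; i < GUARD; i++)
    if (x[i] != CANARY || x[GUARD + N + i] != CANARY)
      return 0;
  for (i = 0; i < N; i++)
    if (x[GUARD + i] != FILL)
      return 0;
  return 1;
}
```

A vectorized store with a negative step that starts one vector too low or too high would clobber one of the guard bands, which the plain 128-element version of the test cannot detect.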

Re: [PATCH i386 4/8] [AVX512] [8/8] Add substed patterns: `sae-only for expand' subst.

2013-12-23 Thread Uros Bizjak

On Wed, Dec 18, 2013 at 2:16 PM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
 Rebased patch in the bottom.
 Adding the patch.

The same comment as in 7/8 applies here. The predicate is not
exclusive to expanders, should also be used in insn patterns. The name
of the predicate is a bit weird, please name it simply
round_saeonly_predicate.

OK with these changes.

Uros.

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-23 Thread Allan Sandfeld Jensen

On Monday 23 December 2013, Allan Sandfeld Jensen wrote:
 On Monday 23 December 2013, H.J. Lu wrote:
  On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote:
   On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
 Sorry, I must have been looking at an older version, but as I said
 I already did enable it in the latest patch. (see
 http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )

Sorry for causing another revision but we would like to stick with
btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the
changes would be like
   
   I will need to make an updated patch to move the new ISAs to the end of
   the list anyway. I will send it in a few days to give AMD or Intel
   developers time to comment on the current version.
  
  I renamed Intel processor names. Please update your patch.  Here is my
  patch to add more Intel processor support.  You can add it to your
  patch.
 
 Updated patch attached. Rebased, fixed coding style, moved new ISA enums to
 the end and applied H.J.Lu's patch.
 
Fixed merging mistake that left haswell with SSE4_2 priority.

`Allan
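Why the haswell priority matters can be shown with a reduced model. The enum below is a subset of the one in the patch, kept in the same relative order; `struct fn_version` and `pick_highest_priority` are hypothetical and only model the ordering, not the run-time CPU checks the real dispatcher performs.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the patch's enum; later enumerators rank higher.  */
enum feature_priority
{
  P_ZERO = 0,
  P_PROC_SSE4_2,
  P_AVX,
  P_PROC_AVX,
  P_PROC_FMA,
  P_AVX2,
  P_PROC_AVX2
};

struct fn_version
{
  const char *target;
  enum feature_priority priority;
};

/* Returns the index of the highest-priority version (n must be >= 1).  */
size_t pick_highest_priority (const struct fn_version *v, size_t n)
{
  size_t best = 0, i;
  for (i = 1; i < n; i++)
    if (v[i].priority > v[best].priority)
      best = i;
  return best;
}
```

With haswell left at P_PROC_SSE4_2, a sandybridge version at P_PROC_AVX outranks it; with P_PROC_AVX2 the haswell version is considered first, which is presumably the intended order.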
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 206179)
+++ gcc/config/i386/i386.c	(working copy)
@@ -29970,16 +29970,21 @@
 P_SSE3,
 P_SSSE3,
 P_PROC_SSSE3,
-P_SSE4_a,
-P_PROC_SSE4_a,
+P_SSE4_A,
+P_PROC_SSE4_A,
 P_SSE4_1,
 P_SSE4_2,
 P_PROC_SSE4_2,
 P_POPCNT,
 P_AVX,
+P_PROC_AVX,
+P_FMA4,
+P_XOP,
+P_PROC_XOP,
+P_FMA,
+P_PROC_FMA,
 P_AVX2,
-P_FMA,
-P_PROC_FMA
+P_PROC_AVX2
   };
 
  enum feature_priority priority = P_ZERO;
@@ -29998,11 +30003,15 @@
   {sse, P_SSE},
   {sse2, P_SSE2},
   {sse3, P_SSE3},
+  {sse4a, P_SSE4_A},
   {ssse3, P_SSSE3},
   {sse4.1, P_SSE4_1},
   {sse4.2, P_SSE4_2},
   {popcnt, P_POPCNT},
   {avx, P_AVX},
+  {fma4, P_FMA4},
+  {xop, P_XOP},
+  {fma, P_FMA},
   {avx2, P_AVX2}
 };
 
@@ -30054,26 +30063,50 @@
 	  arg_str = nehalem;
 	  priority = P_PROC_SSE4_2;
 	  break;
-case PROCESSOR_SANDYBRIDGE:
-  arg_str = sandybridge;
-  priority = P_PROC_SSE4_2;
-  break;
+	case PROCESSOR_SANDYBRIDGE:
+	  arg_str = sandybridge;
+	  priority = P_PROC_AVX;
+	  break;
+	case PROCESSOR_HASWELL:
+	  arg_str = haswell;
+	  priority = P_PROC_AVX2;
+	  break;
 	case PROCESSOR_BONNELL:
 	  arg_str = bonnell;
 	  priority = P_PROC_SSSE3;
 	  break;
+	case PROCESSOR_SILVERMONT:
+	  arg_str = silvermont;
+	  priority = P_PROC_SSE4_2;
+	  break;
 	case PROCESSOR_AMDFAM10:
 	  arg_str = amdfam10h;
-	  priority = P_PROC_SSE4_a;
+	  priority = P_PROC_SSE4_A;
 	  break;
+	case PROCESSOR_BTVER1:
+	  arg_str = bobcat;
+	  priority = P_PROC_SSE4_A;
+	  break;
+	case PROCESSOR_BTVER2:
+	  arg_str = jaguar;
+	  priority = P_PROC_AVX;
+	  break;
 	case PROCESSOR_BDVER1:
 	  arg_str = bdver1;
-	  priority = P_PROC_FMA;
+	  priority = P_PROC_XOP;
 	  break;
 	case PROCESSOR_BDVER2:
 	  arg_str = bdver2;
 	  priority = P_PROC_FMA;
 	  break;
+	case PROCESSOR_BDVER3:
+	  arg_str = bdver3;
+	  priority = P_PROC_FMA;
+	  break;
+	case PROCESSOR_BDVER4:
+	  arg_str = bdver4;
+	  priority = P_PROC_AVX2;
+	  break;
 	}  
 	}
 
@@ -30938,6 +30971,10 @@
 F_SSE4_2,
 F_AVX,
 F_AVX2,
+F_SSE4_A,
+F_FMA4,
+F_XOP,
+F_FMA,
 F_MAX
   };
 
@@ -30955,6 +30992,10 @@
 M_AMDFAM10H,
 M_AMDFAM15H,
 M_INTEL_SILVERMONT,
+M_INTEL_COREI7_AVX,
+M_INTEL_CORE_AVX2,
+M_AMD_BOBCAT,
+M_AMD_JAGUAR,
 M_CPU_SUBTYPE_START,
 M_INTEL_COREI7_NEHALEM,
 M_INTEL_COREI7_WESTMERE,
@@ -30965,7 +31006,9 @@
 M_AMDFAM15H_BDVER1,
 M_AMDFAM15H_BDVER2,
 M_AMDFAM15H_BDVER3,
-M_AMDFAM15H_BDVER4
+M_AMDFAM15H_BDVER4,
+M_INTEL_COREI7_IVYBRIDGE,
+M_INTEL_CORE_HASWELL
   };
 
   static struct _arch_names_table
@@ -30983,16 +31026,24 @@
   {corei7, M_INTEL_COREI7},
   {nehalem, M_INTEL_COREI7_NEHALEM},
   {westmere, M_INTEL_COREI7_WESTMERE},
+  {corei7-avx, M_INTEL_COREI7_AVX},  
   {sandybridge, M_INTEL_COREI7_SANDYBRIDGE},
+  {ivybridge, M_INTEL_COREI7_IVYBRIDGE},
+  {core-avx2, M_INTEL_CORE_AVX2},
+  {haswell, M_INTEL_CORE_HASWELL},
+  {bonnell, M_INTEL_BONNELL},
+  {silvermont, M_INTEL_SILVERMONT},
   {amdfam10h, M_AMDFAM10H},
   {barcelona, M_AMDFAM10H_BARCELONA},
   {shanghai, M_AMDFAM10H_SHANGHAI},
   {istanbul, M_AMDFAM10H_ISTANBUL},
+  {bobcat, M_AMD_BOBCAT},  
   {amdfam15h, M_AMDFAM15H},
   {bdver1, M_AMDFAM15H_BDVER1},
   {bdver2, M_AMDFAM15H_BDVER2},
   {bdver3,

Re: [PATCH][x86] march aliases

2013-12-23 Thread H.J. Lu

On Mon, Dec 23, 2013 at 5:10 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Sun, Dec 22, 2013 at 11:11:12PM +0100, Uros Bizjak wrote:
 On Sun, Dec 22, 2013 at 8:28 PM, H.J. Lu hjl.to...@gmail.com wrote:

  Perhaps we should add sandybridge, ivybridge and haswell 
  aliases for
  corei7-avx, core-avx-i, core-avx2?  I mean, it is a nightmare 
  to remember
  which one has the i7 in and which doesn't even for me.

 Yes please, I think this is a good idea.

 I've added aliases for haswell, sandybridge, ivybridge, bonnell,
 nehalem and silvermont.

 Old names, like corei7, core-avx-i, atom, .. don't have precise
 description for the processor.  I think gcc driver should keep
 accepting them.  But they should be marked as undocumented
 or deprecated.  They should be removed from documentation.
   
How about we leave these as -march=... to refer to the architecture,
and reintroduce -mcpu= to refer to the exact cpu? Internally, the
-mcpu would use some architecture specific base PTA_ attributes (as
Jakub suggested) and would add some fine-tuning PTA_ attributes, based
on -mcpu selection. This way, -march stays as is, and can still be
used for some generally distributed binaries.
  
   -mcpu is problematic, because it means various things among different
   targets, and even on i?86/x86_64 it used to mean something already in 
   the
   past.  Sometimes -mcpu= is what -march= is now on i?86/x86_64, sometimes
   what -mtune= is.  I'd say we don't need to deprecate anything, just add 
   new
   aliases for the sometimes harder to remember names.  But everything just
   IMHO.
  
   Jakub
 
  There are many problems with the current -march=xxx/-mtune=xxx for
  Intel processors, which aren't faults of GCC:
 
  1. Atom processors can be Bonnell or Silvermont processors.  -mtune=atom
  may not optimize for the Atom CPU being targeted.
  2. Core I7 processors can be Nehalem, Westmere, Sandy Bridge, Ivy Bridge,
  Haswell or Broadwell. It is hard to tell which -mtune= to use for saying
  Core i7-3820QM.
  3. There are Core i3/i5, Xeon, Celeron, Pentium processors which aren't
  called Core I7.  They may be Nehalem, Westmere, Sandy Bridge, Ivy Bridge,
  Haswell or even Silvermont.
 
  We should move away from corei7, corei7-avx, core-avx-i, core-avx2, atom.
  Instead, we should use the actual processor names.  We must accept those
  old names.  But we should remove them from GCC manual to avoid any
  confusions.  This patch adds -march=/mtune={nehalem,westmere,sandybridge,
  ivybridge,haswell,broadwell,bonnell,silvermont}.  It also adds
  --with-arch=/--with-cpu= support as well as adds ivybridge, haswell,
  bonnell, silvermont to multi-arch function versioning.
 
  Any comments?
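The renaming proposed above amounts to a small alias table. The sketch below is one reading of the thread and ChangeLog (corei7 to nehalem, corei7-avx to sandybridge, core-avx-i to ivybridge, core-avx2 to haswell, atom to bonnell, slm to silvermont); the table and `canonical_cpu_name` are illustrative only and do not reflect how i386.c stores processor names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cpu_alias
{
  const char *legacy;   /* old -march=/-mtune= spelling */
  const char *uarch;    /* microarchitecture name proposed in the thread */
};

static const struct cpu_alias cpu_aliases[] = {
  { "corei7",     "nehalem" },
  { "corei7-avx", "sandybridge" },
  { "core-avx-i", "ivybridge" },
  { "core-avx2",  "haswell" },
  { "atom",       "bonnell" },
  { "slm",        "silvermont" },
};

/* Maps a legacy name to its microarchitecture name; any other name
   is returned unchanged, so old and new spellings are both accepted.  */
const char *canonical_cpu_name (const char *name)
{
  size_t i;
  for (i = 0; i < sizeof cpu_aliases / sizeof cpu_aliases[0]; i++)
    if (strcmp (name, cpu_aliases[i].legacy) == 0)
      return cpu_aliases[i].uarch;
  return name;
}
```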
 
  This is the updated patch to add PTA_XXX as well as fix
 
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59580
 
  to properly check --with-arch=/--with-cpu= options. Now we only need
  to add a new processor to x86_64_archs, which will enable its
  --with-arch=/--with-cpu= support.
 
 
  H.J.
  ---
  gcc/
 
  2013-12-22   H.J. Lu  hongjiu...@intel.com
   Tocar Ilya  ilya.to...@intel.com
 
  * config.gcc (x86_archs): New variable.
  (x86_64_archs): Likewise.
  (x86_cpus): Likewise.
  Use $x86_archs, $x86_64_archs and $x86_cpus to check valid
  --with-arch/--with-cpu= options.
  Support --with-arch=/--with-cpu={nehalem,westmere,
  sandybridge,ivybridge,haswell,broadwell,bonnell,silvermont}.
 
  * config/i386/core2.md: Replace corei7 with nehalem.
 
  * config/i386/driver-i386.c (host_detect_local_cpu): Use nehalem,
  westmere, sandybridge, ivybridge, haswell, bonnell, silvermont
  for cpu names.
 
  * config/i386/i386-c.c (ix86_target_macros_internal): Replace
  PROCESSOR_COREI7, PROCESSOR_COREI7_AVX, PROCESSOR_ATOM,
  PROCESSOR_SLM with PROCESSOR_NEHALEM, PROCESSOR_SANDYBRIDGE,
  PROCESSOR_BONNELL, PROCESSOR_SILVERMONT.  Define
  __nehalem/__nehalem__, __sandybridge/__sandybridge__,
  __haswell/__haswell__, __tune_nehalem__, __tune_sandybridge__,
  __tune_haswell__, __bonnell/__bonnell__,
  __silvermont/__silvermont__, __tune_bonnell__,
  __tune_silvermont__.
 
  * config/i386/i386.c (m_COREI7): Renamed to ...
  (m_NEHALEM): This.
  (m_COREI7_AVX): Renamed to ...
  (m_SANDYBRIDGE): This.
  (m_ATOM): Renamed to ...
  (m_BONNELL): This.
  (m_SLM): Renamed to ...
  (m_SILVERMONT): This.
  (m_CORE_ALL): Updated.
  (cpu_names): Add nehalem, westmere, sandybridge,
  ivybridge, haswell, broadwell, bonnell, silvermont.
  (PTA_CORE2): New.
  (PTA_NEHALEM): Likewise.
  (PTA_WESTMERE): Likewise.
  (PTA_SANDYBRIDGE): Likewise.
  (PTA_IVYBRIDGE): Likewise.
  (PTA_HASWELL):

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-23 Thread H.J. Lu

On Mon, Dec 23, 2013 at 8:57 AM, Allan Sandfeld Jensen
carew...@gmail.com wrote:
 On Monday 23 December 2013, Allan Sandfeld Jensen wrote:
 On Monday 23 December 2013, H.J. Lu wrote:
  On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote:
   On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
 Sorry, I must have been looking at an older version, but as I said
 I already did enable it in the latest patch. (see
 http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )
   
Sorry for causing another revision but we would like to stick with
btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore the
changes would be like
  
   I will need to make an updated patch to move the new ISAs to the end of
   the list anyway. I will send it in a few days to give AMD or Intel
   developers time to comment on the current version.
 
  I renamed Intel processor names. Please update your patch.  Here is my
  patch to add more Intel processor support.  You can add it to your
  patch.

 Updated patch attached. Rebased, fixed coding style, moved new ISA enums to
 the end and applied H.J.Lu's patch.

 Fixed merging mistake that left haswell with SSE4_2 priority.

 `Allan

+M_INTEL_COREI7_AVX,
+M_INTEL_CORE_AVX2,

Do we need them?   M_INTEL_COREI7_AVX is the same
M_INTEL_COREI7_SANDYBRIDGE and M_INTEL_CORE_AVX2
is the same as M_INTEL_COREI7_HASWELL.

+M_INTEL_CORE_HASWELL

Please change M_INTEL_CORE_HASWELL to M_INTEL_COREI7_HASWELL.

+  {corei7-avx, M_INTEL_COREI7_AVX},
+  {core-avx2, M_INTEL_CORE_AVX2},

Why do we need them?

-- 
H.J.

[COMMITTED] RE: [PATCH] [followup to PR59569] new vect tests for store with negative step

2013-12-23 Thread Bingfeng Mei

Thanks. Committed with suggested change. 

Merry Christmas!
Bingfeng

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com] 
Sent: 23 December 2013 16:48
To: Bingfeng Mei
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [followup to PR59569] new vect tests for store with 
negative step

On Mon, Dec 23, 2013 at 04:43:17PM +, Bingfeng Mei wrote:
 Here are two vectorization tests for store with negative step. This is
 follow-up to PR59569 fix, which contains two tests for ICE.  These tests
 are for vectorization tests and executable.  OK to commit?

--- testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0)
+++ testsuite/gcc.dg/vect/vect-neg-store-1.c(revision 0)
@@ -0,0 +1,27 @@
+/* { dg-require-effective-target vect_int } */
+#include <stdlib.h>
+
+__attribute__((noinline, noclone))
+void test1(short x[128])
+{
+int i;
+for (i=127; i>=0; i--) {
+   x[i] = 1234;
+}
+}
+
+int main (void)
+{
+  short x[128];
+  int i;
+  test1 (x);
+  
+  for (i = 0; i < 128; i++)
+   if (x[i] != 1234)
+ abort ();

Can you please change both tests so that the x array is say 128+32 elements
long instead of 128, you store some other pattern to the first 16 and last
16 elements in the array before calling test1 (do it say with asm ("");
inside of the loop to avoid vectorization), call test1 on x + 16 and
afterwards verify that test1 didn't write anything before or after the
buffer?  Ok with that change.

Jakub

patch_vect_tests
Description: patch_vect_tests

Re: [PATCH 14/16] tree-ssa-loop-niter.c: use gimple_phi in a few places

2013-12-23 Thread David Malcolm

On Fri, 2013-12-13 at 12:13 -0500, Andrew MacLeod wrote:
 On 12/13/2013 10:58 AM, David Malcolm wrote:
{
  gimple stmt = SSA_NAME_DEF_STMT (x);
  @@ -2162,7 +2162,7 @@ chain_of_csts_start (struct loop *loop, tree x)
  if (gimple_code (stmt) == GIMPLE_PHI)
{
  if (bb == loop->header)
  -   return stmt;
  +   return stmt->as_a_gimple_phi ();

  return NULL;
}
  @@ -2195,10 +2195,10 @@ chain_of_csts_start (struct loop *loop, tree x)

   If such phi node exists, it is returned, otherwise NULL is returned.  
  */
 
 
 I dislike separating the checking of gimple_code () and the following 
 as_a.   I rather envisioned this sort of thing as being more of an 
 abstraction improvement if we never have to check gimple_code()...   
 Then you are also less locked into a specific implementation.
 
 So something more like:
 
 if (gimple_phi phi = stmt-dyncast_gimple_phi ())
{
   if (bb == loop-header)
 return phi;
}
 
 
 IMO anyway...

Thanks.

My goal is to use these stronger types (a) to move type-checking to
compile-time and (b) to (I hope) improve the readability of the code.
I'm not trying to switch away from gimple_code for the home-grown RTTI
per se.

However, given that you prefer the above style, I'm now opting to use
dyn_cast for the above kind of test in my ongoing work on this.
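
As a rough illustration of the two styles being discussed, here is a minimal sketch using a toy statement hierarchy. The names stmt_base, phi_stmt, as_a_phi and dyn_cast_phi are hypothetical stand-ins, not the GCC types or the API from the patch series.

```cpp
// Toy home-grown RTTI: a code field identifies the concrete statement kind.
enum stmt_code { CODE_PHI, CODE_CALL };

struct stmt_base { stmt_code code; };
struct phi_stmt : stmt_base { int num_args; };

// Style 1: the caller checks the code first, then downcasts separately.
phi_stmt *as_a_phi(stmt_base *s)
{
  return static_cast<phi_stmt *>(s);   // only valid if s->code == CODE_PHI
}

// Style 2: check and downcast in one step; yields null on a mismatch.
phi_stmt *dyn_cast_phi(stmt_base *s)
{
  return (s && s->code == CODE_PHI) ? static_cast<phi_stmt *>(s) : nullptr;
}

// With style 2 the caller never inspects the code field directly.
int phi_args_or_minus_one(stmt_base *s)
{
  if (phi_stmt *phi = dyn_cast_phi(s))
    return phi->num_args;
  return -1;
}
```

The second form keeps the test and the cast together, so the stronger type is only in scope where the check has succeeded.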

The other consideration is that I'm trying to minimize the invasiveness
of the patches, to reduce the number of conflicts that will occur when
trying to merge this (for next stage1).  So I'm sometimes tactically
avoiding some constructs, e.g. to avoid needing to reindent large
suites.

FWIW I'm currently at 90 patches, and have reached some kind of halfway
point, with 162 gimple_foo_ access functions now taking a more concrete
type than gimple [1]; 159 to go.  That said, I think these accessors
are something of a surface detail - I'm more interested in such
concretizing of types *throughout* the middle-end, rather than just
focusing on the gimple_foo_ access functions; for example, I now have
the callgraph edge statements being gimple_call rather than just
gimple.  It's the latter kind of deeper change to typesafety that I'm
most excited about.

Andrew: hopefully this is all compatible with your proposed changes to
types and expressions?  I'm trying to just touch the statements
themselves.

Dave

[1] including all of gimple_asm_*, gimple_bind_*, gimple_catch_*,
gimple_eh_dispatch_*, gimple_eh_else_*, gimple_omp_atomic_load_*,
gimple_omp_atomic_store_*, gimple_omp_continue_*, gimple_resx_*,
gimple_switch_*, gimple_transaction_*.

Re: [PATCH] Fix for PR59585

2013-12-23 Thread Yuri Gribov

Re-sending as plaintext.

Jakub wrote:
> Or is it stripping just the final newline at the end of output?

Exactly.

> Still sounds like a bug elsewhere to me.

Let me investigate this deeper tomorrow (rebuilding fresh Dg, etc.).
If it indeed turns out to be a feature of current DejaGNU, a workaround
may be the easiest solution.

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-23 Thread Allan Sandfeld Jensen

On Monday 23 December 2013, H.J. Lu wrote:
> On Mon, Dec 23, 2013 at 8:57 AM, Allan Sandfeld Jensen
> carew...@gmail.com wrote:
> > On Monday 23 December 2013, Allan Sandfeld Jensen wrote:
> > > On Monday 23 December 2013, H.J. Lu wrote:
> > > > On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote:
> > > > > On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
> > > > > > > Sorry, I must have been looking at an older version, but as I
> > > > > > > said I already did enable it in the latest patch. (see
> > > > > > > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )
> > > > > >
> > > > > > Sorry for causing another revision but we would like to stick with
> > > > > > btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore
> > > > > > the changes would be like
> > > > >
> > > > > I will need to make an updated patch to move the new ISAs to the end
> > > > > of the list anyway. I will send it in a few days to give AMD or
> > > > > Intel developers time to comment on the current version.
> > > >
> > > > I renamed Intel processor names. Please update your patch.  Here is my
> > > > patch to add more Intel processor support.  You can add it to your
> > > > patch.
> > >
> > > Updated patch attached. Rebased, fixed coding style, moved new ISA enums
> > > to the end and applied H.J.Lu's patch.
> >
> > Fixed merging mistake that left haswell with SSE4_2 priority.
> >
> > `Allan
>
> +M_INTEL_COREI7_AVX,
> +M_INTEL_CORE_AVX2,
>
> Do we need them?   M_INTEL_COREI7_AVX is the same as
> M_INTEL_COREI7_SANDYBRIDGE and M_INTEL_CORE_AVX2
> is the same as M_INTEL_COREI7_HASWELL.
 
M_INTEL_COREI7_AVX is the common model for both sandybridge and ivybridge. 
Matching PROCESSOR_SANDYBRIDGE, or march=corei7-avx. Similarly 
M_INTEL_CORE_AVX2 is the common model for haswell and broadwell, matching 
PROCESSOR_HASWELL or march=core-avx2.

> +M_INTEL_CORE_HASWELL
>
> Please change M_INTEL_CORE_HASWELL to M_INTEL_COREI7_HASWELL.
 
I used the name core_haswell to make its prefix match that of its model 
core_avx2 (as opposed to corei7_avx for instance).

> +  {"corei7-avx", M_INTEL_COREI7_AVX},
> +  {"core-avx2", M_INTEL_CORE_AVX2},
>
> Why do we need them?

Without the existence of these entries, __attribute__((target("corei7-avx")))
or __attribute__((target("core-avx2"))) failed to compile because of how
parameters to attributes were verified.

Regards
`Allan

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-23 Thread H.J. Lu

On Mon, Dec 23, 2013 at 10:33 AM, Allan Sandfeld Jensen
carew...@gmail.com wrote:
> On Monday 23 December 2013, H.J. Lu wrote:
> > On Mon, Dec 23, 2013 at 8:57 AM, Allan Sandfeld Jensen
> > carew...@gmail.com wrote:
> > > On Monday 23 December 2013, Allan Sandfeld Jensen wrote:
> > > > On Monday 23 December 2013, H.J. Lu wrote:
> > > > > On Thu, Dec 19, 2013 at 11:20:39AM +0100, Allan Sandfeld Jensen wrote:
> > > > > > On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
> > > > > > > > Sorry, I must have been looking at an older version, but as I
> > > > > > > > said I already did enable it in the latest patch. (see
> > > > > > > > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )
> > > > > > >
> > > > > > > Sorry for causing another revision but we would like to stick with
> > > > > > > btver1 and btver2 rather than BOBCAT or JAGUAR. Therefore
> > > > > > > the changes would be like
> > > > > >
> > > > > > I will need to make an updated patch to move the new ISAs to the end
> > > > > > of the list anyway. I will send it in a few days to give AMD or
> > > > > > Intel developers time to comment on the current version.
> > > > >
> > > > > I renamed Intel processor names. Please update your patch.  Here is my
> > > > > patch to add more Intel processor support.  You can add it to your
> > > > > patch.
> > > >
> > > > Updated patch attached. Rebased, fixed coding style, moved new ISA enums
> > > > to the end and applied H.J.Lu's patch.
> > >
> > > Fixed merging mistake that left haswell with SSE4_2 priority.
> > >
> > > `Allan
> >
> > +M_INTEL_COREI7_AVX,
> > +M_INTEL_CORE_AVX2,
> >
> > Do we need them?   M_INTEL_COREI7_AVX is the same as
> > M_INTEL_COREI7_SANDYBRIDGE and M_INTEL_CORE_AVX2
> > is the same as M_INTEL_COREI7_HASWELL.
>
> M_INTEL_COREI7_AVX is the common model for both sandybridge and ivybridge.
> Matching PROCESSOR_SANDYBRIDGE, or march=corei7-avx. Similarly
> M_INTEL_CORE_AVX2 is the common model for haswell and broadwell, matching
> PROCESSOR_HASWELL or march=core-avx2.

If you use

{"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
{"core-avx2", M_INTEL_COREI7_HASWELL},

will it cause any problems?  When there are both

__attribute__((target("corei7-avx")))

and

__attribute__((target("sandybridge")))

we should either issue an error or silently drop

__attribute__((target("corei7-avx")))

instead of generating 2 identical copies of the same
function.
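
To make the aliasing concern concrete, here is a minimal sketch of a name table in which two -march style names share one model, plus a check that two requested targets would produce the same function version. The enum processor_model, the table arch_names and both helpers are hypothetical and only loosely modelled on the entries quoted above; they are not the code in the patch.

```cpp
#include <cstring>
#include <cstddef>

// Hypothetical model numbers; 0 is reserved for "unknown name".
enum processor_model { M_UNKNOWN = 0, M_SANDYBRIDGE, M_HASWELL };

struct arch_name { const char *name; processor_model model; };

// Old and new names map onto the same model, as suggested in the review.
static const arch_name arch_names[] = {
  {"sandybridge", M_SANDYBRIDGE},
  {"corei7-avx",  M_SANDYBRIDGE},
  {"haswell",     M_HASWELL},
  {"core-avx2",   M_HASWELL},
};

// Linear lookup; returns M_UNKNOWN when the name is not in the table.
processor_model lookup_model(const char *name)
{
  for (std::size_t i = 0; i < sizeof arch_names / sizeof arch_names[0]; i++)
    if (std::strcmp(arch_names[i].name, name) == 0)
      return arch_names[i].model;
  return M_UNKNOWN;
}

// True when two target strings would select the same function version,
// i.e. the case where an error or a silent drop would be appropriate.
bool same_version(const char *a, const char *b)
{
  processor_model ma = lookup_model(a);
  return ma != M_UNKNOWN && ma == lookup_model(b);
}
```

With such a table, same_version ("corei7-avx", "sandybridge") is true, which is exactly the duplicate that should be diagnosed or dropped.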

> > +M_INTEL_CORE_HASWELL
> >
> > Please change M_INTEL_CORE_HASWELL to M_INTEL_COREI7_HASWELL.
>
> I used the name core_haswell to make its prefix match that of its model
> core_avx2 (as opposed to corei7_avx for instance).

We should remove all internal references to corei7-avx and
core-avx2 if possible.

> > +  {"corei7-avx", M_INTEL_COREI7_AVX},
> > +  {"core-avx2", M_INTEL_CORE_AVX2},
> >
> > Why do we need them?
>
> Without the existence of these entries, __attribute__((target("corei7-avx")))
> or __attribute__((target("core-avx2"))) failed to compile because of how
> parameters to attributes were verified.


-- 
H.J.

Fix use of stack-pointer-register as a temporary for CRIS

2013-12-23 Thread Hans-Peter Nilsson

The circumstances are a bit odd; the stack-pointer (sp) is never
the target for a direct assignment in ordinary generated code.
Still, this happens for gcc.dg/pr50251.c, calling
__builtin_stack_restore.  There's a bug in several define_splits
in the CRIS port, in that the destination of the split insn is
used as a temporary, so sp is set to something unusable as a
stack-pointer.  You don't want that in a context where
interrupts use the same stack as the running program; there's no
red-zone or anything.  Though, it *would* be valid for contexts
where the user stack is not the system (interrupt) stack, but
introducing that distinction is not worthwhile.

I'll mark this with middle-end/59584 only because it makes a
nice test-case should anyone want to work on the general bug
noticed there (revert the commit locally, observe ICE for
gcc.dg/pr50251.c).  The general bug for PR59584 is that GCC
can't handle fixing up the REG_ARGS_SIZE note being on a direct
assignment to the stack-pointer, therefore no define_split must
match it.  This patch just removes the define_split; the bug is
likely to hit other targets, when __builtin_stack_restore is
called.

PS. I wish we had a name field for define_splits...  I don't
think a string would collide, syntactically.  Maybe later.

Tested cris-elf, makes gcc.dg/pr50251.c pass again.

PR middle-end/59584
* config/cris/predicates.md (cris_nonsp_register_operand):
New define_predicate.
* config/cris/cris.md: Replace register_operand with
cris_nonsp_register_operand for destinations in all
define_splits where a register is set more than once.

Index: gcc/config/cris/cris.md
===================================================================
--- gcc/config/cris/cris.md (revision 206176)
+++ gcc/config/cris/cris.md (working copy)
@@ -758,7 +758,7 @@ (define_split
  (match_operand:SI 1 "const_int_operand" ""))
 (match_operand:SI 2 "register_operand" ""))])
  (match_operand 3 "register_operand" ""))
- (set (match_operand:SI 4 "register_operand" "")
+ (set (match_operand:SI 4 "cris_nonsp_register_operand" "")
  (plus:SI (mult:SI (match_dup 0)
(match_dup 1))
   (match_dup 2)))])]
@@ -859,7 +859,7 @@ (define_split
 (match_operand:SI 0 "cris_bdap_operand" "")
 (match_operand:SI 1 "cris_bdap_operand" ""))])
  (match_operand 2 "register_operand" ""))
- (set (match_operand:SI 3 "register_operand" "")
+ (set (match_operand:SI 3 "cris_nonsp_register_operand" "")
  (plus:SI (match_dup 0) (match_dup 1)))])]
   "reload_completed && reg_overlap_mentioned_p (operands[3], operands[2])"
   [(set (match_dup 4) (match_dup 2))
@@ -3960,7 +3960,7 @@ (define_expand "casesi"
 ;; up.
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 4 "cris_operand_extend_operator"
 [(match_operand 1 "register_operand" "")
@@ -3990,7 +3990,7 @@ (define_split
 ;; Call this op-extend-split-rx=rz
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 4 "cris_plus_or_bound_operator"
 [(match_operand 1 "register_operand" "")
@@ -4018,7 +4018,7 @@ (define_split
 ;; Call this op-extend-split-swapped
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 4 "cris_plus_or_bound_operator"
 [(match_operator
@@ -4044,7 +4044,7 @@ (define_split
 ;; bound.  Call this op-extend-split-swapped-rx=rz.
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 4 "cris_plus_or_bound_operator"
 [(match_operator
@@ -4075,7 +4075,7 @@ (define_split
 ;; Call this op-extend.
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 3 "cris_orthogonal_operator"
 [(match_operand 1 "register_operand" "")
@@ -4099,7 +4099,7 @@ (define_split
 ;; Call this op-split-rx=rz
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 3 "cris_commutative_orth_op"
 [(match_operand 2 "memory_operand" "")
@@ -4123,7 +4123,7 @@ (define_split
 ;; Call this op-split-swapped.
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 3 "cris_commutative_orth_op"
 [(match_operand 1 "register_operand" "")
@@ -4146,7 +4146,7 @@ (define_split
 ;; Call this op-split-swapped-rx=rz.
 
 (define_split
-  [(set (match_operand 0 "register_operand" "")
+  [(set (match_operand 0 "cris_nonsp_register_operand" "")
(match_operator
 3

Committed: fix PR target/59203, typo in cris.c

2013-12-23 Thread Hans-Peter Nilsson

Spotted by David Binderman and cppcheck, thanks.  The
interesting cases wouldn't be exposed by a cris-elf build, but I
made a regtest-run nonetheless: the fix has actually been in our
local tree for quite some time together with TLS for CRIS v32 so
I'm not worried about fallout.  (Upstreaming that?  Hm... one
excuse I use is that I've been waiting for TLS for CRIS v10 to
materialize for the Linux kernel, along the v32 lines but using
$IRP, but that never happened.)

PR target/59203
* config/cris/cris.c (cris_pic_symbol_type_of): Fix typo,
checking t1 twice instead of t1 and t2 respectively.

Index: gcc/config/cris/cris.c
===================================================================
--- gcc/config/cris/cris.c  (revision 206176)
+++ gcc/config/cris/cris.c  (working copy)
@@ -2493,7 +2493,7 @@ cris_pic_symbol_type_of (const_rtx x)
 
gcc_assert (t1 == cris_no_symbol || t2 == cris_no_symbol);
 
-   if (t1 == cris_got_symbol || t1 == cris_got_symbol)
+   if (t1 == cris_got_symbol || t2 == cris_got_symbol)
  return cris_got_symbol_needing_fixup;
 
return t1 != cris_no_symbol ? t1 : t2;

brgds, H-P
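
For readers skimming the diff, the effect of the typo can be shown with a small stand-alone sketch. The enum symbol_kind and the two helper functions are hypothetical simplifications of the logic in the hunk above, not the actual cris.c code.

```cpp
// Hypothetical stand-ins for the symbol kinds handled in cris.c.
enum symbol_kind {
  no_symbol,
  got_symbol,
  gotrel_symbol,
  got_symbol_needing_fixup
};

// Before the fix: t1 is tested twice, so a GOT symbol in t2 is missed.
symbol_kind combine_buggy(symbol_kind t1, symbol_kind t2)
{
  if (t1 == got_symbol || t1 == got_symbol)
    return got_symbol_needing_fixup;
  return t1 != no_symbol ? t1 : t2;
}

// After the fix: either operand being a GOT symbol requests the fixup.
symbol_kind combine_fixed(symbol_kind t1, symbol_kind t2)
{
  if (t1 == got_symbol || t2 == got_symbol)
    return got_symbol_needing_fixup;
  return t1 != no_symbol ? t1 : t2;
}
```

The two versions differ only when the GOT symbol is the second operand: the buggy one returns the plain symbol kind, the fixed one returns the kind that needs a fixup.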

C++ PATCH for c++/59349 (ICE with empty lambda init-capture initializer)

2013-12-23 Thread Jason Merrill

We need to handle getting NULL_TREE for the capture initializer, so that 
we don't crash when trying to do things like look at its type.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 135f0f322516ce986ed13a214ca9351bd1963749
Author: Jason Merrill ja...@redhat.com
Date:   Mon Dec 23 15:05:00 2013 -0500

	PR c++/59349
	* parser.c (cp_parser_lambda_introducer): Handle empty init.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 2a2cbf0..4ef0f05 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -8898,6 +8898,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
 	  capture_init_expr = cp_parser_initializer (parser, &direct,
 		 &non_constant);
 	  explicit_init_p = true;
+	  if (capture_init_expr == NULL_TREE)
+	{
+	  error ("empty initializer for lambda init-capture");
+	  capture_init_expr = error_mark_node;
+	}
 	}
   else
 	{
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C b/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C
new file mode 100644
index 0000000..ad152cf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-init7.C
@@ -0,0 +1,6 @@
+// PR c++/59349
+// { dg-options "-std=c++1y" }
+
+int foo () {
+  [bar()]{};			// { dg-error "empty initializer" }
+}

C++ PATCH for c++/59271 (ICE with polymorphic lambda and VLA)

2013-12-23 Thread Jason Merrill

This testcase was crashing in strip_typedefs because it uses 
build_cplus_array_type, while the original type was built with the 
generic build_array_type, and the two functions work differently within 
a template such that we violated an assert in strip_typedefs.  Fixed by 
using build_cplus_array_type consistently.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 9e0c771b79ce3c143fffae2fd09ecdc6f88041d9
Author: Jason Merrill ja...@redhat.com
Date:   Mon Dec 23 15:20:38 2013 -0500

	PR c++/59271
	* lambda.c (build_capture_proxy): Use build_cplus_array_type.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 24aa2c5..bd8df1d 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -377,8 +377,8 @@ build_capture_proxy (tree member)
   tree ptr = build_simple_component_ref (object, field);
   field = next_initializable_field (DECL_CHAIN (field));
   tree max = build_simple_component_ref (object, field);
-  type = build_array_type (TREE_TYPE (TREE_TYPE (ptr)),
-			   build_index_type (max));
+  type = build_cplus_array_type (TREE_TYPE (TREE_TYPE (ptr)),
+ build_index_type (max));
   type = build_reference_type (type);
   REFERENCE_VLA_OK (type) = true;
   object = convert (type, ptr);
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C
new file mode 100644
index 0000000..556722c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-vla1.C
@@ -0,0 +1,24 @@
+// PR c++/59271
+// { dg-options "-std=c++1y" }
+
+extern "C" int printf (const char *, ...);
+
+void f(int n)
+{
+  int  a[n];
+
+  for (auto& i : a)
+{
+  i = &i - a;
+}
+
+  [&a] (auto m)
+{
+  for (auto i : a)
+	{
+	  printf ("%d", i);
+	}
+
+  return m;
+};
+}

[PING][GOMP4][PATCH] SIMD-enabled functions (formerly Elemental functions) for C++

2013-12-23 Thread Iyer, Balaji V

Ping!

-Balaji V. Iyer.

 -----Original Message-----
 From: Iyer, Balaji V
 Sent: Thursday, December 19, 2013 1:12 PM
 To: Jakub Jelinek
 Cc: 'Aldy Hernandez (al...@redhat.com)'; 'gcc-patches@gcc.gnu.org'
 Subject: RE: [GOMP4][PATCH] SIMD-enabled functions (formerly Elemental
 functions) for C++

 Hi Jakub,
   Attached, please find a fixed patch. I have answered your questions
 below. Is this OK for trunk?

 Here are the ChangeLog entries:
 Gcc/cp/ChangeLog
 2013-12-19  Balaji V. Iyer  balaji.v.i...@intel.com

 * parser.c (cp_parser_direct_declarator): When Cilk Plus is enabled
 see if there is an attribute after function decl.  If so, then
 parse them now.
 (cp_parser_late_return_type_opt): Handle parsing of Cilk Plus SIMD
 enabled function late parsing.
 (cp_parser_gnu_attribute_list): Parse all the tokens for the vector
 attribute for a SIMD-enabled function.
 (cp_parser_omp_all_clauses): Skip parsing to the end of pragma when
 the function is used by SIMD-enabled function (indicated by NULL
 pragma token).   Added 3 new clauses: PRAGMA_CILK_CLAUSE_MASK,
 PRAGMA_CILK_CLAUSE_NOMASK and
 PRAGMA_CILK_CLAUSE_VECTORLENGTH
 (cp_parser_cilk_simd_vectorlength): Modified this function to handle
 vectorlength clause in SIMD-enabled function and #pragma SIMD's
 vectorlength clause.  Added a new bool parameter to differentiate
 between the two.
 (cp_parser_cilk_simd_fn_vector_attrs): New function.
 (is_cilkplus_vector_p): Likewise.
 (cp_parser_late_parsing_elem_fn_info): Likewise.
 (cp_parser_omp_clause_name): Added a check for mask, nomask
 and vectorlength clauses when Cilk Plus is enabled.
 (cp_parser_omp_clause_linear): Added a new parameter of type bool
 and emit a sorry message when step size is a parameter.
 * parser.h (cp_parser::cilk_simd_fn_info): New field.

 Testsuite/ChangeLog
 2013-12-19  Balaji V. Iyer  balaji.v.i...@intel.com

 * g++.dg/cilk-plus/cilk-plus.exp: Called the C/C++ common tests for
 SIMD enabled function.
 * g++.dg/cilk-plus/ef_test.C: New test.
 * c-c++-common/cilk-plus/vlength_errors.c: Added new dg-error tags
 to differentiate C error messages from C++ ones.

 Thanks,

 Balaji V. Iyer.

  -----Original Message-----
  From: Jakub Jelinek [mailto:ja...@redhat.com]
  Sent: Thursday, December 19, 2013 2:23 AM
  To: Iyer, Balaji V
  Cc: 'Aldy Hernandez (al...@redhat.com)'; 'gcc-patches@gcc.gnu.org'
  Subject: Re: [GOMP4][PATCH] SIMD-enabled functions (formerly
 Elemental
  functions) for C++

  On Wed, Dec 18, 2013 at 11:36:04PM +0000, Iyer, Balaji V wrote:
   --- a/gcc/cp/decl2.c
   +++ b/gcc/cp/decl2.c
   @@ -1124,6 +1124,10 @@ is_late_template_attribute (tree attr, tree decl)
         && is_attribute_p ("omp declare simd", name))
       return true;

   +  /* Ditto as above for Cilk Plus SIMD-enabled function attributes.  */
   +  if (flag_enable_cilkplus && is_attribute_p ("cilk simd function", name))
   +    return true;

  Why?  It doesn't have any argument, why it should be processed late?

 Fixed.

   @@ -17097,6 +17102,14 @@ cp_parser_direct_declarator (cp_parser*
   parser,

   attrs = cp_parser_std_attribute_spec_seq (parser);

   +   /* In here, we handle cases where attribute is used after
   +  the function declaration.  For example:
   +  void func (int x) __attribute__((vector(..)));  */
   +   if (flag_enable_cilkplus
   +       && cp_lexer_next_token_is_keyword (parser->lexer,
   +  RID_ATTRIBUTE))
   + attrs = chainon (cp_parser_gnu_attributes_opt (parser),
   +  attrs);
   late_return = (cp_parser_late_return_type_opt
  (parser, declarator,
   memfn ? cv_quals : -1));

  Doesn't this change the grammar (for all attributes, not just Cilk+
  specific
  ones) just based on whether -fcilkplus has been specified or not?

 OK. Fixed this by making it parse tentatively (sort of similar to how you
 parse attributes after labels (line #9584))

   @@ -17820,10 +17833,14 @@ cp_parser_late_return_type_opt
  (cp_parser* parser, cp_declarator *declarator,
   && declarator
   && declarator->kind == cdk_id);

   +  bool cilk_simd_fn_vector_p = (parser->cilk_simd_fn_info
   + && declarator
   + && declarator->kind == cdk_id);

  Formatting looks wrong, put = on the next line and align && right
  below parser.

 Fixed.

   +
   +cp_omp_declare_simd_data info;

  Global var?  Why?  Isn't heap or GC allocation better?

 Fixed. Replaced it with XNEW and XDELETE combinations instead of setting
 the address of a global value.

   +  /* The vectorlength clause

Re: [PATCH i386 4/8] [AVX512] [2/n] Add substed patterns: mask scalar subst.

2013-12-23 Thread Kirill Yukhin

> Patch attached.
>
> Ok for trunk?


Just noticed Uros's input about predicates. So, ok with fix of predicate?

Re: [PATCH] Fix for PR59585

2013-12-23 Thread Yury Gribov


Yury wrote:
> > Still sounds like a bug elsewhere to me.
>
> Let me investigate this deeper tomorrow (rebuilding fresh Dg, etc.).

So I've double-checked that this is a problem with trunk DejaGNU rsh.exp 
script removing trailing newline from test output:


    # Delete one trailing \n because that is what `exec' will do and we want
    # to behave identical to it.
    regsub "\n$" $output "" output

I can report this to the DejaGNU mailing list, but even if they agree to
fix it we'll still have to do something about legacy Dg installations. I
suggest working around it by removing the trailing newline, as suggested by
the original patch (or maybe replacing it with $ ?).
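
To spell out the behaviour the test patterns have to cope with, here is a small sketch of the same transformation in C++. The function name strip_one_trailing_newline is made up for this example; it only mirrors what the quoted regsub does, namely dropping at most one newline at the very end of the output.

```cpp
#include <string>

// Mirrors the quoted rsh.exp step: remove a single trailing "\n", if any.
// Earlier newlines, including a second trailing one, are left untouched.
std::string strip_one_trailing_newline(std::string output)
{
  if (!output.empty() && output.back() == '\n')
    output.pop_back();
  return output;
}
```

A pattern in a test that insists on a final newline therefore cannot match output that went through this step, which is why dropping the trailing newline from the expected pattern works with both old and new installations.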


What's your opinion?

-Y

Re: [PATCH i386 4/8] [AVX512] [2/n] Add substed patterns: mask scalar subst.

2013-12-23 Thread Uros Bizjak

On Tue, Dec 24, 2013 at 5:57 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
>> Patch attached.
>>
>> Ok for trunk?
>
>
> Just noticed Uros's input about predicates. So, ok with fix of predicate?

Please retest and repost the patch with the predicate fix.

Looks good otherwise, with a couple of minor changes below:

diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vextractf32x4-2.c
b/gcc/testsuite/gcc.target/i386/avx512f-vextractf32x4-2.c
new file mode 100644
index 0000000..26d7c3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vextractf32x4-2.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512f -DAVX512F" } */

Please move defines from options to source.

+(define_subst_attr "round_prefix" "round" "vex" "evex")
 (define_subst_attr "round_mode512bit_condition" "round" "1"
"(GET_MODE (operands[0]) == V16SFmode || GET_MODE (operands[0]) ==
V8DFmode)")
 (define_subst_attr "round_modev4sf_condition" "round" "1" "(GET_MODE
(operands[0]) == V4SFmode)")

While here, can you also change conditions to static checks
(<MODE>mode == ) ?

Uros.
