date:20160617

[Bug inline-asm/71572] ICE on valid code on x86_64-linux-gnu: in force_constant_size, at gimplify.c:671

2016-06-17 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71572

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |inline-asm

--- Comment #1 from Andrew Pinski  ---
This might be invalid code ...

Re: [PATCH] Change PRED_LOOP_EXIT from 92 to 85.

2016-06-17 Thread Andrew Pinski

On Fri, Jun 17, 2016 at 7:29 AM, Martin Liška  wrote:
> Hello.
>
> After we've recently applied various changes (fixes) to predict.c, SPEC2006
> shows that PRED_LOOP_EXIT value should be amended.


This caused a 1% performance decrease on CoreMark on
aarch64-linux-gnu on ThunderX.

Thanks,
Andrew

>
> Survives regression tests & bootstrap on x86_64-linux.
> Pre-approved by Honza, installed as r237556.
>
> Thanks,
> Martin

[Bug c++/71577] New: ICE on invalid C++11 code (with extra struct initializer) on x86_64-linux-gnu: in digest_init_r, at cp/typeck2.c:1117

2016-06-17 Thread su at cs dot ucdavis.edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71577

Bug ID: 71577
   Summary: ICE on invalid C++11 code (with extra struct
initializer) on x86_64-linux-gnu: in digest_init_r, at
cp/typeck2.c:1117
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

The following C++11 code causes an ICE when compiled with the current GCC trunk
on x86_64-linux-gnu in both 32-bit and 64-bit modes.  

It is a regression from 6.1.x. 


$ g++-trunk -v
Using built-in specs.
COLLECT_GCC=g++-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160617 (experimental) [trunk revision 237557] (GCC) 
$ 
$ g++-6.1 -c small.cpp
small.cpp:1:36: error: too many initializers for ‘’
 struct { int a; } s1, s2 = { s1, 0 };
^
small.cpp:1:36: error: cannot convert ‘’ to ‘int’ in
initialization
$ 
$ g++-trunk -c small.cpp
small.cpp:1:36: error: too many initializers for ‘’
 struct { int a; } s1, s2 = { s1, 0 };
^
small.cpp:1:36: internal compiler error: in digest_init_r, at cp/typeck2.c:1117
0x726db8 digest_init_r
../../gcc-source-trunk/gcc/cp/typeck2.c:1117
0x72839a digest_init_flags
../../gcc-source-trunk/gcc/cp/typeck2.c:1167
0x72839a store_init_value(tree_node*, tree_node*, vec<tree_node*, va_gc,
vl_embed>**, int)
../../gcc-source-trunk/gcc/cp/typeck2.c:796
0x687abc check_initializer
../../gcc-source-trunk/gcc/cp/decl.c:6193
0x6b1c0d cp_finish_decl(tree_node*, tree_node*, bool, tree_node*, int)
../../gcc-source-trunk/gcc/cp/decl.c:6851
0x7ac1a7 cp_parser_init_declarator
../../gcc-source-trunk/gcc/cp/parser.c:18697
0x7ac9d9 cp_parser_simple_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12378
0x7acce1 cp_parser_block_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12246
0x7b60c0 cp_parser_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12143
0x7b4b94 cp_parser_declaration_seq_opt
../../gcc-source-trunk/gcc/cp/parser.c:12022
0x7b4ec8 cp_parser_translation_unit
../../gcc-source-trunk/gcc/cp/parser.c:4324
0x7b4ec8 c_parse_file()
../../gcc-source-trunk/gcc/cp/parser.c:37486
0x918e02 c_common_parse_file()
../../gcc-source-trunk/gcc/c-family/c-opts.c:1070
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$ 




// valid in C++11 & okay: struct { int a; } s1, s2 = { s1 }; 
struct { int a; } s1, s2 = { s1, 0 };

[Bug c++/71576] New: ICE on valid C++11 code (with xvalue and bitfield) on x86_64-linux-gnu: in build_target_expr, at cp/tree.c:385

2016-06-17 Thread su at cs dot ucdavis.edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71576

Bug ID: 71576
   Summary: ICE on valid C++11 code (with xvalue and bitfield) on
x86_64-linux-gnu: in build_target_expr, at
cp/tree.c:385
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

The following C++11 code causes an ICE when compiled with the current GCC trunk
on x86_64-linux-gnu in both 32-bit and 64-bit modes.  

This is a regression from 6.1.x.


$ g++-trunk -v
Using built-in specs.
COLLECT_GCC=g++-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160617 (experimental) [trunk revision 237557] (GCC) 
$ 
$ g++-6.1 -c small.cpp
$ 
$ g++-trunk -c small.cpp
small.cpp: In function ‘void foo()’:
small.cpp:10:26: internal compiler error: in build_target_expr, at
cp/tree.c:385
   int & = foo < A > ().i;
  ^
0x83597f build_target_expr
../../gcc-source-trunk/gcc/cp/tree.c:379
0x66d066 convert_like_real
../../gcc-source-trunk/gcc/cp/call.c:6769
0x67934b initialize_reference(tree_node*, tree_node*, int, int)
../../gcc-source-trunk/gcc/cp/call.c:10054
0x688224 grok_reference_init
../../gcc-source-trunk/gcc/cp/decl.c:5186
0x688224 check_initializer
../../gcc-source-trunk/gcc/cp/decl.c:6085
0x6b1c0d cp_finish_decl(tree_node*, tree_node*, bool, tree_node*, int)
../../gcc-source-trunk/gcc/cp/decl.c:6851
0x7ac1a7 cp_parser_init_declarator
../../gcc-source-trunk/gcc/cp/parser.c:18697
0x7ac9d9 cp_parser_simple_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12378
0x7acce1 cp_parser_block_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12246
0x7ad738 cp_parser_declaration_statement
../../gcc-source-trunk/gcc/cp/parser.c:11858
0x7aa30b cp_parser_statement
../../gcc-source-trunk/gcc/cp/parser.c:10526
0x7aac2c cp_parser_statement_seq_opt
../../gcc-source-trunk/gcc/cp/parser.c:10804
0x7aad1f cp_parser_compound_statement
../../gcc-source-trunk/gcc/cp/parser.c:10758
0x7aaecf cp_parser_function_body
../../gcc-source-trunk/gcc/cp/parser.c:20696
0x7aaecf cp_parser_ctor_initializer_opt_and_function_body
../../gcc-source-trunk/gcc/cp/parser.c:20732
0x7ab971 cp_parser_function_definition_after_declarator
../../gcc-source-trunk/gcc/cp/parser.c:25415
0x7ac685 cp_parser_function_definition_from_specifiers_and_declarator
../../gcc-source-trunk/gcc/cp/parser.c:25327
0x7ac685 cp_parser_init_declarator
../../gcc-source-trunk/gcc/cp/parser.c:18468
0x7ac9d9 cp_parser_simple_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12378
0x7acce1 cp_parser_block_declaration
../../gcc-source-trunk/gcc/cp/parser.c:12246
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$ 


--


template < typename T > T && foo ();

struct A 
{
  int i:5;
};

void foo ()
{
  int & = foo < A > ().i;
}

[Bug regression/71575] internal compiler error: in copy_cond_phi_nodes, at graphite-isl-ast-to-gimple.c:2500

2016-06-17 Thread lluixhi at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71575

--- Comment #1 from Aric Belsito  ---
Forgot this in OP:

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/x86_64-gentoo-linux-musl/gcc-bin/6.1.0/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-gentoo-linux-musl/6.1.0/lto-wrapper
Target: x86_64-gentoo-linux-musl
Configured with: /var/tmp/portage/sys-devel/gcc-6.1.0/work/gcc-6.1.0/configure
--host=x86_64-gentoo-linux-musl --build=x86_64-gentoo-linux-musl --prefix=/usr
--bindir=/usr/x86_64-gentoo-linux-musl/gcc-bin/6.1.0
--includedir=/usr/lib/gcc/x86_64-gentoo-linux-musl/6.1.0/include
--datadir=/usr/share/gcc-data/x86_64-gentoo-linux-musl/6.1.0
--mandir=/usr/share/gcc-data/x86_64-gentoo-linux-musl/6.1.0/man
--infodir=/usr/share/gcc-data/x86_64-gentoo-linux-musl/6.1.0/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-gentoo-linux-musl/6.1.0/include/g++-v6
--with-python-dir=/share/gcc-data/x86_64-gentoo-linux-musl/6.1.0/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror
--with-system-zlib --enable-nls --without-included-gettext --disable-symvers
libat_cv_have_ifunc=no --enable-checking=release
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo Hardened 6.1.0
p1.1' --enable-esp --enable-libstdcxx-time --disable-libstdcxx-pch
--enable-shared --enable-threads=posix --enable-__cxa_atexit --disable-multilib
--with-multilib-list=m64 --disable-altivec --disable-fixed-point
--enable-targets=all --disable-libgcj --enable-libgomp --disable-libmudflap
--disable-libssp --disable-libcilkrts --disable-libmpx --disable-vtable-verify
--disable-libvtv --enable-lto --with-isl --disable-isl-version-check
--disable-libsanitizer --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 6.1.0 (Gentoo Hardened 6.1.0 p1.1)

[Bug regression/71575] New: internal compiler error: in copy_cond_phi_nodes, at graphite-isl-ast-to-gimple.c:2500

2016-06-17 Thread lluixhi at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71575

Bug ID: 71575
   Summary: internal compiler error: in copy_cond_phi_nodes, at
graphite-isl-ast-to-gimple.c:2500
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: regression
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lluixhi at gmail dot com
  Target Milestone: ---

Reduced Testcase:
void w(int x, double *y)
{
int i, j;
double a;
double c[32];

for (i = 0; i < x; i++) {
for (j = 0; j < x - i; j++) {
c[j] = y[i];
}
y[i] = a;
a += c[0] + y[i];
}
}

void v(int x, double *y)
{
w(x, y);
}

on amd64 with -O2 -floop-nest-optimize

test.c: In function 'w':
test.c:1:6: internal compiler error: in copy_cond_phi_nodes, at
graphite-isl-ast-to-gimple.c:2500
 void w(int x, double *y)
  ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://bugs.gentoo.org/> for instructions.

I have the pull request from pr69068 applied.

[Bug c/71574] New: ICE on code with alloc_align attribute on x86_64-linux-gnu: in default_conversion, at c/c-typeck.c:2126

2016-06-17 Thread su at cs dot ucdavis.edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71574

Bug ID: 71574
   Summary: ICE on code with alloc_align attribute on
x86_64-linux-gnu: in default_conversion, at
c/c-typeck.c:2126
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

The following code causes an ICE when compiled with the current gcc trunk on
x86_64-linux-gnu in both 32-bit and 64-bit modes.

It ICEs with all GCC versions 5.x and later.


$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160617 (experimental) [trunk revision 237557] (GCC) 
$ 
$ gcc-trunk -c small.c
small.c:2:1: internal compiler error: in default_conversion, at
c/c-typeck.c:2118
 int fn2 () __attribute__ ((alloc_align (fn1)));
 ^~~
0x66b0dd default_conversion(tree_node*)
../../gcc-source-trunk/gcc/c/c-typeck.c:2118
0x6d4cf0 handle_alloc_align_attribute
../../gcc-source-trunk/gcc/c-family/c-common.c:8365
0x641441 decl_attributes(tree_node**, tree_node*, int)
../../gcc-source-trunk/gcc/attribs.c:551
0x65991a start_decl(c_declarator*, c_declspecs*, bool, tree_node*)
../../gcc-source-trunk/gcc/c/c-decl.c:4545
0x6bdff5 c_parser_declaration_or_fndef
../../gcc-source-trunk/gcc/c/c-parser.c:1963
0x6c7f95 c_parser_external_declaration
../../gcc-source-trunk/gcc/c/c-parser.c:1549
0x6c8829 c_parser_translation_unit
../../gcc-source-trunk/gcc/c/c-parser.c:1430
0x6c8829 c_parse_file()
../../gcc-source-trunk/gcc/c/c-parser.c:17930
0x72b1b2 c_common_parse_file()
../../gcc-source-trunk/gcc/c-family/c-opts.c:1070
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$ 


-


int fn1 ();
int fn2 () __attribute__ ((alloc_align (fn1)));

[Bug c/71573] New: ICE on invalid C code on x86_64-linux-gnu (tree check: expected function_decl, have var_decl in implicitly_declare)

2016-06-17 Thread chengniansun at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71573

Bug ID: 71573
   Summary: ICE on invalid C code on x86_64-linux-gnu (tree check:
expected function_decl, have var_decl in
implicitly_declare)
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: chengniansun at gmail dot com
  Target Milestone: ---

The following code crashes the trunk on x86_64-linux-gnu in both 32-bit and
64-bit modes.

gcc-6.1 and all older versions do not crash. This should be a recent
regression.


$: gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160617 (experimental) [trunk revision 237557] (GCC) 
$: 
$: gcc-trunk small.c
small.c: In function ‘f2’:
small.c:4:3: internal compiler error: tree check: expected function_decl, have
var_decl in implicitly_declare, at c/c-decl.c:3313
   t(g);
   ^
0xe6b3bc tree_check_failed(tree_node const*, char const*, int, char const*,
...)
../../gcc-source-trunk/gcc/tree.c:9752
0x64dc01 tree_check
../../gcc-source-trunk/gcc/tree.h:3030
0x64dc01 implicitly_declare(unsigned int, tree_node*)
../../gcc-source-trunk/gcc/c/c-decl.c:3313
0x66ba52 build_external_ref(unsigned int, tree_node*, int, tree_node**)
../../gcc-source-trunk/gcc/c/c-typeck.c:2728
0x6a1457 c_parser_postfix_expression
../../gcc-source-trunk/gcc/c/c-parser.c:7497
0x6a1c2a c_parser_unary_expression
../../gcc-source-trunk/gcc/c/c-parser.c:6942
0x6a2a3a c_parser_cast_expression
../../gcc-source-trunk/gcc/c/c-parser.c:6771
0x6a2c44 c_parser_binary_expression
../../gcc-source-trunk/gcc/c/c-parser.c:6580
0x6a38f5 c_parser_conditional_expression
../../gcc-source-trunk/gcc/c/c-parser.c:6351
0x6a3f70 c_parser_expr_no_commas
../../gcc-source-trunk/gcc/c/c-parser.c:6268
0x6a4672 c_parser_expression
../../gcc-source-trunk/gcc/c/c-parser.c:8463
0x6a50d9 c_parser_expression_conv
../../gcc-source-trunk/gcc/c/c-parser.c:8496
0x6bb408 c_parser_statement_after_labels
../../gcc-source-trunk/gcc/c/c-parser.c:5287
0x6bd2ab c_parser_compound_statement_nostart
../../gcc-source-trunk/gcc/c/c-parser.c:4861
0x6bdb3e c_parser_compound_statement
../../gcc-source-trunk/gcc/c/c-parser.c:4696
0x6bed67 c_parser_declaration_or_fndef
../../gcc-source-trunk/gcc/c/c-parser.c:2105
0x6c7f95 c_parser_external_declaration
../../gcc-source-trunk/gcc/c/c-parser.c:1549
0x6c8829 c_parser_translation_unit
../../gcc-source-trunk/gcc/c/c-parser.c:1430
0x6c8829 c_parse_file()
../../gcc-source-trunk/gcc/c/c-parser.c:17930
0x72b1b2 c_common_parse_file()
../../gcc-source-trunk/gcc/c-family/c-opts.c:1070
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$: 
$: cat small.c
void f1() { extern int t; }
void f2() {
  int g;
  t(g);
}
$:

[Bug target/71571] [CRIS] Multiple inheritance non-virtual PIC thunk causes crash

2016-06-17 Thread hp at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71571

Hans-Peter Nilsson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-06-17
     Ever confirmed|0                           |1

[Bug target/71571] [CRIS] Multiple inheritance non-virtual PIC thunk causes crash

2016-06-17 Thread hp at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71571

Hans-Peter Nilsson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hp at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org  |hp at gcc dot gnu.org

--- Comment #3 from Hans-Peter Nilsson  ---
Confirmed by inspection of cris.c:cris_asm_output_mi_thunk; -fpic or -fPIC is
required and also crisv32-axis-linux-gnu "only".

[Bug c/71572] New: ICE on valid code on x86_64-linux-gnu: in force_constant_size, at gimplify.c:671

2016-06-17 Thread su at cs dot ucdavis.edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71572

Bug ID: 71572
   Summary: ICE on valid code on x86_64-linux-gnu: in
force_constant_size, at gimplify.c:671
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

The following code causes an ICE when compiled with the current gcc trunk on
x86_64-linux-gnu in both 32-bit and 64-bit modes.

It seems to affect all versions 4.7.x and later. 


$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160617 (experimental) [trunk revision 237557] (GCC)
$
$ clang-3.8 -c small.c
$
$ gcc-trunk -c small.c
small.c: In function ‘fn1’:
small.c:7:3: internal compiler error: in force_constant_size, at gimplify.c:671
   __asm ("":"+g" (b));
   ^
0x946f46 force_constant_size
../../gcc-source-trunk/gcc/gimplify.c:671
0x94d897 gimple_add_tmp_var(tree_node*)
../../gcc-source-trunk/gcc/gimplify.c:709
0x92765a create_tmp_var(tree_node*, char const*)
../../gcc-source-trunk/gcc/gimple-expr.c:476
0x959194 create_tmp_from_val
../../gcc-source-trunk/gcc/gimplify.c:500
0x959194 lookup_tmp_var
../../gcc-source-trunk/gcc/gimplify.c:521
0x959194 internal_get_tmp_var
../../gcc-source-trunk/gcc/gimplify.c:574
0x9519c8 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../gcc-source-trunk/gcc/gimplify.c:11374
0x95fcfe gimplify_asm_expr
../../gcc-source-trunk/gcc/gimplify.c:5471
0x952ff5 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../gcc-source-trunk/gcc/gimplify.c:10754
0x9565a6 gimplify_stmt(tree_node**, gimple**)
../../gcc-source-trunk/gcc/gimplify.c:5767
0x95329b gimplify_statement_list
../../gcc-source-trunk/gcc/gimplify.c:1549
0x95329b gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../gcc-source-trunk/gcc/gimplify.c:10851
0x9565a6 gimplify_stmt(tree_node**, gimple**)
../../gcc-source-trunk/gcc/gimplify.c:5767
0x9577c7 gimplify_bind_expr
../../gcc-source-trunk/gcc/gimplify.c:1154
0x9527f4 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../gcc-source-trunk/gcc/gimplify.c:10633
0x9565a6 gimplify_stmt(tree_node**, gimple**)
../../gcc-source-trunk/gcc/gimplify.c:5767
0x9582cb gimplify_body(tree_node*, bool)
../../gcc-source-trunk/gcc/gimplify.c:11617
0x958926 gimplify_function_tree(tree_node*)
../../gcc-source-trunk/gcc/gimplify.c:11773
0x7d5667 cgraph_node::analyze()
../../gcc-source-trunk/gcc/cgraphunit.c:625
0x7d81f0 analyze_functions
../../gcc-source-trunk/gcc/cgraphunit.c:1086
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$


--


int a;

void
fn1 ()
{ 
  int b[a];
  __asm ("":"+g" (b));
}

[Bug target/71338] [RL78] mulu instruction not used on G10

2016-06-17 Thread dj at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71338

--- Comment #2 from dj at gcc dot gnu.org  ---
Author: dj
Date: Fri Jun 17 22:24:17 2016
New Revision: 237566

URL: https://gcc.gnu.org/viewcvs?rev=237566&root=gcc&view=rev
Log:
PR target/71338
* config/rl78/rl78-expand.c (umulqihi3): Enable for G10.
* config/rl78/rl78-virtual.c (umulhi3_shift_virt): Likewise.
(umulqihi3_virt): Likewise.
* config/rl78/rl78-real.c (umulhi3_shift_real): Likewise.
(umulqihi3_real): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rl78/rl78-expand.md
trunk/gcc/config/rl78/rl78-real.md
trunk/gcc/config/rl78/rl78-virt.md

[target/71338]: enable mulu for RL78/G10

2016-06-17 Thread DJ Delorie


Reverts https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01538.html - G10
supports MULU but not other multiplication methods.  Committed.
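
As an illustration of the operation involved: the umulqihi3 patterns in the
diff below describe a 16-bit product of two zero-extended 8-bit operands.
A minimal sketch of source with that shape follows. The function name is
hypothetical, and whether GCC actually selects mulu for it on RL78/G10
depends on target options and optimization level, which is not verified
here.

```cpp
#include <cstdint>

// Unsigned 8-bit x 8-bit -> 16-bit widening multiply.
// Both operands are zero-extended before multiplying, so the largest
// product (255 * 255 = 65025) still fits in 16 bits.
std::uint16_t widening_mul_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) *
                                      static_cast<std::uint16_t>(b));
}
```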

PR target/71338
* config/rl78/rl78-expand.c (umulqihi3): Enable for G10.
* config/rl78/rl78-virtual.c (umulhi3_shift_virt): Likewise.
(umulqihi3_virt): Likewise.
* config/rl78/rl78-real.c (umulhi3_shift_real): Likewise.
(umulqihi3_real): Likewise.

Index: gcc/config/rl78/rl78-expand.md
===
--- gcc/config/rl78/rl78-expand.md  (revision 237565)
+++ gcc/config/rl78/rl78-expand.md  (working copy)
@@ -156,13 +156,13 @@
 )
 
 (define_expand "umulqihi3"
   [(set (match_operand:HI 0 "register_operand")
 (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand"))
  (zero_extend:HI (match_operand:QI 2 "register_operand"))))]
-  "!TARGET_G10"
+  ""
   ""
 )
 
 (define_expand "andqi3"
   [(set (match_operand:QI 0 "rl78_nonimmediate_operand")
(and:QI (match_operand:QI 1 "rl78_general_operand")
Index: gcc/config/rl78/rl78-real.md
===
--- gcc/config/rl78/rl78-real.md(revision 237565)
+++ gcc/config/rl78/rl78-real.md(working copy)
@@ -176,23 +176,23 @@
 )
 
 (define_insn "*umulhi3_shift_real"
   [(set (match_operand:HI 0 "register_operand" "=A,A")
 (mult:HI (match_operand:HI 1 "rl78_nonfar_operand" "0,0")
  (match_operand:HI 2 "rl78_24_operand" "N,i")))]
-  "rl78_real_insns_ok () && !TARGET_G10"
+  "rl78_real_insns_ok ()"
   "@
shlw\t%0, 1
shlw\t%0, 2"
 )
 
 (define_insn "*umulqihi3_real"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=A")
 (mult:HI (zero_extend:HI (match_operand:QI 1 "general_operand" "%a"))
  (zero_extend:HI (match_operand:QI 2 "general_operand" "x"))))]
-  "rl78_real_insns_ok () && !TARGET_G10"
+  "rl78_real_insns_ok ()"
   "mulu\t%2"
 )
 
 (define_insn "*andqi3_real"
   [(set (match_operand:QI 0 "rl78_nonimmediate_operand"  "=WsfWsaWhlWab,A,R,vWsa")
(and:QI (match_operand:QI 1 "rl78_general_operand"   "%0,0,0,0")
Index: gcc/config/rl78/rl78-virt.md
===
--- gcc/config/rl78/rl78-virt.md(revision 237565)
+++ gcc/config/rl78/rl78-virt.md(working copy)
@@ -113,22 +113,22 @@
 )
 
 (define_insn "*umulhi3_shift_virt"
   [(set (match_operand:HI  0 "register_operand" "=v")
 (mult:HI (match_operand:HI 1 "rl78_nonfar_operand" "%vim")
  (match_operand:HI 2 "rl78_24_operand" "Ni")))]
-  "rl78_virt_insns_ok () && !TARGET_G10"
+  "rl78_virt_insns_ok ()"
   "v.mulu\t%0, %1, %2"
   [(set_attr "valloc" "umul")]
 )
 
 (define_insn "*umulqihi3_virt"
   [(set (match_operand:HI  0 "register_operand" "=v")
 (mult:HI (zero_extend:HI (match_operand:QI 1 "rl78_nonfar_operand" "%vim"))
  (zero_extend:HI (match_operand:QI 2 "general_operand" "vim"))))]
-  "rl78_virt_insns_ok () && !TARGET_G10"
+  "rl78_virt_insns_ok ()"
   "v.mulu\t%0, %2"
   [(set_attr "valloc" "umul")]
 )
 
 (define_insn "*andqi3_virt"
   [(set (match_operand:QI 0 "rl78_nonimmediate_operand" "=vm,  *Wfr,  vY")

[Bug bootstrap/71435] [7 regression] sparc bootstrap failure since r235625

2016-06-17 Thread ebotcazou at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71435

--- Comment #12 from Eric Botcazou  ---
> The tentative fix restored bootstrap on sparc-linux.  Test results look
> OK-ish.

Thanks, I'll post it tomorrow morning.

[PATCH] input.c: add lexing selftests and a test matrix for line_table states

2016-06-17 Thread David Malcolm

This patch adds explicit testing of lexing a source file,
generalizing this (and the test of ordinary line maps) over
a 2-dimensional test matrix covering:

  (1) line_table->default_range_bits: some frontends use a non-zero value
  and others use zero

  (2) the fallback modes within line-map.c: there are various threshold
  values for source_location/location_t beyond which line-map.c changes
  behavior (disabling of the range-packing optimization, disabling
  of column-tracking).  We exercise these by starting the line_table
  at interesting values at or near these thresholds.

This helps ensure that location data works in all of these states,
and that (I hope) we don't have lingering bugs relating to the
transition between line_table states.
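
The shape of such a matrix can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code from the patch: the
struct and function names are hypothetical, and the threshold values are
supplied by the caller rather than taken from line-map.h.

```cpp
#include <cstdint>
#include <vector>

// One cell of the test matrix (hypothetical stand-in for the patch's
// per-case state): a default_range_bits value and a starting location.
struct matrix_case {
    int default_range_bits;
    std::uint32_t start_location;
};

// Build the cross product of range-bits values and starting locations.
// For each threshold, probe just below, at, and just above it.
// Thresholds must lie in [1, UINT32_MAX - 1] so that t - 1 and t + 1
// do not wrap around.
std::vector<matrix_case>
build_test_matrix(const std::vector<int>& range_bits_values,
                  const std::vector<std::uint32_t>& thresholds) {
    std::vector<matrix_case> cases;
    for (int bits : range_bits_values) {
        for (std::uint32_t t : thresholds) {
            cases.push_back({bits, t - 1});
            cases.push_back({bits, t});
            cases.push_back({bits, t + 1});
        }
    }
    return cases;
}
```

A driver would then loop over the returned cases and run the same lexing
and line-map checks in each one, so a failure identifies the exact
combination of range bits and starting location that triggered it.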

Successfully bootstrapped on x86_64-pc-linux-gnu;
Successful -fself-test of stage1 on powerpc-ibm-aix7.1.3.0.

OK for trunk?  (I can self-approve much of this, but it's probably
worth having another pair of eyes look at it, if nothing else).

gcc/ChangeLog:
* input.c: Include cpplib.h.
(selftest::temp_source_file): New class.
(selftest::temp_source_file::temp_source_file): New ctor.
(selftest::temp_source_file::~temp_source_file): New dtor.
(selftest::should_have_column_data_p): New function.
(selftest::test_should_have_column_data_p): New function.
(selftest::temp_line_table): New class.
(selftest::temp_line_table::temp_line_table): New ctor.
(selftest::temp_line_table::~temp_line_table): New dtor.
(selftest::test_accessing_ordinary_linemaps): Add case_ param; use
it to create a temp_line_table.
(selftest::assert_loceq): Only verify LOCATION_COLUMN for
locations that are known to have column data.
(selftest::line_table_case): New struct.
(selftest::test_reading_source_line): Move tempfile handling
to class temp_source_file.
(ASSERT_TOKEN_AS_TEXT_EQ): New macro.
(selftest::assert_token_loc_eq): New function.
(ASSERT_TOKEN_LOC_EQ): New macro.
(selftest::test_lexer): New function.
(selftest::boundary_locations): New array.
(selftest::input_c_tests): Call test_should_have_column_data_p.
Loop over a test matrix of interesting values of location and
default_range_bits, calling test_lexer on each case in the matrix.
Move call to test_accessing_ordinary_linemaps into the matrix.
* selftest.h (ASSERT_EQ): Reimplement in terms of...
(ASSERT_EQ_AT): New macro.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/location_overflow_plugin.c (plugin_init): Avoid
hardcoding the values of LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES
and LINE_MAP_MAX_LOCATION_WITH_COLS.

libcpp/ChangeLog:
* include/line-map.h (LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES):
Move here from line-map.c.
(LINE_MAP_MAX_LOCATION_WITH_COLS): Likewise.
* line-map.c (LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES): Move from
here to line-map.h.
(LINE_MAP_MAX_LOCATION_WITH_COLS): Likewise.
---
 gcc/input.c                                      | 323 +++--
 gcc/selftest.h                                   |  12 +-
 .../gcc.dg/plugin/location_overflow_plugin.c     |   4 +-
 libcpp/include/line-map.h                        |  10 +
 libcpp/line-map.c                                |  12 -
 5 files changed, 327 insertions(+), 34 deletions(-)

diff --git a/gcc/input.c b/gcc/input.c
index 3fb4a25..0016555 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "diagnostic-core.h"
 #include "selftest.h"
+#include "cpplib.h"
 
 /* This is a cache used by get_next_line to store the content of a
file to be searched for file lines.  */
@@ -1144,6 +1145,74 @@ namespace selftest {
 
 /* Selftests of location handling.  */
 
+/* A class for writing out a temporary sourcefile for use in selftests
+   of input handling.  */
+
+class temp_source_file
+{
+ public:
+  temp_source_file (const location &loc, const char *suffix,
+   const char *content);
+  ~temp_source_file ();
+
+  const char *get_filename () const { return m_filename; }
+
+ private:
+  char *m_filename;
+};
+
+/* Constructor.  Create a tempfile using SUFFIX, and write CONTENT to
+   it.  Abort if anything goes wrong, using LOC as the effective
+   location in the problem report.  */
+
+temp_source_file::temp_source_file (const location &loc, const char *suffix,
+   const char *content)
+{
+  m_filename = make_temp_file (suffix);
+  ASSERT_NE (m_filename, NULL);
+
+  FILE *out = fopen (m_filename, "w");
+  if (!out)
+::selftest::fail_formatted (loc, "unable to open tempfile: %s",
+   m_filename);
+  fprintf (out, content);
+  fclose (out);
+}
+
+/* Destructor.  Delete the tempfile.  */
+
+temp_source_file::~temp_source_file

[PATCH/AARCH64] Accept vulcan as a cpu name for the AArch64 port of GCC

2016-06-17 Thread Virendra Pathak

Hi,

Please find the patch for introducing vulcan as a cpu name for the
AArch64 port of GCC.
Broadcom's vulcan is an armv8.1-a aarch64 server processor.

Since vulcan is the first armv8.1-a processor to be introduced in
aarch64-cores.def,
I have created a new section in the file for the armv8.1 based processors.
Kindly let me know if that is okay.

Tested the patch with cross aarch64-linux-gnu, bootstrapped native
aarch64-unknown-linux-gnu
and make check (gcc, ld, gas, binutils, gdb).
No new regression failure is added by this patch.

In addition, tested the -mcpu=vulcan and -mtune=vulcan flags by passing
them on the command line.
Also verified that these flags pass the armv8.1-a option to the assembler (as).

At present we are using the scheduling and cost model of cortex-a57, but
we will soon be submitting one for vulcan.

Please review the patch.
Ok for trunk?


gcc/ChangeLog:

Virendra Pathak 

* config/aarch64/aarch64-cores.def (vulcan): New core.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document vulcan as an available option.




with regards,
Virendra Pathak
From be0c77cce98d6dffe7b8d607df25ecb4386a1d34 Mon Sep 17 00:00:00 2001
From: Virendra Pathak 
Date: Mon, 13 Jun 2016 03:18:08 -0700
Subject: [PATCH] [AArch64] Accept vulcan as a cpu name for the AArch64 port of
 GCC

---
 gcc/config/aarch64/aarch64-cores.def | 4 ++++
 gcc/config/aarch64/aarch64-tune.md   | 2 +-
 gcc/doc/invoke.texi  | 4 ++--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 251a3eb..ced8f94 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -49,6 +49,10 @@ AARCH64_CORE("qdf24xx", qdf24xx,   cortexa57, 8A,  
AARCH64_FL_FOR_ARCH8 | AA
 AARCH64_CORE("thunderx",thunderx,  thunderx,  8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  "0x43", "0x0a1")
 AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, 
xgene1, "0x50", "0x000")
 
+/* V8.1 Architecture Processors.  */
+
+AARCH64_CORE("vulcan",  vulcan, cortexa57, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | 
AARCH64_FL_CRYPTO, cortexa57, "0x42", "0x516")
+
 /* V8 big.LITTLE implementations.  */
 
 AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03")
diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index cbc6f48..8c4a0e9 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa35,cortexa53,cortexa57,cortexa72,exynosm1,qdf24xx,thunderx,xgene1,cortexa57cortexa53,cortexa72cortexa53"
+   
"cortexa35,cortexa53,cortexa57,cortexa72,exynosm1,qdf24xx,thunderx,xgene1,vulcan,cortexa57cortexa53,cortexa72cortexa53"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index aa11209..2666592 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13063,8 +13063,8 @@ Specify the name of the target processor for which GCC 
should tune the
 performance of the code.  Permissible values for this option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a57},
 @samp{cortex-a72}, @samp{exynos-m1}, @samp{qdf24xx}, @samp{thunderx},
-@samp{xgene1}, @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
-@samp{native}.
+@samp{xgene1}, @samp{vulcan}, @samp{cortex-a57.cortex-a53},
+@samp{cortex-a72.cortex-a53}, @samp{native}.
 
 The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53}
 specify that GCC should tune for a big.LITTLE system.
-- 
2.1.0

[Bug bootstrap/71435] [7 regression] sparc bootstrap failure since r235625

2016-06-17 Thread mikpelinux at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71435

--- Comment #11 from Mikael Pettersson  ---
(In reply to Mikael Pettersson from comment #10)
> (In reply to Eric Botcazou from comment #9)
> > > Created attachment 38699 [details]
> > > Tentative fix.
> > 
> > It successfully passed a full bootstrap/test cycle on SPARC/Solaris.
> 
> I can test it on SPARC/Linux on Friday.

The tentative fix restored bootstrap on sparc-linux.  Test results look OK-ish.

Re: _Bool and trap representations

2016-06-17 Thread Alexander Cherepanov


On 2016-06-15 17:15, Martin Sebor wrote:

>>> There has been quite a bit of discussion among the committee on
>>> this subject lately (the last part is the subject of DR #451,
>>> though it's discussed in the context of uninitialized objects
>>> with indeterminate values).
>>
>> Are there notes from these discussions or something?
>
> Notes from discussions during committee meetings are in the meeting
> minutes that are posted along with other committee documents on the
> public site.  Those that relate to aspects of defect reports are
> usually captured in the Committee Discussion and Committee Response
> to the DR.  Other than that, committee discussions that take place
> on the committee mailing list (such as the recent ones on this topic)
> are archived for reference of committee members (unlike C++, the C
> archives are not intended to be open to the public).


So it seems the discussion you referred to is not public; that's
unfortunate.


And to clarify what you wrote about stability of valid representations, 
is padding expected to be stable when it's not specifically set? I.e. is 
the following optimization supposed to be conforming or not?


Source code:

--
#include <stdio.h>

int main(int argc, char **argv)
{
  (void)argv;

  struct { char c; int i; } s = {0, 0};

  printf("%d\n", argc ? ((unsigned char *)&s)[1] : 5);
  printf("%d\n", argc ? ((unsigned char *)&s)[1] : 7);
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
5
7
--

gcc version: gcc (GCC) 7.0.0 20160616 (experimental)

Of course, clang does essentially the same but the testcase is a bit 
more involved (I can post it if somebody is interested). OTOH clang is 
more predictable in this area because rules for dealing with undefined 
values in llvm are more-or-less documented -- 
http://llvm.org/docs/LangRef.html#undefined-values .


I don't see gcc treating padding in long double as indeterminate in the 
same way as in structs but clang seems to treat them equally.


--
Alexander Cherepanov

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread joseph at codesourcery dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

--- Comment #11 from joseph at codesourcery dot com  ---
On Fri, 17 Jun 2016, jakub at gcc dot gnu.org wrote:

> The patch is completely untested though (and wonder if we have testcases for
> not raising exceptions when isgreater etc. arguments are qNaNs.

We probably don't have such tests - tests for exception raising or its 
absence are fairly limited.  Those functions are covered in the glibc 
testsuite, but of course that only covers whichever insn patterns get used 
for __builtin_is* for that particular build of the glibc tests.

There are definitely bugs on some architectures involving ordinary ordered 
comparisons such as < and >= wrongly using quiet instructions.  See bug 
52451 for i386 (x87 floating point) and bug 58684 for powerpc, for 
example.  A consequence of this is that if you add tests of comparisons 
doing the right thing, some of those tests would immediately fail on some 
architectures.

(These sorts of local bugs with particular operations or optimizations 
being incorrect regarding exceptions are certainly easier to fix than the 
issues with optimizations not being aware of exceptions and rounding modes 
as extra inputs / outputs to floating-point operations.  The ones with 
individual operations could I expect largely be found through thorough 
test coverage; those with optimizations might be harder to find.)

Note that there is some ambiguity about whether LTGT RTL (and 
corresponding GENERIC / GIMPLE) should be a quiet operation corresponding 
to islessgreater, or ((x < y) || (x > y)) raising exceptions for quiet 
NaNs.  See the discussion at 
.  Fixing 
the ambiguity in either direction would probably involve changes to the 
part of the compiler expecting the other semantics.

[Bug target/71571] [CRIS] Multiple inheritance non-virtual PIC thunk causes crash

2016-06-17 Thread gcc at davidrobins dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71571

David B. Robins  changed:

   What|Removed |Added

 Target||crisv32-axis-linux-gnu
   Host||x86_64-unknown-linux-gnu
  Known to fail||4.3.1, 7.0
  Build||x86_64-unknown-linux-gnu
   Severity|normal  |major

[Bug target/71571] [CRIS] Multiple inheritance non-virtual PIC thunk causes crash

2016-06-17 Thread gcc at davidrobins dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71571

--- Comment #2 from David B. Robins  ---
Created attachment 38721
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38721&action=edit
Larger repro - crashes reliably - requires additional -O2 -fno-inline

[Bug target/71571] [CRIS] Multiple inheritance non-virtual PIC thunk causes crash

2016-06-17 Thread gcc at davidrobins dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71571

--- Comment #1 from David B. Robins  ---
Created attachment 38720
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38720&action=edit
Small repro - emits branch (ba) without delay-slot instruction

[Bug target/71571] New: [CRIS] Multiple inheritance non-virtual PIC thunk causes crash

2016-06-17 Thread gcc at davidrobins dot net

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71571

Bug ID: 71571
   Summary: [CRIS] Multiple inheritance non-virtual PIC thunk
causes crash
   Product: gcc
   Version: 4.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at davidrobins dot net
  Target Milestone: ---

Created attachment 38719
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38719&action=edit
Add nop for ba (branch) delay slot in PIC multiple inheritance thunk

The jump to the primary function in the generated non-virtual thunk for the
CRIS platform is missing the delay slot instruction (an instruction following
the jump that is executed before it) in PIC mode, so it runs the following
instruction in the text segment instead. (This following instruction is
frequently the next function's preamble, usually a sub from $sp, which causes a
crash in a subsequent return or other stack lookup.)

We initially saw this problem on a vendor-provided (Axis) GCC 4.3.1, but it
repros on current SVN (r237536 when I checked it out). Details of GCC build:
gcc version 7.0.0 20160616 (experimental) (GCC)
Host: x86_64-unknown-linux-gnu
Target: crisv32-axis-linux-gnu
Configured with: ../gcc-svn/configure --prefix=/opt/cross
--target=crisv32-axis-linux-gnu --enable-languages=c,c++ --disable-multilib

The build command-line is: /opt/cross/bin/crisv32-axis-linux-gnu-g++ -fPIC
-save-temps -o test main.cpp (substitute main.ii)

I am attaching:
 * a repro program, main.ii; after reduction it (probably) doesn't crash but
the thunk with branch ("ba") without a delay-slot instruction can be seen in
the assembly,
 * a slightly larger repro, crash.ii, that reliably crashes (due to an
$sp-modifying instruction following the branch); this requires additional g++
options -O2 -fno-inline (the -O2 to eliminate a dword stored in memory after
the ba instruction, and -fno-inline to prevent the thunk from being an inline
copy of the function it would jump to)
 * a patch to fix the problem.

Runs of "make check-gcc-c++ RUNTESTFLAGS=--target_board=cris-sim" with/without
the fix do not differ in passes/failures. I would like to add a test for this
issue but am unsure as to the correct way of doing so; crash.ii could be used
in an execute test, or is checking the output assembly better, or something
else?

[Bug libstdc++/71545] [6/7 Regression] Incorrect irreflexive comparison debug check in std::lower_bound

2016-06-17 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71545

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jonathan Wakely  ---
Fixed

[PATCH] C++ FE: Show both locations in string literal concatenation error

2016-06-17 Thread David Malcolm

We can use rich_location and the new diagnostic_show_locus to print
both locations when complaining about a bogus string concatenation
in the C++ FE, giving e.g.:

test.C:3:24: error: unsupported non-standard concatenation of string literals
 const void *s = u8"a"  u"b";
 ~  ^~~~

Earlier versions of this were posted as part of
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00730.html
"[PATCH 10/22] C++ FE: Use token ranges for various diagnostics"
and:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01497.html
though the implementation has changed slightly.

Successfully bootstrapped on x86_64-pc-linux-gnu;
adds 7 PASS results to g++.sum.

OK for trunk?

gcc/cp/ChangeLog:
* parser.c (cp_parser_string_literal): Convert non-standard
concatenation error to directly use a rich_location, and
use that to add the location of the first literal to the
diagnostic.

gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/string-literal-concat.C: New test case.
---
 gcc/cp/parser.c| 15 +-
 .../g++.dg/diagnostic/string-literal-concat.C  | 23 ++
 2 files changed, 33 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/string-literal-concat.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 632b25f..e1e9271 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -3893,13 +3893,12 @@ cp_parser_string_literal (cp_parser *parser, bool translate, bool wide_ok,
 }
   else
 {
-  location_t last_tok_loc;
+  location_t last_tok_loc = tok->location;
   gcc_obstack_init (&str_ob);
   count = 0;
 
   do
{
- last_tok_loc = tok->location;
  cp_lexer_consume_token (parser->lexer);
  count++;
  str.text = (const unsigned char *)TREE_STRING_POINTER (string_tree);
@@ -3931,13 +3930,19 @@ cp_parser_string_literal (cp_parser *parser, bool translate, bool wide_ok,
  if (type == CPP_STRING)
type = curr_type;
  else if (curr_type != CPP_STRING)
-   error_at (tok->location,
- "unsupported non-standard concatenation "
- "of string literals");
+   {
+ rich_location rich_loc (line_table, tok->location);
+ rich_loc.add_range (last_tok_loc, false);
+ error_at_rich_loc (&rich_loc,
+"unsupported non-standard concatenation "
+"of string literals");
+   }
}
 
  obstack_grow (&str_ob, &str, sizeof (cpp_string));
 
+ last_tok_loc = tok->location;
+
  tok = cp_lexer_peek_token (parser->lexer);
  if (cpp_userdef_string_p (tok->type))
{
diff --git a/gcc/testsuite/g++.dg/diagnostic/string-literal-concat.C b/gcc/testsuite/g++.dg/diagnostic/string-literal-concat.C
new file mode 100644
index 000..4ede799
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/string-literal-concat.C
@@ -0,0 +1,23 @@
+/* { dg-options "-fdiagnostics-show-caret -std=c++11" } */
+
+const void *s = u8"a"  u"b";  // { dg-error "24: non-standard concatenation" }
+/* { dg-begin-multiline-output "" }
+ const void *s = u8"a"  u"b";
+ ~  ^~~~
+   { dg-end-multiline-output "" } */
+
+const void *s2 = u"a"  u"b"  u8"c";  // { dg-error "30: non-standard concatenation" }
+/* { dg-begin-multiline-output "" }
+ const void *s2 = u"a"  u"b"  u8"c";
+  ^
+  { dg-end-multiline-output "" } */
+
+#define TEST_U8_LITERAL u8"a"
+
+const void *s3 = TEST_U8_LITERAL u8"b";
+
+const void *s4 = TEST_U8_LITERAL u"b"; // { dg-error "34: non-standard concatenation" }
+/* { dg-begin-multiline-output "" }
+ const void *s4 = TEST_U8_LITERAL u"b";
+  ^~~~
+  { dg-end-multiline-output "" } */
-- 
1.8.5.3

[Bug libstdc++/71545] [6/7 Regression] Incorrect irreflexive comparison debug check in std::lower_bound

2016-06-17 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71545

--- Comment #4 from Jonathan Wakely  ---
Author: redi
Date: Fri Jun 17 18:53:46 2016
New Revision: 237561

URL: https://gcc.gnu.org/viewcvs?rev=237561&root=gcc&view=rev
Log:
libstdc++/71545 fix debug checks in binary search algorithms

PR libstdc++/71545
* include/bits/stl_algobase.h (lower_bound, lexicographical_compare):
Remove irreflexive checks.
* include/bits/stl_algo.h (lower_bound, upper_bound, equal_range,
binary_search): Likewise.
* testsuite/25_algorithms/equal_range/partitioned.cc: New test.
* testsuite/25_algorithms/lexicographical_compare/71545.cc: New test.
* testsuite/25_algorithms/lower_bound/partitioned.cc: New test.
* testsuite/25_algorithms/upper_bound/partitioned.cc: New test.
* testsuite/util/testsuite_iterators.h (__gnu_test::test_container):
Add constructor from array.

Added:
   
branches/gcc-6-branch/libstdc++-v3/testsuite/25_algorithms/binary_search/partitioned.cc
   
branches/gcc-6-branch/libstdc++-v3/testsuite/25_algorithms/equal_range/partitioned.cc
   
branches/gcc-6-branch/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/71545.cc
   
branches/gcc-6-branch/libstdc++-v3/testsuite/25_algorithms/lower_bound/partitioned.cc
   
branches/gcc-6-branch/libstdc++-v3/testsuite/25_algorithms/upper_bound/partitioned.cc
Modified:
branches/gcc-6-branch/libstdc++-v3/ChangeLog
branches/gcc-6-branch/libstdc++-v3/include/bits/stl_algo.h
branches/gcc-6-branch/libstdc++-v3/include/bits/stl_algobase.h
branches/gcc-6-branch/libstdc++-v3/testsuite/util/testsuite_iterators.h

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread hjl.tools at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

--- Comment #10 from H.J. Lu  ---
This is related to PR 37158?

[Bug libstdc++/71545] [6/7 Regression] Incorrect irreflexive comparison debug check in std::lower_bound

2016-06-17 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71545

--- Comment #3 from Jonathan Wakely  ---
Author: redi
Date: Fri Jun 17 18:28:34 2016
New Revision: 237560

URL: https://gcc.gnu.org/viewcvs?rev=237560&root=gcc&view=rev
Log:
libstdc++/71545 fix debug checks in binary search algorithms

PR libstdc++/71545
* include/bits/stl_algobase.h (lower_bound, lexicographical_compare):
Remove irreflexive checks.
* include/bits/stl_algo.h (lower_bound, upper_bound, equal_range,
binary_search): Likewise.
* testsuite/25_algorithms/equal_range/partitioned.cc: New test.
* testsuite/25_algorithms/lexicographical_compare/71545.cc: New test.
* testsuite/25_algorithms/lower_bound/partitioned.cc: New test.
* testsuite/25_algorithms/upper_bound/partitioned.cc: New test.
* testsuite/util/testsuite_iterators.h (__gnu_test::test_container):
Add constructor from array.

Added:
trunk/libstdc++-v3/testsuite/25_algorithms/binary_search/partitioned.cc
trunk/libstdc++-v3/testsuite/25_algorithms/equal_range/partitioned.cc
trunk/libstdc++-v3/testsuite/25_algorithms/lexicographical_compare/71545.cc
trunk/libstdc++-v3/testsuite/25_algorithms/lower_bound/partitioned.cc
trunk/libstdc++-v3/testsuite/25_algorithms/upper_bound/partitioned.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/stl_algo.h
trunk/libstdc++-v3/include/bits/stl_algobase.h
trunk/libstdc++-v3/testsuite/util/testsuite_iterators.h

[PATCH] libstdc++/71545 fix debug checks in binary search algorithms

2016-06-17 Thread Jonathan Wakely


PR libstdc++/71545
* include/bits/stl_algobase.h (lower_bound, lexicographical_compare):
Remove irreflexive checks.
* include/bits/stl_algo.h (lower_bound, upper_bound, equal_range,
binary_search): Likewise.
* testsuite/25_algorithms/equal_range/partitioned.cc: New test.
* testsuite/25_algorithms/lexicographical_compare/71545.cc: New test.
* testsuite/25_algorithms/lower_bound/partitioned.cc: New test.
* testsuite/25_algorithms/upper_bound/partitioned.cc: New test.
* testsuite/util/testsuite_iterators.h (__gnu_test::test_container):
Add constructor from array.

The binary search algos and lexicographical_compare do not require the
comparison function to be irreflexive, so the recently-added debug
mode checks need to be removed.

Tested x86_64-linux, committed to trunk. gcc-6-branch commit to
follow.
commit e775b35ff6cb0b6843ec2f1c8bf3a136deb898dd
Author: Jonathan Wakely 
Date:   Fri Jun 17 11:09:18 2016 +0100

libstdc++/71545 fix debug checks in binary search algorithms

PR libstdc++/71545
* include/bits/stl_algobase.h (lower_bound, lexicographical_compare):
Remove irreflexive checks.
* include/bits/stl_algo.h (lower_bound, upper_bound, equal_range,
binary_search): Likewise.
* testsuite/25_algorithms/equal_range/partitioned.cc: New test.
* testsuite/25_algorithms/lexicographical_compare/71545.cc: New test.
* testsuite/25_algorithms/lower_bound/partitioned.cc: New test.
* testsuite/25_algorithms/upper_bound/partitioned.cc: New test.
* testsuite/util/testsuite_iterators.h (__gnu_test::test_container):
Add constructor from array.

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index fbd03a7..c2ac031 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -2026,7 +2026,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
typename iterator_traits<_ForwardIterator>::value_type, _Tp>)
   __glibcxx_requires_partitioned_lower_pred(__first, __last,
__val, __comp);
-  __glibcxx_requires_irreflexive_pred2(__first, __last, __comp);
 
   return std::__lower_bound(__first, __last, __val,
__gnu_cxx::__ops::__iter_comp_val(__comp));
@@ -2080,7 +2079,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __glibcxx_function_requires(_LessThanOpConcept<
_Tp, typename iterator_traits<_ForwardIterator>::value_type>)
   __glibcxx_requires_partitioned_upper(__first, __last, __val);
-  __glibcxx_requires_irreflexive2(__first, __last);
 
   return std::__upper_bound(__first, __last, __val,
__gnu_cxx::__ops::__val_less_iter());
@@ -2112,7 +2110,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Tp, typename iterator_traits<_ForwardIterator>::value_type>)
   __glibcxx_requires_partitioned_upper_pred(__first, __last,
__val, __comp);
-  __glibcxx_requires_irreflexive_pred2(__first, __last, __comp);
 
   return std::__upper_bound(__first, __last, __val,
__gnu_cxx::__ops::__val_comp_iter(__comp));
@@ -2186,7 +2183,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Tp, typename iterator_traits<_ForwardIterator>::value_type>)
   __glibcxx_requires_partitioned_lower(__first, __last, __val);
   __glibcxx_requires_partitioned_upper(__first, __last, __val);
-  __glibcxx_requires_irreflexive2(__first, __last);
 
   return std::__equal_range(__first, __last, __val,
__gnu_cxx::__ops::__iter_less_val(),
@@ -2225,7 +2221,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__val, __comp);
   __glibcxx_requires_partitioned_upper_pred(__first, __last,
__val, __comp);
-  __glibcxx_requires_irreflexive_pred2(__first, __last, __comp);
 
   return std::__equal_range(__first, __last, __val,
__gnu_cxx::__ops::__iter_comp_val(__comp),
@@ -2255,7 +2250,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Tp, typename iterator_traits<_ForwardIterator>::value_type>)
   __glibcxx_requires_partitioned_lower(__first, __last, __val);
   __glibcxx_requires_partitioned_upper(__first, __last, __val);
-  __glibcxx_requires_irreflexive2(__first, __last);
 
   _ForwardIterator __i
= std::__lower_bound(__first, __last, __val,
@@ -2291,7 +2285,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__val, __comp);
   __glibcxx_requires_partitioned_upper_pred(__first, __last,
__val, __comp);
-  __glibcxx_requires_irreflexive_pred2(__first, __last, __comp);
 
   _ForwardIterator __i
=

[Bug c++/71548] Invalid declaration involving template template param causes crash

2016-06-17 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71548

Martin Sebor  changed:

   What|Removed |Added

   Keywords||ice-on-invalid-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-06-17
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1
  Known to fail||4.7.0, 5.3.0, 6.1.0, 7.0

--- Comment #1 from Martin Sebor  ---
Confirmed as ice-on-invalid-code.  My bisection points to r179441 as the first
commit that triggered an ICE, but since that commit isn't relevant to the C++
front end, the cause must be an earlier commit after r179435, which fails with:
sorry, unimplemented: cannot expand ‘T ...’ into a fixed-length argument list
Looking at the commits in between, r179436 is the only possible root cause.

Recent trunk fails with the following output:

$ cat t.C && gcc -Wall t.C
template class fl {};
template class = fl>
struct S {};
template
void f(S ) {}
void lol() {
S<> s;
f(s);
}
t.C: In substitution of ‘template void f(S) [with T =
]’:
t.C:8:8:   required from here
t.C:8:8: internal compiler error: tree check: expected class ‘type’, have
‘declaration’ (template_decl) in cp_type_quals, at cp/typeck.c:9154
 f(s);
^
0x13ad713 tree_class_check_failed(tree_node const*, tree_code_class, char
const*, int, char const*)
../../gcc/tree.c:9803
0x72a08d tree_class_check(tree_node const*, tree_code_class, char const*, int,
char const*)
../../gcc/tree.h:3409
0x911f1e cp_type_quals(tree_node const*)
../../gcc/cp/typeck.c:9154
0x803b7c check_cv_quals_for_unify
../../gcc/cp/pt.c:19053
0x807fab unify
../../gcc/cp/pt.c:19707
0x8010f4 unify_one_argument
../../gcc/cp/pt.c:18292
0x8045b2 unify_pack_expansion
../../gcc/cp/pt.c:19203
0x808fac unify
../../gcc/cp/pt.c:19969
0x80382f try_class_unification
../../gcc/cp/pt.c:18965
0x8093af unify
../../gcc/cp/pt.c:1
0x8010f4 unify_one_argument
../../gcc/cp/pt.c:18292
0x80139d type_unification_real
../../gcc/cp/pt.c:18366
0x7ff912 fn_type_unification(tree_node*, tree_node*, tree_node*, tree_node*
const*, unsigned int, tree_node*, unification_kind_t, int, bool, bool)
../../gcc/cp/pt.c:17811
0x73495c add_template_candidate_real
../../gcc/cp/call.c:3110
0x734db1 add_template_candidate
../../gcc/cp/call.c:3188
0x73c9ed add_candidates
../../gcc/cp/call.c:5361
0x737b34 perform_overload_resolution
../../gcc/cp/call.c:4045
0x737d9b build_new_function_call(tree_node*, vec<tree_node*, va_gc, vl_embed>**, bool, int)
../../gcc/cp/call.c:4122
0x954894 finish_call_expr(tree_node*, vec<tree_node*, va_gc, vl_embed>**, bool,
bool, int)
../../gcc/cp/semantics.c:2433
0x898a02 cp_parser_postfix_expression
../../gcc/cp/parser.c:6904
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

--- Comment #9 from Jakub Jelinek  ---
If I compile the above testcase with -O2 -mavx -ftrapping-math, then it
generates vucomiss in each case, which seems wrong to me (because for
gt/ge/lt/le it should raise exceptions, so IMHO should use vcomiss in that
case).

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

Jakub Jelinek  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com,
   ||jsm28 at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek  ---
#define N 1024
float a[N], b[N];
int c[N];

void eq () { int i; for (i = 0; i < N; i++) c[i] = a[i] == b[i]; }
void ne () { int i; for (i = 0; i < N; i++) c[i] = a[i] != b[i]; }
void gt () { int i; for (i = 0; i < N; i++) c[i] = a[i] > b[i]; }
void ge () { int i; for (i = 0; i < N; i++) c[i] = a[i] >= b[i]; }
void lt () { int i; for (i = 0; i < N; i++) c[i] = a[i] < b[i]; }
void le () { int i; for (i = 0; i < N; i++) c[i] = a[i] <= b[i]; }
void unle () { int i; for (i = 0; i < N; i++) c[i] = !__builtin_isgreater
(a[i], b[i]); }
void unlt () { int i; for (i = 0; i < N; i++) c[i] = !__builtin_isgreaterequal
(a[i], b[i]); }
void unge () { int i; for (i = 0; i < N; i++) c[i] = !__builtin_isless (a[i],
b[i]); }
void ungt () { int i; for (i = 0; i < N; i++) c[i] = !__builtin_islessequal
(a[i], b[i]); }
void uneq () { int i; for (i = 0; i < N; i++) c[i] = !__builtin_islessgreater
(a[i], b[i]); }
void ordered () { int i; for (i = 0; i < N; i++) c[i] = !__builtin_isunordered
(a[i], b[i]); }
void unordered () { int i; for (i = 0; i < N; i++) c[i] = __builtin_isunordered
(a[i], b[i]); }

shows the various codes in vcond.
From C99 and other sources, all of
isgreater/isgreaterequal/isless/islessequal/islessgreater return false if any
argument is NaN and don't raise exceptions (except for sNaN). isunordered
returns true only if any argument is NaN and doesn't raise exceptions either.
The matching of the above to RTX codes has been confirmed by compiling the
above testcase.
Thus, IMNSHO the right values are:
           A > B  A < B  A = B  UNORD  SIGNAL  IMM
EQ           F      F      T      F      N     0
NE           T      T      F      T      N     4
GT           T      F      F      F      Y     0xe
GE           T      F      T      F      Y     0xd
LT           F      T      F      F      Y     1
LE           F      T      T      F      Y     2
UNLE         F      T      T      T      N     0x1a
UNLT         F      T      F      T      N     0x19
UNGE         T      F      T      T      N     0x15
UNGT         T      F      F      T      N     0x16
UNEQ         F      F      T      T      N     8
LTGT         T      T      F      F      N     0xc
ORDERED      T      T      T      F      N     7
UNORDERED    F      F      F      T      N     3

This is in sync with the 'D' stuff except for UN{LE,LT,GE,GT,EQ}, where the AVX
implementation uses the signalling instructions instead of the non-signalling
ones.  Unless there is some bug in the generic code, I'd say that if one gets
UNLE for inverted isgreater, then in the above table one needs to replace all
Ts with Fs and vice versa, but keep Y and N as is (because whether the insn
raises an exception just depends on the arguments (and not even on their
order), not on whether the result is inverted or the arguments are swapped).

So I'd expect something like:
--- gcc/config/i386/i386.c.jj   2016-06-16 21:00:08.0 +0200
+++ gcc/config/i386/i386.c  2016-06-17 19:35:52.237836780 +0200
@@ -17628,7 +17628,7 @@ ix86_print_operand (FILE *file, rtx x, i
case UNEQ:
  if (TARGET_AVX)
{
- fputs ("eq_us", file);
+ fputs ("eq_uq", file);
  break;
}
case EQ:
@@ -17637,7 +17637,7 @@ ix86_print_operand (FILE *file, rtx x, i
case UNLT:
  if (TARGET_AVX)
{
- fputs ("nge", file);
+ fputs ("nge_uq", file);
  break;
}
case LT:
@@ -17646,7 +17646,7 @@ ix86_print_operand (FILE *file, rtx x, i
case UNLE:
  if (TARGET_AVX)
{
- fputs ("ngt", file);
+ fputs ("ngt_uq", file);
  break;
}
case LE:
@@ -17671,7 +17671,10 @@ ix86_print_operand (FILE *file, rtx x, i
  break;
}
case UNGE:
- fputs ("nlt", file);
+ if (TARGET_AVX)
+   fputs ("nlt_uq", file);
+ else
+   fputs ("nlt", file);
  break;
case GT:
  if (TARGET_AVX)
@@ -17680,7 +17683,10 @@ ix86_print_operand (FILE *file, rtx x, i
  break;
}
case UNGT:
- fputs ("nle", file);
+ if (TARGET_AVX)
+   fputs ("nle_uq",

[Bug c++/71570] New: ICE on invalid C++11 code (with invalid variable capture) on x86_64-linux-gnu: in cxx_incomplete_type_diagnostic, at cp/typeck2.c:551

2016-06-17 Thread su at cs dot ucdavis.edu

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71570

Bug ID: 71570
   Summary: ICE on invalid C++11 code (with invalid variable
capture) on x86_64-linux-gnu: in
cxx_incomplete_type_diagnostic, at cp/typeck2.c:551
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu
  Target Milestone: ---

The following C++11 code causes an ICE when compiled with the current GCC trunk
on x86_64-linux-gnu in both 32-bit and 64-bit modes.  

It also affects 6.1.x and is a regression from 5.4.x. 


$ g++-trunk -v
Using built-in specs.
COLLECT_GCC=g++-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto
--prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160617 (experimental) [trunk revision 237557] (GCC) 
$ 
$ g++-trunk -c small.cpp
small.cpp: In function ‘void foo()’:
small.cpp:5:5: error: cannot capture ‘foo’ by reference
   [&foo]
 ^~~
small.cpp: In lambda function:
small.cpp:7:11: internal compiler error: in cxx_incomplete_type_diagnostic, at
cp/typeck2.c:551
 foo (0);
   ^
0x727e2d cxx_incomplete_type_diagnostic(unsigned int, tree_node const*,
tree_node const*, diagnostic_t)
../../gcc-source-trunk/gcc/cp/typeck2.c:551
0x7bffd5 cxx_incomplete_type_diagnostic
../../gcc-source-trunk/gcc/cp/cp-tree.h:6757
0x7bffd5 complete_type_or_maybe_complain(tree_node*, tree_node*, int)
../../gcc-source-trunk/gcc/cp/typeck.c:152
0x7cde76 decay_conversion(tree_node*, int, bool)
../../gcc-source-trunk/gcc/cp/typeck.c:2044
0x661e52 build_addr_func(tree_node*, int)
../../gcc-source-trunk/gcc/cp/call.c:282
0x7d0f6e cp_build_function_call_vec(tree_node*, vec<tree_node*, va_gc,
vl_embed>**, int)
../../gcc-source-trunk/gcc/cp/typeck.c:3583
0x817680 finish_call_expr(tree_node*, vec<tree_node*, va_gc, vl_embed>**, bool,
bool, int)
../../gcc-source-trunk/gcc/cp/semantics.c:2454
0x78ee50 cp_parser_postfix_expression
../../gcc-source-trunk/gcc/cp/parser.c:6904
0x78d4ac cp_parser_unary_expression
../../gcc-source-trunk/gcc/cp/parser.c:7986
0x797327 cp_parser_cast_expression
../../gcc-source-trunk/gcc/cp/parser.c:8663
0x797925 cp_parser_binary_expression
../../gcc-source-trunk/gcc/cp/parser.c:8765
0x798210 cp_parser_assignment_expression
../../gcc-source-trunk/gcc/cp/parser.c:9053
0x79ab09 cp_parser_expression
../../gcc-source-trunk/gcc/cp/parser.c:9220
0x79b12f cp_parser_expression_statement
../../gcc-source-trunk/gcc/cp/parser.c:10681
0x7a9f6b cp_parser_statement
../../gcc-source-trunk/gcc/cp/parser.c:10532
0x7aac2c cp_parser_statement_seq_opt
../../gcc-source-trunk/gcc/cp/parser.c:10804
0x7b03be cp_parser_lambda_body
../../gcc-source-trunk/gcc/cp/parser.c:10270
0x7b03be cp_parser_lambda_expression
../../gcc-source-trunk/gcc/cp/parser.c:9754
0x7848dc cp_parser_primary_expression
../../gcc-source-trunk/gcc/cp/parser.c:4934
0x78f076 cp_parser_postfix_expression
../../gcc-source-trunk/gcc/cp/parser.c:6691
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
$ 


-


void foo (int);

void foo (void)
{
  [&foo]
  {
foo (0); 
  };
}

Re: [C++ PATCH] Fix some DECL_BUILT_IN uses in C++ FE

2016-06-17 Thread Jason Merrill

OK.

Jason

Re: [C++ Patch] One more error + error to error + inform and a subtler issue

2016-06-17 Thread Jason Merrill

On Wed, Jun 15, 2016 at 5:15 AM, Paolo Carlini  wrote:
> +  /* Likewise for the constexpr specifier, in case t is a specialization
> + and we are emitting an error about an incompatible redeclaration.  */

It doesn't need to be in an error about a redeclaration; in general a
specialization can differ in 'constexpr' from its template.  OK with
the second line removed from the comment.

Jason

[Bug c++/71569] [5/6] Crash: External definition of template member from template struct

2016-06-17 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71569

Martin Sebor  changed:

   What|Removed |Added

   Keywords||ice-on-invalid-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-06-17
 CC||jason at gcc dot gnu.org,
   ||msebor at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Martin Sebor  ---
Confirmed.  The ICE was introduced in 5.0 via the following commit:

r218104 | jason | 2014-11-26 16:58:38 -0500 (Wed, 26 Nov 2014) | 14 lines

Allow partial specialization of variable templates.
* cp-tree.h (TINFO_USED_TEMPLATE_ID): New.
* decl.c (duplicate_decls): Copy it.
* error.c (dump_decl) [TEMPLATE_ID_EXPR]: Handle variables.
* parser.c (cp_parser_decltype_expr): Do call finish_id_expression
on template-ids.
* pt.c (register_specialization): Remember variable template insts.
(instantiate_template_1): Find the matching partial specialization.
(check_explicit_specialization): Allow variable partial specialization.
(process_partial_specialization): Likewise.
(push_template_decl_real): Likewise.
(more_specialized_partial_spec): Rename from more_specialized_class.
(most_specialized_partial_spec): Rename from most_specialized_class.
(get_partial_spec_bindings): Rename from get_class_bindings.

Re: [C++ Patch] One more error + error to error + inform

2016-06-17 Thread Jason Merrill

OK.

Jason

Re: [PATCH, libgcc/ARM 1a/6] Fix Thumb-1 only == ARMv6-M & Thumb-2 only == ARMv7-M assumptions

2016-06-17 Thread Thomas Preudhomme

On Wednesday 01 June 2016 10:00:52 Ramana Radhakrishnan wrote:
> Please fix up the macros, post back and redo the test. Otherwise this
> is ok from a quick read.

What about the updated patch in attachment? As for the original patch, I've 
checked that code generation does not change for a number of combinations of 
ISAs (ARM/Thumb), optimization levels (Os/O2), and architectures (armv4, 
armv4t, armv5, armv5t, armv5te, armv6, armv6j, armv6k, armv6s-m, armv6kz, 
armv6t2, armv6z, armv6zk, armv7, armv7-a, armv7e-m, armv7-m, armv7-r, armv7ve, 
armv8-a, armv8-a+crc, iwmmxt and iwmmxt2).

Note, I renumbered this patch 1a to not make the numbering of other patches 
look strange. The CLZ part is now in patch 1b/7.

ChangeLog entries are now as follows:

*** gcc/ChangeLog ***

2016-05-23  Thomas Preud'homme  

* config/arm/elf.h: Use __ARM_ARCH_ISA_THUMB and __ARM_ARCH_ISA_ARM to
decide whether to prevent some libgcc routines being included for some
multilibs rather than __ARM_ARCH_6M__ and add comment to indicate the
link between this condition and the one in
libgcc/config/arm/lib1func.S.

*** gcc/testsuite/ChangeLog ***

2015-11-10  Thomas Preud'homme  

* lib/target-supports.exp (check_effective_target_arm_cortex_m): Use
__ARM_ARCH_ISA_ARM to test for Cortex-M devices.

*** libgcc/ChangeLog ***

2016-06-01  Thomas Preud'homme  

* config/arm/bpabi-v6m.S: Clarify what architectures is the
implementation suitable for.
* config/arm/lib1funcs.S (__prefer_thumb__): Define among other cases
for all Thumb-1 only targets.
(NOT_ISA_TARGET_32BIT): Define for Thumb-1 only targets.
(THUMB_LDIV0): Test for NOT_ISA_TARGET_32BIT rather than
__ARM_ARCH_6M__.
(EQUIV): Likewise.
(ARM_FUNC_ALIAS): Likewise.
(umodsi3): Add check to __ARM_ARCH_ISA_THUMB != 1 to guard the idiv
version.
(modsi3): Likewise.
(clzsi2): Test for NOT_ISA_TARGET_32BIT rather than __ARM_ARCH_6M__.
(clzdi2): Likewise.
(ctzsi2): Likewise.
(L_interwork_call_via_rX): Test for __ARM_ARCH_ISA_ARM rather than
__ARM_ARCH_6M__ in guard for checking whether it is defined.
(final includes): Test for NOT_ISA_TARGET_32BIT rather than
__ARM_ARCH_6M__ and add comment to indicate the connection between
this condition and the one in gcc/config/arm/elf.h.
* config/arm/libunwind.S: Test for __ARM_ARCH_ISA_THUMB and
__ARM_ARCH_ISA_ARM rather than __ARM_ARCH_6M__.
* config/arm/t-softfp: Likewise.

Best regards,

Thomas

diff --git a/gcc/config/arm/elf.h b/gcc/config/arm/elf.h
index 77f30554d5286bd83aeab0c8dc308cfd44e732dc..246de5492665ba2a0292736a9c53fbaaef184d72 100644
--- a/gcc/config/arm/elf.h
+++ b/gcc/config/arm/elf.h
@@ -148,8 +148,9 @@
   while (0)

 /* Horrible hack: We want to prevent some libgcc routines being included
-   for some multilibs.  */
-#ifndef __ARM_ARCH_6M__
+   for some multilibs.  The condition should match the one in
+   libgcc/config/arm/lib1funcs.S.  */
+#if __ARM_ARCH_ISA_ARM || __ARM_ARCH_ISA_THUMB != 1
 #undef L_fixdfsi
 #undef L_fixunsdfsi
 #undef L_truncdfsf2
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 04ca17656f2f26dda710e8a0f9ca77dd963ab39b..38151375c29cd007f1cc34ead3aa495606224061 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -3320,10 +3320,8 @@ proc check_effective_target_arm_cortex_m { } {
 	return 0
 }
 return [check_no_compiler_messages arm_cortex_m assembly {
-	#if !defined(__ARM_ARCH_7M__) \
-&& !defined (__ARM_ARCH_7EM__) \
-&& !defined (__ARM_ARCH_6M__)
-	#error !__ARM_ARCH_7M__ && !__ARM_ARCH_7EM__ && !__ARM_ARCH_6M__
+	#if defined(__ARM_ARCH_ISA_ARM)
+	#error __ARM_ARCH_ISA_ARM is defined
 	#endif
 	int i;
 } "-mthumb"]
diff --git a/libgcc/config/arm/bpabi-v6m.S b/libgcc/config/arm/bpabi-v6m.S
index 5d35aa6afca224613c94cf923f8a2ee8dac949f2..27f33a4e8ced2cb2da8e38f5d78501954ee7363b 100644
--- a/libgcc/config/arm/bpabi-v6m.S
+++ b/libgcc/config/arm/bpabi-v6m.S
@@ -1,4 +1,5 @@
-/* Miscellaneous BPABI functions.  ARMv6M implementation
+/* Miscellaneous BPABI functions.  Thumb-1 implementation, suitable for ARMv4T,
+   ARMv6-M and ARMv8-M Baseline like ISA variants.

Copyright (C) 2006-2016 Free Software Foundation, Inc.
Contributed by CodeSourcery.
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 375a5135110895faa44267ebee045fd315515027..951dcda1c3bf7f323423a3e2813bdf0501653016 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -124,10 +124,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
  && !defined(__thumb2__)		\
  && (!defined(__THUMB_INTERWORK__)	\
 	 || defined

[Bug c++/71143] [7 Regression] bogus error: ‘A’ is not a base of ‘B< >’

2016-06-17 Thread vp at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71143

--- Comment #7 from Vidya Praveen  ---
Author: jason
Date: Fri Jun 17 16:35:33 2016
New Revision: 237558

URL: https://gcc.gnu.org/viewcvs?rev=237558=gcc=rev
Log:
PR c++/71209 - wrong error with dependent base

* typeck.c (finish_class_member_access_expr): Avoid "not a base"
warning when there are dependent bases.

Added:
trunk/gcc/testsuite/g++.dg/template/dependent-base1.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/typeck.c

[Bug tree-optimization/71550] [7 Regression] wrong code at -O3 on x86_64-linux-gnu

2016-06-17 Thread law at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71550

Jeffrey A. Law  changed:

   What|Removed |Added

 CC||law at redhat dot com
   Assignee|unassigned at gcc dot gnu.org  |law at redhat dot com

--- Comment #3 from Jeffrey A. Law  ---
I suspect this is a check that exists for the old style threader, but which is
not in the backwards threader.  Namely that when threading through a loop
header, we have to verify that the final block in the path dominates the latch.

[Bug c/71560] union compound literal initializes wrong union field

2016-06-17 Thread msebor at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71560

Martin Sebor  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-06-17
 CC||msebor at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Martin Sebor  ---
Confirmed as a documentation bug.  The quoted sentence:

  Compound literals for scalar types and union types are also allowed, but then
the compound literal is equivalent to a cast.

is incorrect because a compound literal is an lvalue while a cast yields an
rvalue.  So for example:

  int i;
  i = ++(int){ 123.4 };

is valid and initializes i to 124, while

  i = ++(int)123;

is not valid.  This is reflected in footnote 99 in C11 which clarifies the
semantics of compound literals by saying:

99) Note that this differs from a cast expression.  For example, a cast
specifies a conversion to scalar types or void only, and the result of a cast
expression is not an lvalue.
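
The lvalue/rvalue distinction described above can be checked with a short C fragment.  This is only an illustrative sketch: the function names are hypothetical and are not part of the bug report, and only the `++(int){ 123.4 }` expression comes from the comment itself.

```c
/* A compound literal designates an unnamed object, so it is an lvalue:
   it can be incremented and its address can be taken.  A cast yields a
   plain value, so `++(int)123` or `&(int)123` would be rejected.  */

/* Returns 124: 123.4 is converted to the int 123 when the unnamed object
   is initialized, and that object is then incremented.  */
int increment_compound_literal(void)
{
    return ++(int){ 123.4 };
}

/* Returns 1: the unnamed object has an address, so it can be modified
   through a pointer while the enclosing block is still active.  */
int compound_literal_has_address(void)
{
    int *p = &(int){ 5 };
    *p += 1;
    return *p == 6;
}
```

Replacing either compound literal with the corresponding cast should make the fragment fail to compile, which is the difference the documentation sentence glosses over.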

[Bug c++/71569] New: [5/6] Crash: External definition of template member from template struct

2016-06-17 Thread oliver.tale at web dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71569

Bug ID: 71569
   Summary: [5/6] Crash: External definition of template member
from template struct
   Product: gcc
   Version: 5.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: oliver.tale at web dot de
  Target Milestone: ---

Created attachment 38718
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38718=edit
gcc -v -save-temps crash.cpp

Check out this:


template 
class Foo
{
template 
static bool Bar;
};

template
template
bool Foo::Bar;

int main()
{ }


I know that it's not valid code, because versions <= 4.9.3 report the following:

"5 : error: template declaration of 'bool Bar'
static bool Bar;
^
10 : error: expected initializer before '<' token
bool Foo::Bar;
^"

But since 5.1 and up to 6.1.1 - it just crashes:

"crash.cpp:10:14: internal compiler error: in determine_specialization, at
cp/pt.c:2075
bool Foo::Bar;"
See the attachment for stack trace.

I couldn't get my hands on 4.9.4, 5.0, or anything newer than 6.1.1; maybe
someone can track down the exact version where this changed.
It must be between 4.9.3 and 5.1.

[Bug c++/71143] [7 Regression] bogus error: ‘A’ is not a base of ‘B< >’

2016-06-17 Thread trippels at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71143

Markus Trippelsdorf  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Markus Trippelsdorf  ---
fixed

Re: [PATCH] Add port for Phoenix-RTOS on ARM platform.

2016-06-17 Thread Jeff Law


On 06/17/2016 07:07 AM, Jakub Sejdak wrote:

So at least in the immediate term let's get you write privileges so you can
commit approved changes and on the path towards maintaining the Phoenix-RTOS
configurations.


Do I have to apply for this permission somewhere? The provided page states
only that it has to be granted by an existing maintainer.

Yes, there's a link to this form:

https://sourceware.org/cgi-bin/pdw/ps_form.cgi

List my email address (l...@redhat.com) as approving your request for 
write access.


jeff

Re: [PATCH] Fix memory leak in tree-ssa-reassoc.c

2016-06-17 Thread Jeff Law


On 06/17/2016 07:14 AM, Martin Liška wrote:

Hi.

The following simple patch fixes a newly introduced memory leak.

Patch survives regression tests and bootstraps on x86_64-linux.

Ready for trunk?
Thanks,
Martin


0001-Fix-memory-leak-in-tree-ssa-reassoc.c.patch


From a2e6be16d7079b744db4d383b8317226ab53ff58 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 17 Jun 2016 12:26:58 +0200
Subject: [PATCH] Fix memory leak in tree-ssa-reassoc.c

gcc/ChangeLog:

2016-06-17  Martin Liska  

* tree-ssa-reassoc.c (transform_add_to_multiply): Use auto_vec.

OK.

And more generally, conversion from vec to auto_vec to fix memory leaks 
or eliminate explicit memory management is pre-approved.


Jeff

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Jeff Law


On 06/17/2016 08:33 AM, Ilya Enkovich wrote:


Hmm, there seems to be a level of indirection I'm missing here.  We're
smuggling LOOP_VINFO_ORIG_LOOP_INFO around in loop->aux.  Ewww.  I thought
the whole point of LOOP_VINFO_ORIG_LOOP_INFO was to smuggle the VINFO from
the original loop to the vectorized epilogue.  What am I missing?  Rather
than smuggling around in the aux field, is there some inherent reason why we
can't just copy the info from the original loop directly into
LOOP_VINFO_ORIG_LOOP_INFO for the vectorized epilogue?


LOOP_VINFO_ORIG_LOOP_INFO is used for several things:
 - mark this loop as epilogue
 - get VF of original loop (required for both mask and nomask modes)
 - get decision about epilogue masking

That's all.  When epilogue is created it has no LOOP_VINFO.  Also when we
vectorize loop we create and destroy its LOOP_VINFO multiple times.  When
loop has LOOP_VINFO loop->aux points to it and original LOOP_VINFO is in
LOOP_VINFO_ORIG_LOOP_INFO.  When Loop has no LOOP_VINFO associated I have no
place to bind it with the original loop and therefore I use vacant loop->aux
for that.  Any other way to bind epilogue with its original loop would work
as well.  I just chose loop->aux to avoid new fields and data structures.
I was starting to draw the conclusion that the smuggling in the aux
field was for cases when there was no LOOP_VINFO.  But it was rather late
at night and I didn't follow that idea through the code.  Thanks for
clarifying.





And something just occurred to me -- is there some inherent reason why SLP
doesn't vectorize the epilogue, particularly for the cases where we can
vectorize the epilogue using smaller vectors?  Sorry if you've already
answered this somewhere or it's a dumb question.


IIUC this may happen only if we unroll epilogue into a single BB which happens
only when epilogue iterations count is known. Right?
Probably.  The need to make sure the epilogue is unrolled probably makes 
this a non-starter.


I have a soft spot for SLP as I stumbled on the idea while rewriting a 
presentation in the wee hours of the morning for the next day. 
Essentially it was a "poor man's" vectorizer that could be done for 
dramatically less engineering cost than a traditional vectorizer.  The 
MIT paper outlining the same ideas came out a couple years later...




+   /* Add new loop to a processing queue.  To make it easier

+  to match loop and its epilogue vectorization in dumps
+  put new loop as the next loop to process.  */
+   if (new_loop)
+ {
+   loops.safe_insert (i + 1, new_loop->num);
+   vect_loops_num = number_of_loops (cfun);
+ }
+


So just to be clear, the only reason to do this is for dumps -- other than
processing the loop before its epilogue, there's no other inherently
necessary ordering of the loops, right?


Right, I don't see other reasons to do it.

Perfect.  Thanks for confirming.

jeff

[Bug c++/71209] [c++] erroneous 'is not a base class of' error

2016-06-17 Thread jason at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71209

--- Comment #2 from Jason Merrill  ---
Author: jason
Date: Fri Jun 17 16:35:33 2016
New Revision: 237558

URL: https://gcc.gnu.org/viewcvs?rev=237558=gcc=rev
Log:
PR c++/71209 - wrong error with dependent base

* typeck.c (finish_class_member_access_expr): Avoid "not a base"
warning when there are dependent bases.

Added:
trunk/gcc/testsuite/g++.dg/template/dependent-base1.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/typeck.c

C++ PATCH for c++/71143, 71209 (bogus error with dependent base)

2016-06-17 Thread Jason Merrill

Now that we have stopped treating *this as a dependent scope, we need
to avoid giving errors for not finding things when we have dependent
bases.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit d553bc7ff104a8d973c3f48c005457038422db26
Author: Jason Merrill 
Date:   Fri Jun 17 12:16:00 2016 -0400

PR c++/71209 - wrong error with dependent base

* typeck.c (finish_class_member_access_expr): Avoid "not a base"
warning when there are dependent bases.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 2ccd2da..3704b88 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -2797,6 +2797,8 @@ finish_class_member_access_expr (cp_expr object, tree 
name, bool template_p,
return error_mark_node;
  if (!access_path)
{
+ if (any_dependent_bases_p (object_type))
+   goto dependent;
  if (complain & tf_error)
error ("%qT is not a base of %qT", scope, object_type);
  return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/template/dependent-base1.C 
b/gcc/testsuite/g++.dg/template/dependent-base1.C
new file mode 100644
index 000..392305b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/dependent-base1.C
@@ -0,0 +1,10 @@
+// PR c++/71209
+
+struct A {
+  int table_clear;
+};
+
+template 
+struct B : T {
+  B() { this->A::table_clear; }
+};

Re: RFC: pass to warn on questionable uses of alloca().

2016-06-17 Thread Jeff Law


On 06/16/2016 02:32 AM, Aldy Hernandez wrote:

Hi folks!

I've been working on a plugin to warn on unbounded uses of alloca() to
help find questionable uses in glibc and other libraries.  It occurred
to me that the broader community could benefit from it, as it has found
quite a few interesting cases. So, I've reimplemented it as an actual
pass, lest it be lost in plugin la-la land and bit-rot.

And just to provide more background.

In my time caretaking glibc for Red Hat, unbound allocas were the single 
most commonly exploited problem in glibc.  They can be used for stack 
shifting or under-allocating objects which in turn allow for the bad 
guys to start scribbling data into memory at locations under the 
attacker's control.


In fact, I saw this enough that I'm of the opinion that we as developers 
simply aren't capable of using alloca correctly and that its explicit 
use ought to be banned by policy.  Anyway.





Before I sink any more time cleaning it up, would this be something
acceptable into the compiler?  It doesn't have anything glibc specific,
except possibly the following idiom which I allow:

I strongly believe it ought to be cleaned up and brought into GCC.



p.s. The pass currently warns on all uses of VLAs.  I'm not completely
sold on this idea, so perhaps we could remove it, or gate it with a flag.
A VLA where the size is under attacker control is no different than an
unbound or overflowing alloca.  Negative sizes in particular are easy to 
exploit, though the same effect can be achieved overflowing the actual 
size computation.


So I think this problem turns into whether or not we can see the size of 
the allocated object and use that to guide warning.  This also 
introduces the idea of somehow marking objects which are under user 
control (and propagating that property) and using that to help guide 
analysis.


Jeff
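
The two failure modes mentioned in this message (an unbounded stack allocation and an overflowing size computation) can be avoided with a pattern along the following lines.  This is a minimal sketch under stated assumptions, not the pass being proposed and not glibc code: it deliberately uses a fixed-size stack buffer with a heap fallback instead of alloca, and the names `MAX_STACK_ELEMS`, `checked_size` and `sum_with_scratch` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on the number of elements kept on the stack.  */
#define MAX_STACK_ELEMS 256

/* Multiply count by elem_size, reporting overflow instead of wrapping.
   A wrapped product is how an allocation ends up smaller than the data
   later written into it.  */
bool checked_size(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;
    *out = count * elem_size;
    return true;
}

/* Sum `count` ints through a scratch copy.  Small requests use a
   fixed-size stack buffer and larger ones go to the heap, so stack usage
   never depends on caller-supplied sizes.  Returns false on size
   overflow or allocation failure.  */
bool sum_with_scratch(const int *src, size_t count, long *sum)
{
    int stack_buf[MAX_STACK_ELEMS];
    int *scratch = stack_buf;
    size_t bytes;

    if (!checked_size(count, sizeof *src, &bytes))
        return false;
    if (count > MAX_STACK_ELEMS) {
        scratch = malloc(bytes);
        if (scratch == NULL)
            return false;
    }
    if (bytes != 0)
        memcpy(scratch, src, bytes);

    *sum = 0;
    for (size_t i = 0; i < count; i++)
        *sum += scratch[i];

    if (scratch != stack_buf)
        free(scratch);
    return true;
}
```

The design choice here is that the only stack allocation has a compile-time size, which is the property a warning pass of the kind described would presumably be able to verify.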

[Bug c++/71143] [7 Regression] bogus error: ‘A’ is not a base of ‘B< >’

2016-06-17 Thread jason at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71143

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

--- Comment #5 from Jason Merrill  ---
This testcase is ill-formed, no diagnostic required, but the 71209 testcase is
well-formed, so let's use that one instead:

class A {
  int table_clear;
};

template 
class B : T {
  B() { this->A::table_clear; }
};

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Jeff Law


On 06/17/2016 08:29 AM, Michael Matz wrote:

Hi,

On Fri, 17 Jun 2016, Bernd Schmidt wrote:


On 06/17/2016 04:03 PM, Michael Matz wrote:

But does this really improve something?  Essentially you're replacing

   0xc9 0xc3 

(the end of a function containing "leave;ret") with

   0xe9  

where the four random bytes are different for each rewritten function
return (but correlated as they differ exactly by their position
difference).

I'm not sure why the latter sequence is better?


I think I'm missing what you're trying to say. The latter sequence does not
contain a return opcode hence it ought to be better?


The "0xe9 " essentially is the leave+return opcode,
after all it jumps to them (let's ignore the possibility that the jump
target address might contain a 0xc3 byte).  So if the attacker finds some
interesting gadget in  I don't see how the change from
leave+ret to jump-to-leave+ret changes anything from a threat avoidance
perspective.  It's fully possible that I don't understand the threat
vector of ROP correctly, in which case I'd also like to know :)

A couple things to note.

I expect that we'll be doing work in the assembler and linker to address 
cases where 0xc3 shows up in immediate displacements, absolute addresses 
and the like.  The easiest ones are when 0xc3 shows up as a byte 
displacement for pc-relative jumps, but there are others.


I haven't looked at the random bytes stuff Bernd has done, but it's
likely to ensure that the bad guys can't jump into the middle of an 
instruction prior to the leave and use that to skip the leave.


Jeff
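
To make the point about stray 0xc3 bytes concrete, here is a small sketch of the kind of scan an attacker (or a hardening tool) might run over a code region.  It is an assumption-laden illustration only: it does no instruction decoding, and the name `count_ret_bytes` is hypothetical rather than anything in GCC or binutils.

```c
#include <stddef.h>
#include <stdint.h>

#define OPCODE_RET 0xc3  /* x86 near return */

/* Count the bytes equal to 0xc3 in code[0..len).  The scan does not
   decode instructions: a byte counts whether it is a real `ret` opcode
   or part of an immediate or displacement, because a jump into the
   middle of an instruction can execute either one as a return.  */
size_t count_ret_bytes(const uint8_t *code, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        if (code[i] == OPCODE_RET)
            n++;
    return n;
}
```

For instance, the five bytes `e9 c3 00 00 00` encode a `jmp rel32` with displacement 0xc3: the sequence contains no return instruction, yet the scan still reports one usable 0xc3 byte.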

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Bin.Cheng

On Fri, Jun 17, 2016 at 4:37 PM, Jeff Law  wrote:
> On 06/17/2016 08:48 AM, Bin.Cheng wrote:


> +  /* FORNOW: Currently alias checks are not inherited for epilogues.
> + Don't try to vectorize epilogue because it will require
> + additional alias checks.  */


 Are the alias checks here redundant with the ones done for the original
 loop?  If so won't DOM eliminate them?
>>>
>>>
>>> I revisited this part recently and thought it should actually be safe to
>>> assume we have no aliasing in epilogue because we are dominated by alias
>>> checks of the original loop.  So I prepared a patch to remove this
>>> restriction
>>> and avoid alias checks generation for epilogues (so we compute aliases
>>> checks
>>> required but don't emit them).  I didn't send this patch yet.
>>> Do you think it is a valid assumption?
>>
>> I recently visited that part and agree it's valid, unless epilogue
>> loop is vectorized in larger vector-units, but that would be unlikely
>> to happen, right?  BTW, does this patch start all over analyzing
>> epilogue loop?  As you said the alias checks will be computed.
>
> I think we're OK either way.  If you emit the checks, DOM ought to eliminate
> them as they'd be dominated by the earlier check.
Unfortunately DOM probably can't.  Especially constant offsets are
folded deep in expressions and they could be different under smaller
vector-units.  Even if it can, it will introduce a long live range since
check result will be combined with some others.  Not sure if all
checks can be avoided, alignment checks should be ok too?

Thanks,
bin
>
> But I'm a fan of not generating dumb code for later passes to clean up, so I
> think we should just avoid generating the additional checks if we can
> reasonably do so in the vectorizer.
>
> I can't envision a scenario where we'd want a larger vector size in the
> epilogue than the main loop.
>
> Jeff
>

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Jeff Law


On 06/17/2016 04:06 AM, Bernd Schmidt wrote:

This is another step to flesh out -mmitigate-rop for i386 a little more.
The basic idea was (I think) Richard Henderson's: if we could arrange to
have every return preceded by a leave instruction, it would make it
harder to construct an attack since it takes away a certain amount of
control over the stack pointer. I extended this to move the leave/ret
pair to libgcc, preceded by a sequence of nops, so as to take away the
possibility of jumping into the middle of an instruction preceding the
leave/ret pair and thereby skipping the leave.
I don't think anyone on our team can take credit for the idea.  We found 
that folks working in this space were calling out leave;ret as being 
harder to exploit.


The key being that to use leave;ret they have to control the frame 
pointer and the saved return address.  Typically they have control of 
just the saved return address.




This has a performance impact when -mmitigate-rop is enabled, I made
some measurements a while ago and it looks like it's about twice the
impact of -fno-omit-frame-pointer.
Right.  My idea is to use this mitigation for functions which aren't 
protected by SSP (fixing the SSP epilogues is a distinct project, 
Florian should have some details on what we need to do to make those 
difficult to attack).  So we're not paying the cost on every function, 
just those which aren't protected by SSP.


Jeff

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Jeff Law


On 06/17/2016 08:16 AM, Ilya Enkovich wrote:


I do think you've got a legitimate question though.   Ilya, can you give any
insights here based on your KNL and Haswell testing or data/insights from
the LLVM and/or ICC teams?


I have no information about LLVM.  As I said in other thread ICC uses all
options (masked epilogue, combined loop, vectorized epilogue with smaller
vector size).  It also may generate different versions (e.g. combined and
with masked epilogue) and choose dynamically depending on iterations count.
Any guidance from the ICC team on the costing model to choose between 
the different approaches?


I'm a bit surprised that there's enough value in doing this much work to 
vectorize the epilogue, but that appears to be the case...


jeff
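
One of the options listed above, a vectorized epilogue with a smaller vector size, can be pictured at the source level roughly as follows.  This is a hand-written sketch and not compiler output: the vectorization factors `MAIN_VF` and `EPILOGUE_VF` are hypothetical, and each fixed-trip inner loop merely stands in for a single vector operation.

```c
#include <stddef.h>

#define MAIN_VF 8      /* hypothetical vectorization factor of the main loop */
#define EPILOGUE_VF 4  /* hypothetical smaller factor used for the epilogue */

/* dst[i] = a[i] + b[i], laid out the way a vectorizer might split it.  */
void add_arrays(int *dst, const int *a, const int *b, size_t n)
{
    size_t i = 0;

    /* Main loop: MAIN_VF elements per iteration.  */
    for (; i + MAIN_VF <= n; i += MAIN_VF)
        for (size_t lane = 0; lane < MAIN_VF; lane++)
            dst[i + lane] = a[i + lane] + b[i + lane];

    /* Vectorized epilogue: at most MAIN_VF - 1 = 7 elements remain, so
       with EPILOGUE_VF = 4 this loop runs at most once.  */
    for (; i + EPILOGUE_VF <= n; i += EPILOGUE_VF)
        for (size_t lane = 0; lane < EPILOGUE_VF; lane++)
            dst[i + lane] = a[i + lane] + b[i + lane];

    /* Scalar tail: fewer than EPILOGUE_VF elements are left.  */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

With n = 15 the work splits into one main iteration (8 elements), one epilogue iteration (4 elements) and three scalar iterations.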

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Jeff Law


On 06/17/2016 08:48 AM, Bin.Cheng wrote:



+  /* FORNOW: Currently alias checks are not inherited for epilogues.
+ Don't try to vectorize epilogue because it will require
+ additional alias checks.  */


Are the alias checks here redundant with the ones done for the original
loop?  If so won't DOM eliminate them?


I revisited this part recently and thought it should actually be safe to
assume we have no aliasing in epilogue because we are dominated by alias
checks of the original loop.  So I prepared a patch to remove this restriction
and avoid alias checks generation for epilogues (so we compute aliases checks
required but don't emit them).  I didn't send this patch yet.
Do you think it is a valid assumption?

I recently visited that part and agree it's valid, unless epilogue
loop is vectorized in larger vector-units, but that would be unlikely
to happen, right?  BTW, does this patch start all over analyzing
epilogue loop?  As you said the alias checks will be computed.
I think we're OK either way.  If you emit the checks, DOM ought to 
eliminate them as they'd be dominated by the earlier check.


But I'm a fan of not generating dumb code for later passes to clean up, 
so I think we should just avoid generating the additional checks if we 
can reasonably do so in the vectorizer.


I can't envision a scenario where we'd want a larger vector size in the 
epilogue than the main loop.


Jeff

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread jakub at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

--- Comment #7 from Jakub Jelinek  ---
I've created the table just by walking through the 'D' handling.
That said, looking e.g. at
https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
for the C comparisons (EQ, NE, LT, LE, GT, GE) EQ and NE should not raise
FE_INVALID, while LT, LE, GT, GE should.
And, if one or both operands are NaNs, then NE must be true, while EQ, LT, LE,
GT, GE false.
So, at least for these, the 'D' numbers in the table are right, i.e.
EQ 0, NE 4, GT 0xe, GE 0xd, LT 1, LE 2.  Let me investigate the other
comparison codes.
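
The NaN rule stated above (NE true, all of EQ, LT, LE, GT, GE false) can be observed from plain C.  The sketch below only checks the comparison results, not whether FE_INVALID is raised; the `CMP_*` bit names and `compare_mask` are hypothetical helpers, and the code assumes the compiler is not run with options such as -ffast-math that relax IEEE semantics.

```c
#include <math.h>

/* One bit per C comparison operator.  */
enum {
    CMP_EQ = 1,
    CMP_NE = 2,
    CMP_LT = 4,
    CMP_LE = 8,
    CMP_GT = 16,
    CMP_GE = 32
};

/* Return a mask with one bit set for each comparison of x and y that
   evaluates to true.  */
unsigned compare_mask(double x, double y)
{
    unsigned mask = 0;

    if (x == y) mask |= CMP_EQ;
    if (x != y) mask |= CMP_NE;
    if (x <  y) mask |= CMP_LT;
    if (x <= y) mask |= CMP_LE;
    if (x >  y) mask |= CMP_GT;
    if (x >= y) mask |= CMP_GE;
    return mask;
}
```

Calling `compare_mask(NAN, 1.0)` should yield only `CMP_NE`, which is the behaviour the immediates chosen for the vector compares have to reproduce.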

Setting alias set and vuse/vdef on gimple statements

2016-06-17 Thread Kyrill Tkachov


Hi all,

I'm working on a tree-ssa pass to implement PR 22141, a pass that merges 
adjacent stores.
I've gotten to the point where I can identify the adjacent accesses, merge them 
into a single value
and am now working on emitting the new statements but, as I don't have a lot of 
experience with the gimple
machinery, am not sure what to do about alias sets and other bookkeeping.

At the point where I'm emitting the single wide store to replace a number of 
narrow consecutive stores
I construct a MEM_REF that I assign the wide merged value to.
I think I need to also set its alias info but am not sure how to construct it.

Conceptually I need the disjunction of all the alias sets of the stores that 
the new store replaces
but I'm not sure how to get that. I can get the alias set of a single gimple 
statement through
get_alias_set of the LHS of each gimple assignment but how do I merge them?
I don't see a helper function for that that springs to mind...

Also, from what I understand gimple statements that write to memory have these 
vdef operands but I'm
not sure what the vdef operand for the new store that replaces the series of 
adjacent stores should be
set to (or how to construct it).

Any guidance on this would be very appreciated.

Thanks,
Kyrill
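
For readers unfamiliar with the transformation discussed in this message, the effect of merging adjacent stores can be sketched at the source level as below.  This illustrates only the intended result, not the GIMPLE pass or any GCC internal API; the function names are hypothetical, and the merged constant is chosen by a run-time byte-order probe that assumes the host is either little- or big-endian.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the least significant byte is stored first.  */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Four adjacent narrow stores, as the source might be written.  */
void store_narrow(uint8_t *p)
{
    p[0] = 0x11;
    p[1] = 0x22;
    p[2] = 0x33;
    p[3] = 0x44;
}

/* The same memory effect as one wide store: the four byte values are
   combined into a single 32-bit constant laid out for the host byte
   order, then written in one operation.  */
void store_merged(uint8_t *p)
{
    const uint32_t wide = host_is_little_endian() ? 0x44332211u
                                                  : 0x11223344u;

    memcpy(p, &wide, sizeof wide);
}
```

Both functions leave the same four bytes in memory, which is the equivalence a store-merging pass has to preserve while also keeping the aliasing information of the replaced stores conservative.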

[Bug libstdc++/71545] [6/7 Regression] Incorrect irreflexive comparison debug check in std::lower_bound

2016-06-17 Thread redi at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71545

--- Comment #2 from Jonathan Wakely  ---
The irreflexive assertion is incorrect for lexicographical compare too:

#include <algorithm>

struct X { };

bool operator<(X, int) { return true; }
bool operator<(int, X) { return false; }

// Not a strict weak order
bool operator<(X, X) { return true; }

int main()
{
  X x[1];
  int i[1];
  std::lexicographical_compare(x, x+1, i, i+1);
}

This fails in Debug Mode because operator<(X, X) doesn't define a strict weak
order, but that operator is not used by the algorithm.

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread tripiana at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

--- Comment #6 from Carlos Tripiana Montes  ---
Hope you guys can do that and fix this problem soon. It would be much appreciated
as we are (at Barcelona Supercomputing Center, Spain) doing research on KNL and
we need the fastest/most robust code we can obtain.

Thanks in advance.

(In reply to Ilya Enkovich from comment #5)
> (In reply to Jakub Jelinek from comment #3)
> > I'd say it is a bug in ix86_fp_cmp_code_to_pcmp_immediate that it handles
> > only a small portion of the FP comparison codes, while VCMPP[SD]
> > instructions should be able to handle everything needed.
> > But, I'm also surprised where the values in that function come from.
> > Looking at the D modifier expansion in i386.c that is used for AVX vcmp, I
> > see that:
> > code       %D3 emits  corresponding imm  ix86_fp_cmp_code_to_pcmp_immediate
> > UNEQ       eq_us      0x18               ICE
> > EQ         eq         0                  8
> > UNLT       nge        9                  ICE
> > LT         lt         1                  0x19
> > UNLE       ngt        0xa                ICE
> > LE         le         2                  0x1a
> > UNORDERED  unord      3                  ICE
> > LTGT       neq_oq     0xc                ICE
> > NE         neq        4                  4
> > GE         ge         0xd                0x15
> > UNGE       nlt        5                  ICE
> > GT         gt         0xe                0x16
> > UNGT       nle        6                  ICE
> > ORDERED    ord        7                  ICE
> > So, there is agreement only on NE and nothing else.
> 
> I took values for ix86_fp_cmp_code_to_pcmp_immediate from some table Kirill
> gave me and seems all values I used were unordered non-signaling variants. 
> I don't fully understand the logic of imms in this table.  Some of them are
> signalling and some of them are not.  Also why is NE unordered?  But I
> suspect there is good reason for such choice and suppose AVX512 compares
> should be put in a consistent state with AVX ones by fixing
> ix86_fp_cmp_code_to_pcmp_immediate appropriately.

[Bug c++/71568] New: Inexplicable error: "X is inaccessible within this context" for a public member

2016-06-17 Thread richardg.work at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71568

Bug ID: 71568
   Summary: Inexplicable error: "X is inaccessible within this
context" for a public member
   Product: gcc
   Version: 6.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: richardg.work at gmail dot com
  Target Milestone: ---

Created attachment 38717
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38717=edit
Demonstration of the error & a possible workaround

The attached 90-line test.cpp fails to compile with an inexplicable error
message: "X is inaccessible within this context".  The X is clearly a public
member, so this is not a C++ private member issue.  Critically, the same code
compiles without issue with the clang compiler (clang 3.7.1).

Note : Compile test.cpp with -std=c++14.

I've seen this error in unrelated template code before, but this time I've been
able to simplify it to a 90 line repro.  

Godbolt demonstrating the error : https://godbolt.org/g/Z4hDez

/tmp/gcc-explorer-compiler116517-85-12v5300/example.cpp: In instantiation of
'struct has_nlog_custom':
35 : required by substitution of 'template typename
std::enable_if::type write(NLogger&, const
Head&) [with Head = std::tuple]'
70 : required from 'typename std::enable_if<(!
has_nlog_custom::value)>::type log(NLogger&, const Head&) [with Head =
LogStructN; typename std::enable_if<(! has_nlog_custom::value)>::type =
void]'
88 : required from here
21 : error: 'void Foo0::nlog_custom(NLogger&) const' is inaccessible within
this context
struct has_nlog_custom> : std::true_type
{};
^~
78 : note: declared here
void nlog_custom(NLogger & nl) const {}
^~~
21 : error: 'void Foo0::nlog_custom(NLogger&) const' is inaccessible within
this context
struct has_nlog_custom> : std::true_type
{};
^~
78 : note: declared here
void nlog_custom(NLogger & nl) const {}
^~~
Compilation failed


I have a workaround, changing the #if between 0 & 1 toggles the error.  But I
don't believe that I've found the root cause of the issue.  It would be very
helpful to either diagnose this as a gcc bug, or have gcc provide more
diagnostic information on the root cause.

Thanks

[Bug libgcc/71559] ICE in ix86_fp_cmp_code_to_pcmp_immediate, at config/i386/i386.c:23042 (KNL/AVX512)

2016-06-17 Thread ienkovich at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71559

Ilya Enkovich  changed:

   What|Removed |Added

 CC||ienkovich at gcc dot gnu.org

--- Comment #5 from Ilya Enkovich  ---
(In reply to Jakub Jelinek from comment #3)
> I'd say it is a bug in ix86_fp_cmp_code_to_pcmp_immediate that it handles
> only a small portion of the FP comparison codes, while VCMPP[SD]
> instructions should be able to handle everything needed.
> But, I'm also surprised where the values in that function come from.
> Looking at the D modifier expansion in i386.c that is used for AVX vcmp, I
> see that:
> code       %D3 emits  corresponding imm  ix86_fp_cmp_code_to_pcmp_immediate
> UNEQ       eq_us      0x18               ICE
> EQ         eq         0                  8
> UNLT       nge        9                  ICE
> LT         lt         1                  0x19
> UNLE       ngt        0xa                ICE
> LE         le         2                  0x1a
> UNORDERED  unord      3                  ICE
> LTGT       neq_oq     0xc                ICE
> NE         neq        4                  4
> GE         ge         0xd                0x15
> UNGE       nlt        5                  ICE
> GT         gt         0xe                0x16
> UNGT       nle        6                  ICE
> ORDERED    ord        7                  ICE
> So, there is agreement only on NE and nothing else.

I took values for ix86_fp_cmp_code_to_pcmp_immediate from some table Kirill
gave me and seems all values I used were unordered non-signaling variants.  I
don't fully understand the logic of imms in this table.  Some of them are
signalling and some of them are not.  Also why is NE unordered?  But I suspect
there is good reason for such choice and suppose AVX512 compares should be put
in a consistent state with AVX ones by fixing
ix86_fp_cmp_code_to_pcmp_immediate appropriately.
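
To make the mismatch in the table quoted above easier to check, here is a
minimal Python sketch that records both columns and sorts the comparison codes
into "agree", "differ" and "unhandled".  The dictionary and function names are
made up for illustration; the values are copied from the quoted table, with
None standing in for the ICE entries.  It is not code from i386.c.

```python
# Immediates produced by the %D operand modifier for AVX vcmp,
# copied from the table quoted above.
AVX_D_MODIFIER_IMM = {
    "UNEQ": 0x18, "EQ": 0x0, "UNLT": 0x9, "LT": 0x1,
    "UNLE": 0xA, "LE": 0x2, "UNORDERED": 0x3, "LTGT": 0xC,
    "NE": 0x4, "GE": 0xD, "UNGE": 0x5, "GT": 0xE,
    "UNGT": 0x6, "ORDERED": 0x7,
}

# Values from the ix86_fp_cmp_code_to_pcmp_immediate column of the same
# table.  None marks a code the function does not handle (the ICE rows).
PCMP_IMM = {
    "UNEQ": None, "EQ": 0x8, "UNLT": None, "LT": 0x19,
    "UNLE": None, "LE": 0x1A, "UNORDERED": None, "LTGT": None,
    "NE": 0x4, "GE": 0x15, "UNGE": None, "GT": 0x16,
    "UNGT": None, "ORDERED": None,
}


def compare_tables():
    """Split the comparison codes by how the two columns relate."""
    agree, differ, unhandled = [], [], []
    for code, avx_imm in AVX_D_MODIFIER_IMM.items():
        pcmp_imm = PCMP_IMM[code]
        if pcmp_imm is None:
            unhandled.append(code)
        elif pcmp_imm == avx_imm:
            agree.append(code)
        else:
            differ.append(code)
    return agree, differ, unhandled


if __name__ == "__main__":
    agree, differ, unhandled = compare_tables()
    print("agree:    ", agree)
    print("differ:   ", differ)
    print("unhandled:", unhandled)
```

Run as a script, this lists NE as the only code on which both columns agree,
which matches the remark at the end of the quoted comment.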

Re: [DOC PATCH] Rewrite docs for inline asm

2016-06-17 Thread Andrew Haley

On 04/04/14 20:48, dw wrote:
> I do not have write permissions to check this patch in.

We must fix that.

Andrew.

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Ilya Enkovich

2016-06-17 17:48 GMT+03:00 Bin.Cheng :
> On Fri, Jun 17, 2016 at 3:33 PM, Ilya Enkovich  wrote:
>> 2016-06-16 9:00 GMT+03:00 Jeff Law :
>>> On 05/19/2016 01:39 PM, Ilya Enkovich wrote:

>>>> Hi,
>>>>
>>>> This patch introduces changes required to run vectorizer on loop epilogue.
>>>> This also enables epilogue vectorization using a vector of smaller size.
>>>>
>>>> Thanks,
>>>> Ilya
>>>> --
>>>> gcc/
>>>>
>>>> 2016-05-19  Ilya Enkovich  
>>>>
>>>> * tree-if-conv.c (tree_if_conversion): Make public.
>>>> * tree-if-conv.h: New file.
>>>> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Don't
>>>> try to enhance alignment for epilogues.
>>>> * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Return
>>>> created loop.
>>>> * tree-vect-loop.c: include tree-if-conv.h.
>>>> (destroy_loop_vec_info): Preserve LOOP_VINFO_ORIG_LOOP_INFO in
>>>> loop->aux.
>>>> (vect_analyze_loop_form): Init LOOP_VINFO_ORIG_LOOP_INFO and reset
>>>> loop->aux.
>>>> (vect_analyze_loop): Reset loop->aux.
>>>> (vect_transform_loop): Check if created epilogue should be
>>>> returned
>>>> for further vectorization.  If-convert epilogue if required.
>>>> * tree-vectorizer.c (vectorize_loops): Add a queue of loops to
>>>> process and insert vectorized loop epilogues into this queue.
>>>> * tree-vectorizer.h (vect_do_peeling_for_loop_bound): Return
>>>> created
>>>> loop.
>>>> (vect_transform_loop): Return created loop.
>>>
>>> As Richi noted, the additional calls into the if-converter are unfortunate.
>>> I'm not sure how else to avoid them though.  It looks like we can run
>>> if-conversion on just the epilogue, so maybe that's not too bad.
>>>
>>>
>>>> @@ -1212,8 +1213,8 @@ destroy_loop_vec_info (loop_vec_info loop_vinfo,
>>>> bool clean_stmts)
>>>>    destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
>>>>    loop_vinfo->scalar_cost_vec.release ();
>>>>
>>>> +  loop->aux = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
>>>>    free (loop_vinfo);
>>>> -  loop->aux = NULL;
>>>>  }
>>>
>>> Hmm, there seems to be a level of indirection I'm missing here.  We're
>>> smuggling LOOP_VINFO_ORIG_LOOP_INFO around in loop->aux.  Ewww.  I thought
>>> the whole point of LOOP_VINFO_ORIG_LOOP_INFO was to smuggle the VINFO from
>>> the original loop to the vectorized epilogue.  What am I missing?  Rather
>>> than smuggling around in the aux field, is there some inherent reason why we
>>> can't just copy the info from the original loop directly into
>>> LOOP_VINFO_ORIG_LOOP_INFO for the vectorized epilogue?
>>
>> LOOP_VINFO_ORIG_LOOP_INFO is used for several things:
>>  - mark this loop as epilogue
>>  - get VF of original loop (required for both mask and nomask modes)
>>  - get decision about epilogue masking
>>
>> That's all.  When epilogue is created it has no LOOP_VINFO.  Also when we
>> vectorize loop we create and destroy its LOOP_VINFO multiple times.  When
>> loop has LOOP_VINFO loop->aux points to it and original LOOP_VINFO is in
>> LOOP_VINFO_ORIG_LOOP_INFO.  When Loop has no LOOP_VINFO associated I have no
>> place to bind it with the original loop and therefore I use vacant loop->aux
>> for that.  Any other way to bind epilogue with its original loop would work
>> as well.  I just chose loop->aux to avoid new fields and data structures.
>>
>>>
>>>> +  /* FORNOW: Currently alias checks are not inherited for epilogues.
>>>> + Don't try to vectorize epilogue because it will require
>>>> + additional alias checks.  */
>>>
>>> Are the alias checks here redundant with the ones done for the original
>>> loop?  If so won't DOM eliminate them?
>>
>> I revisited this part recently and thought it should actually be safe to
>> assume we have no aliasing in epilogue because we are dominated by alias
>> checks of the original loop.  So I prepared a patch to remove this restriction
>> and avoid alias checks generation for epilogues (so we compute aliases checks
>> required but don't emit them).  I didn't send this patch yet.
>> Do you think it is a valid assumption?
> I recently visited that part and agree it's valid, unless epilogue
> loop is vectorized in larger vector-units, but that would be unlikely
> to happen, right?  BTW, does this patch start all over analyzing
> epilogue loop?  As you said the alias checks will be computed.

Original loop is vectorized for the max possible vector size and we can't
(and don't want to) choose a bigger one.

We don't preserve any info for epilogue.  Actually even when we try various
vector sizes for a single loop we recompute everything for each vector size.

Thanks,
Ilya

>
> Thanks,
> bin
>>
>>>
>>>
>>> And something just occurred to me -- is there some inherent reason why SLP
>>> doesn't vectorize the epilogue, particularly for the cases

Re: [Patch AArch64] Fixup to fcvt patterns added in r237200

2016-06-17 Thread Christophe Lyon

On 17 June 2016 at 16:44, James Greenhalgh  wrote:
> On Fri, Jun 17, 2016 at 04:25:31PM +0200, Christophe Lyon wrote:
>> On 10 June 2016 at 14:29, James Greenhalgh  wrote:
>> >
>> > Hi,
>> >
>> > My autotester picked up some issues with the vcvt{ds}_n_* intrinsics
>> > added in r237200.
>> >
>> Hi,
>>
>> What tests does your autotester perform? I haven't noticed these
>> problems when running the GCC testsuite on the usual aarch64
>> targets. I'm interested in increasing coverage, if doable.
>
> Hi Christophe,
>
> I think we've spoken about this before [1], but the autotester is using
> an internal testsuite that is not feasible to share upstream.

Ha, indeed. I wasn't sure you were referring to the same tests.

>
> To see the sorts of tests that we're running, have a look at the LLVM
> testsuite. If the layout of the testsuite hasn't changed since I last
> looked, you should be able to find an example at:
>
> 
> /SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c
>
> Hope that helps,
> James
>
> [1]: https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00775.html
>

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Bin.Cheng

On Fri, Jun 17, 2016 at 3:33 PM, Ilya Enkovich  wrote:
> 2016-06-16 9:00 GMT+03:00 Jeff Law :
>> On 05/19/2016 01:39 PM, Ilya Enkovich wrote:
>>>
>>> Hi,
>>>
>>> This patch introduces changes required to run vectorizer on loop epilogue.
>>> This also enables epilogue vectorization using a vector of smaller size.
>>>
>>> Thanks,
>>> Ilya
>>> --
>>> gcc/
>>>
>>> 2016-05-19  Ilya Enkovich  
>>>
>>> * tree-if-conv.c (tree_if_conversion): Make public.
>>> * tree-if-conv.h: New file.
>>> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Don't
>>> try to enhance alignment for epilogues.
>>> * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Return
>>> created loop.
>>> * tree-vect-loop.c: include tree-if-conv.h.
>>> (destroy_loop_vec_info): Preserve LOOP_VINFO_ORIG_LOOP_INFO in
>>> loop->aux.
>>> (vect_analyze_loop_form): Init LOOP_VINFO_ORIG_LOOP_INFO and reset
>>> loop->aux.
>>> (vect_analyze_loop): Reset loop->aux.
>>> (vect_transform_loop): Check if created epilogue should be
>>> returned
>>> for further vectorization.  If-convert epilogue if required.
>>> * tree-vectorizer.c (vectorize_loops): Add a queue of loops to
>>> process and insert vectorized loop epilogues into this queue.
>>> * tree-vectorizer.h (vect_do_peeling_for_loop_bound): Return
>>> created
>>> loop.
>>> (vect_transform_loop): Return created loop.
>>
>> As Richi noted, the additional calls into the if-converter are unfortunate.
>> I'm not sure how else to avoid them though.  It looks like we can run
>> if-conversion on just the epilogue, so maybe that's not too bad.
>>
>>
>>> @@ -1212,8 +1213,8 @@ destroy_loop_vec_info (loop_vec_info loop_vinfo,
>>> bool clean_stmts)
>>>destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
>>>loop_vinfo->scalar_cost_vec.release ();
>>>
>>> +  loop->aux = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
>>>free (loop_vinfo);
>>> -  loop->aux = NULL;
>>>  }
>>
>> Hmm, there seems to be a level of indirection I'm missing here.  We're
>> smuggling LOOP_VINFO_ORIG_LOOP_INFO around in loop->aux.  Ewww.  I thought
>> the whole point of LOOP_VINFO_ORIG_LOOP_INFO was to smuggle the VINFO from
>> the original loop to the vectorized epilogue.  What am I missing?  Rather
>> than smuggling around in the aux field, is there some inherent reason why we
>> can't just copy the info from the original loop directly into
>> LOOP_VINFO_ORIG_LOOP_INFO for the vectorized epilogue?
>
> LOOP_VINFO_ORIG_LOOP_INFO is used for several things:
>  - mark this loop as epilogue
>  - get VF of original loop (required for both mask and nomask modes)
>  - get decision about epilogue masking
>
> That's all.  When epilogue is created it has no LOOP_VINFO.  Also when we
> vectorize loop we create and destroy its LOOP_VINFO multiple times.  When
> loop has LOOP_VINFO loop->aux points to it and original LOOP_VINFO is in
> LOOP_VINFO_ORIG_LOOP_INFO.  When Loop has no LOOP_VINFO associated I have no
> place to bind it with the original loop and therefore I use vacant loop->aux
> for that.  Any other way to bind epilogue with its original loop would work
> as well.  I just chose loop->aux to avoid new fields and data structures.
>
>>
>>> +  /* FORNOW: Currently alias checks are not inherited for epilogues.
>>> + Don't try to vectorize epilogue because it will require
>>> + additional alias checks.  */
>>
>> Are the alias checks here redundant with the ones done for the original
>> loop?  If so won't DOM eliminate them?
>
> I revisited this part recently and thought it should actually be safe to
> assume we have no aliasing in epilogue because we are dominated by alias
> checks of the original loop.  So I prepared a patch to remove this restriction
> and avoid alias checks generation for epilogues (so we compute aliases checks
> required but don't emit them).  I didn't send this patch yet.
> Do you think it is a valid assumption?
I recently visited that part and agree it's valid, unless epilogue
loop is vectorized in larger vector-units, but that would be unlikely
to happen, right?  BTW, does this patch start all over analyzing
epilogue loop?  As you said the alias checks will be computed.

Thanks,
bin
>
>>
>>
>> And something just occurred to me -- is there some inherent reason why SLP
>> doesn't vectorize the epilogue, particularly for the cases where we can
>> vectorize the epilogue using smaller vectors?  Sorry if you've already
>> answered this somewhere or it's a dumb question.
>
> IIUC this may happen only if we unroll epilogue into a single BB which happens
> only when epilogue iterations count is known. Right?
>
>>
>>
>>
>>>
>>> +   /* Add new loop to a processing queue.  To make it easier
>>> +  to match loop and its epilogue vectorization in dumps
>>> +  put new loop as

Re: [Patch AArch64] Fixup to fcvt patterns added in r237200

2016-06-17 Thread James Greenhalgh

On Fri, Jun 17, 2016 at 04:25:31PM +0200, Christophe Lyon wrote:
> On 10 June 2016 at 14:29, James Greenhalgh  wrote:
> >
> > Hi,
> >
> > My autotester picked up some issues with the vcvt{ds}_n_* intrinsics
> > added in r237200.
> >
> Hi,
> 
> What tests does your autotester perform? I haven't noticed these
> problems when running the GCC testsuite on the usual aarch64
> targets. I'm interested in increasing coverage, if doable.

Hi Christophe,

I think we've spoken about this before [1], but the autotester is using
an internal testsuite that is not feasible to share upstream.

To see the sorts of tests that we're running, have a look at the LLVM
testsuite. If the layout of the testsuite hasn't changed since I last
looked, you should be able to find an example at:

/SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c

Hope that helps,
James

[1]: https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00775.html

Re: [PATCH,openacc] check for compatible loop parallelism with acc routine calls

2016-06-17 Thread Jakub Jelinek

On Wed, Jun 15, 2016 at 08:12:15PM -0700, Cesar Philippidis wrote:
> The second set of changes involves teaching the gimplifier to error when
> it detects a function call to a non-acc routine inside an OpenACC
> offloaded region. Actually, I relaxed non-acc routines by excluding
> calls to builtin functions, including those prefixed with _gfortran_.
> Nvptx does have a newlib c library, and it also has a subset of
> libgfortran. Still, this solution is probably not optimal.

I don't really like that, hardcoding prefixes or whatever is available
(you have quite some subset of libc, libm etc. available too) in the
compiler looks very hackish.  What is wrong with complaining during
linking of the offloaded code?

> Next, I had to modify the openacc header files in libgomp to mark
> acc_on_device as an acc routine. Unfortunately, this meant that I had to
> build the openacc.mod module for gfortran with -fopenacc. But doing
> that caused gcc to stream offloaded code to the openacc.o object
> file. So, I've updated the behavior of flag_generate_offload such that
> minus one indicates that the user specified -foffload=disable, and that
> will prevent gcc from streaming offloaded lto code. The alternative was
> to hack libtool to build libgomp with -foffload=disable.

This also looks wrong.  I'd say the right thing is when loading modules
that have OpenACC bits set in it (and also OpenMP bits, I admit I haven't
handled this well) into CU with the corresponding flags unset (-fopenacc,
-fopenmp, -fopenmp-simd here, depending on which bit it is), then
IMHO the module loading code should just ignore it, pretend it wasn't there.
Similarly e.g. to how lto1 with -g0 should ignore debug statements that
could be in the LTO inputs.

Jakub

Re: [PATCH][ARM] Delete thumb_reload_in_h

2016-06-17 Thread Kyrill Tkachov


Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00821.html

Thanks,
Kyrill
On 10/06/16 15:55, Kyrill Tkachov wrote:
> Hi all,
>
> This function just ICEs and isn't actually called from anywhere.
> It was introduced back in 2000 as part of a large merge introducing Thumb
> support and was aborting even then. I don't think having it around is of
> any benefit.
>
> Tested on arm-none-eabi.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2016-06-10  Kyrylo Tkachov
>
> * config/arm/arm.c (thumb_reload_in_hi): Delete.
> * config/arm/arm-protos.h (thumb_reload_in_hi): Delete prototype.

Re: OpenACC wait clause

2016-06-17 Thread Jakub Jelinek

On Thu, Jun 16, 2016 at 08:22:29PM -0700, Cesar Philippidis wrote:
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -677,7 +677,6 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
> && gfc_match ("async") == MATCH_YES)
>   {
> c->async = true;
> -   needs_space = false;
> if (gfc_match (" ( %e )", &c->async_expr) != MATCH_YES)
>   {
> c->async_expr
> @@ -685,6 +684,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
>gfc_default_integer_kind,
>&gfc_current_locus);
> mpz_set_si (c->async_expr->value.integer, GOMP_ASYNC_NOVAL);
> +   needs_space = true;
>   }
> continue;
>   }
> @@ -1328,7 +1328,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
> && gfc_match ("wait") == MATCH_YES)
>   {
> c->wait = true;
> -   match_oacc_expr_list (" (", &c->wait_list, false);
> +   if (match_oacc_expr_list (" (", &c->wait_list, false) == MATCH_NO)
> + needs_space = true;
> continue;
>   }
> if ((mask & OMP_CLAUSE_WORKER)

I think it is still problematic.  Most of the Fortran FE parsing errors are
deferred, meaning that if you don't reject the whole gfc_match_omp_clauses,
then no diagnostic is actually emitted.  Both
gfc_match (" ( %e )", &c->async_expr) and
match_oacc_expr_list (" (", &c->wait_list, false)
IMHO can return MATCH_YES, MATCH_NO and MATCH_ERROR, and I believe you need
to take a different action in each case.
In particular, if something is optional, then for MATCH_YES you should
accept it (continue) and not set needs_space, because after ) you don't need
space.  If MATCH_NO, then you should accept it too (because it is optional),
and set needs_space = true; first and perhaps do whatever else you need to
do.  If MATCH_ERROR, then you should make sure not to accept it, e.g. by
doing break; or making sure continue will not be done (which one depends on
whether it might be validly parsed as some other clause, which is very
likely not the case).  In the above changes, you do it all except for the
MATCH_ERROR handling, where you still do continue; and thus I bet
diagnostics for it won't be reported.
E.g. for
!$omp acc parallel async()
!$omp acc end parallel
end
no diagnostics is reported.  Looking around, there are many more issues like
that, e.g. match_oacc_clause_gang(c) (note, wrong formatting) also ignores
MATCH_ERROR, etc.
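
The three-way rule described above can be written down as a small decision
table.  The Python sketch below is only a model of that rule, not gfortran
code: the Match enum and handle_optional_argument are hypothetical names, and
the function just returns what the caller should do (accept the clause or not,
and whether a space is needed before the next clause).

```python
from enum import Enum


class Match(Enum):
    """Stand-ins for the three results a matcher can return."""
    YES = "MATCH_YES"
    NO = "MATCH_NO"
    ERROR = "MATCH_ERROR"


def handle_optional_argument(result):
    """Return (accept_clause, needs_space) for an optional "( ... )" argument.

    Hypothetical helper that follows the rule in the message above.
    """
    if result is Match.YES:
        # The argument was parsed; no space is required after ")".
        return True, False
    if result is Match.NO:
        # The argument is optional, so its absence is fine, but the
        # next clause must then be separated by a space.
        return True, True
    # MATCH_ERROR: do not accept the clause, so that the deferred
    # diagnostic actually gets reported.
    return False, False


if __name__ == "__main__":
    for result in Match:
        accept, needs_space = handle_optional_argument(result)
        print(result.value, "-> accept:", accept, "needs_space:", needs_space)
```

The point of the third branch is that treating MATCH_ERROR like MATCH_NO (and
continuing) is what hides the diagnostic.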

> @@ -1649,7 +1650,7 @@ gfc_match_oacc_wait (void)
>gfc_expr_list *wait_list = NULL, *el;
>  
>match_oacc_expr_list (" (", &wait_list, true);
> -  gfc_match_omp_clauses (&c, OACC_WAIT_CLAUSES, false, false, true);
> +  gfc_match_omp_clauses (&c, OACC_WAIT_CLAUSES, false, true, true);
>  
>if (gfc_match_omp_eos () != MATCH_YES)
>  {

Can you explain this change?  I bet it again suffers from the above
mentioned issue.  If match_oacc_expr_list returns MATCH_YES, I believe you
want false, false, true as you don't need space in between the closing
) of the wait_list and name of next clause.  Note, does OpenACC allow also comma
in that case?
!$acc wait (whatever),async
?
If match_oacc_expr_list returns MATCH_NO, then IMHO it should be
true, true, true, because you don't want to accept
!$acc waitasync
and also don't want to accept
!$acc wait,async
And if match_oacc_expr_list returns MATCH_ERROR, you should reject it, so
that diagnostics is emitted.

Jakub

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Bernd Schmidt


On 06/17/2016 04:29 PM, Michael Matz wrote:
> On Fri, 17 Jun 2016, Bernd Schmidt wrote:
>
>> On 06/17/2016 04:03 PM, Michael Matz wrote:
>>> But does this really improve something?  Essentially you're replacing
>>>
>>>    0xc9 0xc3 
>>>
>>> (the end of a function containing "leave;ret") with
>>>
>>>    0xe9  
>>>
>>> where the four random bytes are different for each rewritten function
>>> return (but correlated as they differ exactly by their position
>>> difference).
>>>
>>> I'm not sure why the latter sequence is better?
>>
>> I think I'm missing what you're trying to say. The latter sequence does not
>> contain a return opcode hence it ought to be better?
>
> The "0xe9 " essentially is the leave+return opcode,
> after all it jumps to them (let's ignore the possibility that the jump
> target address might contain a 0xc3 byte).  So if the attacker finds some
> interesting gadget in  I don't see how the change from
> leave+ret to jump-to-leave+ret changes anything from a threat avoidance
> perspective.  It's fully possible that I don't understand the threat
> vector of ROP correctly, in which case I'd also like to know :)

The advantage is that this way the attack can't skip the leave opcode by
jumping into the "random bytes1" in your first sequence. Hence, we ensure
the return path will always overwrite esp first, which is what's supposed
to make the attack harder since now you need to control ebp as well.



Bernd

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Ilya Enkovich

2016-06-16 9:00 GMT+03:00 Jeff Law :
> On 05/19/2016 01:39 PM, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This patch introduces changes required to run vectorizer on loop epilogue.
>> This also enables epilogue vectorization using a vector of smaller size.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2016-05-19  Ilya Enkovich  
>>
>> * tree-if-conv.c (tree_if_conversion): Make public.
>> * tree-if-conv.h: New file.
>> * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Don't
>> try to enhance alignment for epilogues.
>> * tree-vect-loop-manip.c (vect_do_peeling_for_loop_bound): Return
>> created loop.
>> * tree-vect-loop.c: include tree-if-conv.h.
>> (destroy_loop_vec_info): Preserve LOOP_VINFO_ORIG_LOOP_INFO in
>> loop->aux.
>> (vect_analyze_loop_form): Init LOOP_VINFO_ORIG_LOOP_INFO and reset
>> loop->aux.
>> (vect_analyze_loop): Reset loop->aux.
>> (vect_transform_loop): Check if created epilogue should be
>> returned
>> for further vectorization.  If-convert epilogue if required.
>> * tree-vectorizer.c (vectorize_loops): Add a queue of loops to
>> process and insert vectorized loop epilogues into this queue.
>> * tree-vectorizer.h (vect_do_peeling_for_loop_bound): Return
>> created
>> loop.
>> (vect_transform_loop): Return created loop.
>
> As Richi noted, the additional calls into the if-converter are unfortunate.
> I'm not sure how else to avoid them though.  It looks like we can run
> if-conversion on just the epilogue, so maybe that's not too bad.
>
>
>> @@ -1212,8 +1213,8 @@ destroy_loop_vec_info (loop_vec_info loop_vinfo,
>> bool clean_stmts)
>>destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
>>loop_vinfo->scalar_cost_vec.release ();
>>
>> +  loop->aux = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo);
>>free (loop_vinfo);
>> -  loop->aux = NULL;
>>  }
>
> Hmm, there seems to be a level of indirection I'm missing here.  We're
> smuggling LOOP_VINFO_ORIG_LOOP_INFO around in loop->aux.  Ewww.  I thought
> the whole point of LOOP_VINFO_ORIG_LOOP_INFO was to smuggle the VINFO from
> the original loop to the vectorized epilogue.  What am I missing?  Rather
> than smuggling around in the aux field, is there some inherent reason why we
> can't just copy the info from the original loop directly into
> LOOP_VINFO_ORIG_LOOP_INFO for the vectorized epilogue?

LOOP_VINFO_ORIG_LOOP_INFO is used for several things:
 - mark this loop as epilogue
 - get VF of original loop (required for both mask and nomask modes)
 - get decision about epilogue masking

That's all.  When epilogue is created it has no LOOP_VINFO.  Also when we
vectorize loop we create and destroy its LOOP_VINFO multiple times.  When
loop has LOOP_VINFO loop->aux points to it and original LOOP_VINFO is in
LOOP_VINFO_ORIG_LOOP_INFO.  When Loop has no LOOP_VINFO associated I have no
place to bind it with the original loop and therefore I use vacant loop->aux
for that.  Any other way to bind epilogue with its original loop would work
as well.  I just chose loop->aux to avoid new fields and data structures.
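
As a rough illustration of the binding described above, the following Python
sketch models a loop whose aux slot either points at its own vectorization
info or, while it has none, at the info of the original loop.  The class and
function names are hypothetical stand-ins for loop->aux, LOOP_VINFO and
LOOP_VINFO_ORIG_LOOP_INFO, and the VF values are arbitrary; this is a model of
the idea, not the patch itself.

```python
class LoopVecInfo:
    """Toy stand-in for a loop_vec_info."""

    def __init__(self, vf, orig_loop_info=None):
        self.vf = vf                          # vectorization factor
        self.orig_loop_info = orig_loop_info  # set only for epilogues


class Loop:
    """Toy stand-in for a loop with a single spare 'aux' slot."""

    def __init__(self, num):
        self.num = num
        self.aux = None


def create_loop_vec_info(loop, vf):
    # While the loop has no info of its own, aux holds the original
    # loop's info (or None for an ordinary loop); carry it over.
    vinfo = LoopVecInfo(vf, orig_loop_info=loop.aux)
    loop.aux = vinfo
    return vinfo


def destroy_loop_vec_info(loop):
    # Instead of resetting aux to None, put the link to the original
    # loop back so that it survives the next create/destroy round.
    loop.aux = loop.aux.orig_loop_info


if __name__ == "__main__":
    main_info = LoopVecInfo(vf=8)
    epilogue = Loop(num=2)
    epilogue.aux = main_info           # bind the epilogue to its original loop
    for vf in (4, 2):                  # analysis may be retried several times
        info = create_loop_vec_info(epilogue, vf)
        print("epilogue, original VF:", info.orig_loop_info.vf)
        destroy_loop_vec_info(epilogue)
    print("link preserved:", epilogue.aux is main_info)
```

An ordinary loop starts with aux unset, so it goes through the same functions
and ends with aux unset again.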

>
>> +  /* FORNOW: Currently alias checks are not inherited for epilogues.
>> + Don't try to vectorize epilogue because it will require
>> + additional alias checks.  */
>
> Are the alias checks here redundant with the ones done for the original
> loop?  If so won't DOM eliminate them?

I revisited this part recently and thought it should actually be safe to
assume we have no aliasing in epilogue because we are dominated by alias
checks of the original loop.  So I prepared a patch to remove this restriction
and avoid alias checks generation for epilogues (so we compute aliases checks
required but don't emit them).  I didn't send this patch yet.
Do you think it is a valid assumption?

>
>
> And something just occurred to me -- is there some inherent reason why SLP
> doesn't vectorize the epilogue, particularly for the cases where we can
> vectorize the epilogue using smaller vectors?  Sorry if you've already
> answered this somewhere or it's a dumb question.

IIUC this may happen only if we unroll epilogue into a single BB which happens
only when epilogue iterations count is known. Right?

>
>
>
>>
>> +   /* Add new loop to a processing queue.  To make it easier
>> +  to match loop and its epilogue vectorization in dumps
>> +  put new loop as the next loop to process.  */
>> +   if (new_loop)
>> + {
>> +   loops.safe_insert (i + 1, new_loop->num);
>> +   vect_loops_num = number_of_loops (cfun);
>> + }
>> +
>
> So just to be clear, the only reason to do this is for dumps -- other than
> processing the loop before it's epilogue, there's no other inherently
> necessary ordering of the loops, right?

Right, I don't see other reasons to do it.
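
For illustration, the queue behaviour of the quoted hunk can be modelled in a
few lines of Python: when processing a loop produces a new epilogue loop, the
epilogue is inserted right after the current position, so it is handled next.
The epilogue_of mapping is a made-up input that says which loop number yields
which epilogue; nothing here is taken from tree-vectorizer.c beyond the
"insert at i + 1" idea.

```python
def process_loops(loop_nums, epilogue_of):
    """Return the order in which loops are processed.

    loop_nums   -- initial loop numbers, in processing order
    epilogue_of -- hypothetical map: loop number -> number of the
                   epilogue loop created when that loop is vectorized
    """
    queue = list(loop_nums)
    order = []
    i = 0
    while i < len(queue):
        num = queue[i]
        order.append(num)
        new_loop = epilogue_of.get(num)
        if new_loop is not None:
            # Put the new epilogue directly behind its loop so the two
            # show up next to each other in the processing order.
            queue.insert(i + 1, new_loop)
        i += 1
    return order


if __name__ == "__main__":
    # Loop 1 yields epilogue 4, which itself yields epilogue 5;
    # loop 3 yields epilogue 6.
    print(process_loops([1, 2, 3], {1: 4, 4: 5, 3: 6}))
```

With this input each epilogue is processed immediately after the loop it came
from, which is the ordering the dumps benefit from.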

Thanks,
Ilya

>
>
> Jeff

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Michael Matz

Hi,

On Fri, 17 Jun 2016, Bernd Schmidt wrote:

> On 06/17/2016 04:03 PM, Michael Matz wrote:
> > But does this really improve something?  Essentially you're replacing
> > 
> >0xc9 0xc3 
> > 
> > (the end of a function containing "leave;ret") with
> > 
> >0xe9  
> > 
> > where the four random bytes are different for each rewritten function
> > return (but correlated as they differ exactly by their position
> > difference).
> > 
> > I'm not sure why the latter sequence is better?
> 
> I think I'm missing what you're trying to say. The latter sequence does not
> contain a return opcode hence it ought to be better?

The "0xe9 " essentially is the leave+return opcode, 
after all it jumps to them (let's ignore the possibility that the jump 
target address might contain a 0xc3 byte).  So if the attacker finds some 
interesting gadget in  I don't see how the change from 
leave+ret to jump-to-leave+ret changes anything from a threat avoidance 
perspective.  It's fully possible that I don't understand the threat 
vector of ROP correctly, in which case I'd also like to know :)
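
The byte-level part of this exchange can be checked with a short Python
sketch: it lists the offsets at which a 0xc3 (ret) byte occurs in a code
sequence.  The opcode values 0xc9 (leave), 0xc3 (ret) and 0xe9 (jmp rel32) are
the standard x86 encodings; the displacement bytes are arbitrary placeholders
chosen for the example, and the function name is made up.

```python
RET = 0xC3        # near return
LEAVE = 0xC9      # leave
JMP_REL32 = 0xE9  # jmp with a 32-bit relative displacement


def ret_offsets(code):
    """Offsets of every byte in `code` that equals the ret opcode."""
    return [i for i, byte in enumerate(code) if byte == RET]


# "leave; ret" at the end of a function.
leave_ret = bytes([LEAVE, RET])

# A jump to a shared "leave; ret"; the displacement is a placeholder.
jmp_clean = bytes([JMP_REL32, 0x10, 0x20, 0x30, 0x40])

# The caveat mentioned above: the displacement may itself contain 0xc3.
jmp_with_c3 = bytes([JMP_REL32, 0xC3, 0x20, 0x30, 0x40])

if __name__ == "__main__":
    print("leave; ret       :", ret_offsets(leave_ret))
    print("jmp (clean disp) :", ret_offsets(jmp_clean))
    print("jmp (0xc3 disp)  :", ret_offsets(jmp_with_c3))
```

This only shows where a return opcode byte is present; whether that changes
the threat picture is exactly the question under discussion.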

Ciao,
Michael.

[PATCH] Change PRED_LOOP_EXIT from 92 to 85.

2016-06-17 Thread Martin Liška

Hello.

After we've recently applied various changes (fixes) to predict.c, SPEC2006
shows that PRED_LOOP_EXIT value should be amended.

Survives regression tests & bootstrap on x86_64-linux.
Pre-approved by Honza, installed as r237556.

Thanks,
Martin
From 849c2e064bcadc269f82656d15722f28d1b1fe73 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 17 Jun 2016 14:44:24 +0200
Subject: [PATCH] Change PRED_LOOP_EXIT from 92 to 85.

contrib/ChangeLog:

2016-06-17  Martin Liska  

	* analyze_brprob.py: Fix columns of script output.

gcc/ChangeLog:

2016-06-17  Martin Liska  

	* predict.def: PRED_LOOP_EXIT from 92 to 85.

gcc/testsuite/ChangeLog:

2016-06-17  Martin Liska  

	* gcc.dg/predict-9.c: Fix dump scanning.
---
 contrib/analyze_brprob.py        | 4 ++--
 gcc/predict.def                  | 2 +-
 gcc/testsuite/gcc.dg/predict-9.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/contrib/analyze_brprob.py b/contrib/analyze_brprob.py
index 9808c46..2526623 100755
--- a/contrib/analyze_brprob.py
+++ b/contrib/analyze_brprob.py
@@ -119,10 +119,10 @@ class Profile:
 elif sorting == 'coverage':
 sorter = lambda x: x[1].count
 
-print('%-36s %8s %6s  %-16s %14s %8s %6s' % ('HEURISTICS', 'BRANCHES', '(REL)',
+print('%-40s %8s %6s  %-16s %14s %8s %6s' % ('HEURISTICS', 'BRANCHES', '(REL)',
   'HITRATE', 'COVERAGE', 'COVERAGE', '(REL)'))
 for (k, v) in sorted(self.heuristics.items(), key = sorter):
-print('%-36s %8i %5.1f%% %6.2f%% / %6.2f%% %14i %8s %5.1f%%' %
+print('%-40s %8i %5.1f%% %6.2f%% / %6.2f%% %14i %8s %5.1f%%' %
 (k, v.branches, percentage(v.branches, self.branches_max ()),
  percentage(v.hits, v.count), percentage(v.fits, v.count),
  v.count, v.count_formatted(), percentage(v.count, self.count_max()) ))
diff --git a/gcc/predict.def b/gcc/predict.def
index a0d0ba9..d3bc757 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -89,7 +89,7 @@ DEF_PREDICTOR (PRED_COLD_FUNCTION, "cold function call", PROB_VERY_LIKELY,
 	   PRED_FLAG_FIRST_MATCH)
 
 /* Edge causing loop to terminate is probably not taken.  */
-DEF_PREDICTOR (PRED_LOOP_EXIT, "loop exit", HITRATE (92),
+DEF_PREDICTOR (PRED_LOOP_EXIT, "loop exit", HITRATE (85),
 	   PRED_FLAG_FIRST_MATCH)
 
 /* Edge causing loop to terminate by computing value used by later
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index a613961..196e31c 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -19,5 +19,5 @@ void foo (int base)
   }
 }
 
-/* { dg-final { scan-tree-dump-times "first match heuristics: 2.0%" 3 "profile_estimate"} } */
-/* { dg-final { scan-tree-dump-times "first match heuristics: 4.0%" 1 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "first match heuristics: 3.0%" 3 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "first match heuristics: 7.5%" 1 "profile_estimate"} } */
-- 
2.8.3
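
The analyze_brprob.py hunk above only widens the first column from 36 to 40
characters.  The Python sketch below shows the effect, presumably the
motivation, with a hypothetical 38-character heuristic name: at width 36 the
name overflows its field and the row no longer lines up with the header, at
width 40 it does.  The helper name and the sample values are made up, and only
the first two columns of the real format string are reproduced.

```python
def format_lines(width, name, branches):
    """Format a header and a data row with a left-justified name column."""
    header = ("%-" + str(width) + "s %8s") % ("HEURISTICS", "BRANCHES")
    row = ("%-" + str(width) + "s %8i") % (name, branches)
    return header, row


if __name__ == "__main__":
    name = "x" * 38  # hypothetical heuristic name, 38 characters long
    for width in (36, 40):
        header, row = format_lines(width, name, 1234)
        print("width", width, "aligned:", len(header) == len(row))
        print(header)
        print(row)
```

A name longer than the field width is not truncated by the %-Ns conversion; it
simply pushes the remaining columns to the right.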

Re: [ARM][testsuite] Make arm_neon_fp16 depend on arm_neon_ok

2016-06-17 Thread Kyrill Tkachov


Hi Christophe,

On 17/06/16 11:47, Christophe Lyon wrote:
> Hi,
>
> As discussed some time ago with Kyrylo (on IRC IIRC), the attached
> patch makes sure that arm_neon_fp16_ok and arm_neonv2_ok effective
> targets imply that arm_neon_ok passes, and use the corresponding
> flags.
>
> Without this patch, the 3 effective targets have different, possibly
> inconsistent conditions. For instance, arm_neon_ok makes sure that
> __ARM_ARCH >= 7, but arm_neon_fp16_ok does not.
>
> This led to failures on configurations not supporting neon, but where
> arm_neon_fp16_ok passes as the test is less strict.
> Rather than duplicating the same tests, I preferred to call
> arm_neon_ok from the other places.
>
> We then use the union of flags needed for arm_neon_ok and
> arm_neon_fp16_ok to pass.
>
> Tested on many arm configurations with no harm. It prevents
> arm_neon_fp16 tests from passing when forcing -march=armv5t, that
> seems coherent.
>
> OK?

Ok with a ChangeLog nit below.
Thanks,
Kyrill

> Christophe
>
> 2016-06-17  Christophe Lyon
>
> * lib/target-supports.exp
> (check_effective_target_arm_neon_fp16_ok_nocache): Call
> arm_neon_ok and merge flags. Fix temporary test name.

I believe the rule in ChangeLogs is also to use two spaces after a full stop.
So two spaces before "Fix temporary..."

Re: [Patch AArch64] Fixup to fcvt patterns added in r237200

2016-06-17 Thread Christophe Lyon

On 10 June 2016 at 14:29, James Greenhalgh  wrote:
>
> Hi,
>
> My autotester picked up some issues with the vcvt{ds}_n_* intrinsics
> added in r237200.
>
Hi,

What tests does your autotester perform? I haven't noticed these
problems when running the GCC testsuite on the usual aarch64
targets. I'm interested in increasing coverage, if doable.

Thanks

Christophe

> The iterators in this pattern do not resolve, as they have not been
> explicitly tied to the mode iterator (rather than the code iterator)
> used by the pattern.
>
> This fixup adds the attribute tags, allowing the patterns to work
> correctly.
>
> Additionally, the types assigned to these instructions were wrong, and
> would permit the immediate operand to be in a register. This will then
> develop in to an ICE as the patterns require an immediate operand, and so
> won't match. The ICE can be exposed by writing a wrapping function around
> the vcvtd_n_* intrinsics, which forces the immediate operand to a register.
> We have the infrastructure to error to the user rather than ICEing, but it
> needs some different types, which this patch adds.
>
> I've checked this with an aarch64-none-elf test run, and run it through
> several rounds of my autotester for aarch64-none-elf and
> aarch64_be-none-elf.
>
> OK?
>
> Thanks,
> James
>
> ---
> 2016-06-10  James Greenhalgh  
>
> * config/aarch64/aarch64.md
> (3): Add attributes to
> iterators.
> (3): Likewise.  Correct
> attributes.
> * config/aarch64/aarch64-builtins.c
> (aarch64_types_binop_uss_qualifiers): Delete.
> (TYPES_BINOP_USS): Likewise.
> (aarch64_types_binop_sus_qualifiers): Likewise.
> (TYPES_BINOP_SUS): Likewise.
> (aarch64_types_fcvt_from_unsigned_qualifiers): New.
> (TYPES_FCVTIMM_SUS): Likewise.
> * config/aarch64/aarch64-simd-builtins.def (scvtf): Use SHIFTIMM
> rather than BINOP.
> (ucvtf): Use FCVTIMM_SUS rather than BINOP_SUS.
> (fcvtzs): Use SHIFTIMM rather than BINOP.
> (fcvtzu): Use SHIFTIMM_USS rather than BINOP_USS.
>

Re: move increase_alignment from simple to regular ipa pass

2016-06-17 Thread Prathamesh Kulkarni

On 14 June 2016 at 18:31, Prathamesh Kulkarni
 wrote:
> On 13 June 2016 at 16:13, Jan Hubicka  wrote:
>>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
>>> index ecafe63..41ac408 100644
>>> --- a/gcc/cgraph.h
>>> +++ b/gcc/cgraph.h
>>> @@ -1874,6 +1874,9 @@ public:
>>>   if we did not do any inter-procedural code movement.  */
>>>unsigned used_by_single_function : 1;
>>>
>>> +  /* Set if -fsection-anchors is set.  */
>>> +  unsigned section_anchor : 1;
>>> +
>>>  private:
>>>/* Assemble thunks and aliases associated to varpool node.  */
>>>void assemble_aliases (void);
>>> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
>>> index 4bfcad7..e75d5c0 100644
>>> --- a/gcc/cgraphunit.c
>>> +++ b/gcc/cgraphunit.c
>>> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl)
>>>   it is available to notice_global_symbol.  */
>>>node->definition = true;
>>>notice_global_symbol (decl);
>>> +
>>> +  node->section_anchor = flag_section_anchors;
>>> +
>>>if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl)
>>>/* Traditionally we do not eliminate static variables when not
>>>optimizing and when not doing toplevel reoder.  */
>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>> index f0d7196..e497795 100644
>>> --- a/gcc/common.opt
>>> +++ b/gcc/common.opt
>>> @@ -1590,6 +1590,10 @@ fira-algorithm=
>>>  Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) 
>>> Init(IRA_ALGORITHM_CB) Optimization
>>>  -fira-algorithm=[CB|priority] Set the used IRA algorithm.
>>>
>>> +fipa-increase_alignment
>>> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization
>>> +Option to gate increase_alignment ipa pass.
>>> +
>>>  Enum
>>>  Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA 
>>> algorithm %qs)
>>>
>>> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) 
>>> Init(1) Optimization
>>>  Enable the dependent count heuristic in the scheduler.
>>>
>>>  fsection-anchors
>>> -Common Report Var(flag_section_anchors) Optimization
>>> +Common Report Var(flag_section_anchors)
>>>  Access data in the same section from shared anchor points.
>>>
>>>  fsee
>>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>>> index a0db3a4..1482566 100644
>>> --- a/gcc/config/aarch64/aarch64.c
>>> +++ b/gcc/config/aarch64/aarch64.c
>>> @@ -8252,6 +8252,8 @@ aarch64_override_options (void)
>>>
>>>aarch64_register_fma_steering ();
>>>
>>> +  /* Enable increase_alignment pass.  */
>>> +  flag_ipa_increase_alignment = 1;
>>
>> I would rather enable it always on targets that do support anchors.
> AFAIK aarch64 supports section anchors.
>>> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
>>> index ce9e146..7f09f3a 100644
>>> --- a/gcc/lto/lto-symtab.c
>>> +++ b/gcc/lto/lto-symtab.c
>>> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node 
>>> *entry)
>>>   The type compatibility checks or the completing of types has properly
>>>   dealt with most issues.  */
>>>
>>> +  /* ??? is this assert necessary ?  */
>>> +  varpool_node *v_prevailing = dyn_cast <varpool_node *> (prevailing);
>>> +  varpool_node *v_entry = dyn_cast <varpool_node *> (entry);
>>> +  gcc_assert (v_prevailing && v_entry);
>>> +  /* section_anchor of prevailing_decl wins.  */
>>> +  v_entry->section_anchor = v_prevailing->section_anchor;
>>> +
>> Other flags are merged in lto_varpool_replace_node so please move this there.
> Ah indeed, thanks for the pointers.
> I wonder though if we need to set
> prevailing_node->section_anchor = vnode->section_anchor ?
> IIUC, the function merges flags from vnode into prevailing_node
> and removes vnode. However we want prevailing_node->section_anchor
> to always take precedence.
>>> +/* Return true if alignment should be increased for this vnode.
>>> +   This is done if every function that references/referring to vnode
>>> +   has flag_tree_loop_vectorize set.  */
>>> +
>>> +static bool
>>> +increase_alignment_p (varpool_node *vnode)
>>> +{
>>> +  ipa_ref *ref;
>>> +
>>> +  for (int i = 0; vnode->iterate_reference (i, ref); i++)
>>> +if (cgraph_node *cnode = dyn_cast <cgraph_node *> (ref->referred))
>>> +  {
>>> + struct cl_optimization *opts = opts_for_fn (cnode->decl);
>>> + if (!opts->x_flag_tree_loop_vectorize)
>>> +   return false;
>>> +  }
>>
>> Taking the address of a function that has the vectorizer enabled probably
>> doesn't imply a need to increase the alignment of that var. So please drop
>> the loop.
>>
>> You only want functions that read, write, or take the address of the symbol.
>> But on the other hand, you need to walk all aliases of the symbol with
>> call_for_symbol_and_aliases.
>>> +
>>> +  for (int i = 0; vnode->iterate_referring (i, ref); i++)
>>> +if (cgraph_node *cnode = dyn_cast <cgraph_node *> (ref->referring))
>>> +  {
>>> + struct cl_optimization *opts = opts_for_fn (cnode->decl);
>>> + if (!opts->x_flag_tree_loop_vectorize)
>>> +   return

[Bug c/71567] Incorrect loop optimization

2016-06-17 Thread pinskia at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71567

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andrew Pinski  ---
This:
*(gr->ggint[i]) != '\0' && i < NIN 

is reading one past the array bounds.  Swapping the operands of && fixes the
problem.

The reason dummy influences GCC optimization is that if it is not there, GCC
assumes struct1 has a flexible array at the end of it.
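
A minimal sketch of the fix described in comment #1, reusing the struct from
the report.  The function name get_struct1cnt_fixed is made up for
illustration.  With the bounds check written first, && short-circuits before
gr->ggint[NIN] is ever read:

```c
#define NIN 2

struct struct1 {
    char ggint[NIN][6];
    char dummy;
};

/* Count leading non-empty entries.  i < NIN is tested before the
   dereference, so ggint[NIN] is never accessed.  */
int get_struct1cnt_fixed(const struct struct1 *gr)
{
    int i;
    for (i = 0; i < NIN && *(gr->ggint[i]) != '\0'; ++i)
        ;
    return i;
}
```

With both entries filled in as in the report's main(), this version should
return 2 regardless of optimization level, since no out-of-bounds read is left
for the optimizer to reason about.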

[patch] allow --target=e500v[12]-* in configure

2016-06-17 Thread Jérôme Lambourg

Hello,

An initial patch has been integrated into gnu-config to translate triplets like
e500v2-*- into powerpc-*-spe.

The spe extension to the os is expected for targets such as e500v[12]-*-linux
(translated as powerpc-*-linux-gnuspe) or eabi targets.

This patch integrates the patch of gnu-config (config.sub), and takes care of
the vxworks case, as well as properly set the default value for --with-cpu.

I checked that this works with the following targets:
* e500v2-wrs-vxworks
* e500v2-gnu-linux

Thanks in advance for your feedback,

- Jérôme

Author: Jerome Lambourg 
Date:   Tue Jun 14 10:57:06 2016 +0200

P614-008: support e500v[12] configuration as PPC with cpu 854[08]

toplevel/
* config.sub: merge with gnu-config trunk. Accept e500v[12] cpu names, and
canonicalize to powerpc, and add a "spe" suffix to the os name.
* gcc/config.gcc: determine with_cpu from the non canonical target name, and
make sure the powerpc-wrs-vxworks*spe is properly handled.
* libgcc/config.host: accept vxworks*spe when configuring libgcc.



gcc-e500v12-config.diff
Description: Binary data

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Bernd Schmidt


On 06/17/2016 04:03 PM, Michael Matz wrote:

But does this really improve something?  Essentially you're replacing

   0xc9 0xc3 

(the end of a function containing "leave;ret") with

   0xe9  

where the four random bytes are different for each rewritten function
return (but correlated as they differ exactly by their position
difference).

I'm not sure why the latter sequence is better?


I think I'm missing what you're trying to say. The latter sequence does 
not contain a return opcode hence it ought to be better?



Bernd

Re: [PATCH, vec-tails 03/10] Support epilogues vectorization with no masking

2016-06-17 Thread Ilya Enkovich

2016-06-16 8:22 GMT+03:00 Jeff Law :
> On 06/15/2016 05:03 AM, Richard Biener wrote:
>>
>> On Thu, May 19, 2016 at 9:39 PM, Ilya Enkovich
>>  wrote:
>>>
>>> Hi,
>>>
>>> This patch introduces changes required to run vectorizer on loop
>>> epilogue. This also enables epilogue vectorization using a vector
>>> of smaller size.
>>
>>
>> While the idea of epilogue vectorization sounds straight-forward the
>> implementation is somewhat icky with all the ->aux stuff, "redundant"
>> if-conversion and loop iteration stuff.
>>
>> So I was thinking of when epilogue vectorization is beneficial which
>> is obviously when the overall loop trip count is low.  We are not
>> good in optimizing for that case generally (too much peeling for
>> alignment, using expensive avx256 vectorization, etc.), so I wonder
>> if versioning for that case would be a better idea
>> (performance-wise).
>>
>> Thus - what cases were you looking at when deciding that vectorizing
>> the epilogue (with a smaller vector size) is profitable?  Do other
>> compilers generally do this?
>
> I would think it's better stated that the relative benefits of vectorizing
> the epilogue are greater the shorter the loop, but that's nit-picking the
> discussion.
>
> I do think you've got a legitimate question though.   Ilya, can you give any
> insights here based on your KNL and Haswell testing or data/insights from
> the LLVM and/or ICC teams?

I have no information about LLVM.  As I said in the other thread, ICC uses all
options (masked epilogue, combined loop, vectorized epilogue with smaller
vector size).  It may also generate different versions (e.g. combined and
with masked epilogue) and choose dynamically depending on the iteration count.

Thanks,
Ilya

>
> Jeff
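
For readers unfamiliar with the "vectorized epilogue with smaller vector size"
option mentioned above, here is a rough sketch of the loop shape in plain C.
It is not taken from the patch: the function name and the widths 8 and 4 are
assumptions, and the fixed-count inner loops merely stand in for vector
operations.

```c
#include <stddef.h>

/* Sum n ints using a wide main loop, a narrower epilogue loop,
   and a scalar tail.  */
long sum_with_epilogue(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    /* Main loop: blocks of 8 elements (the "full" vector size).  */
    for (; i + 8 <= n; i += 8)
        for (size_t k = 0; k < 8; k++)
            total += a[i + k];

    /* Epilogue: at most one block of 4 (the "smaller" vector size).  */
    for (; i + 4 <= n; i += 4)
        for (size_t k = 0; k < 4; k++)
            total += a[i + k];

    /* Scalar tail: at most 3 elements remain.  */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

For a short trip count such as n = 7, the main loop never runs and most of the
work lands in the epilogue, which is the low-trip-count case raised in the
discussion.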

Re: [openacc] clean up acc directive matching in fortran

2016-06-17 Thread Jakub Jelinek

On Fri, Jun 17, 2016 at 10:40:40AM +0200, Tobias Burnus wrote:
> Cesar Philippidis wrote:
> > On 06/16/2016 08:30 PM, Cesar Philippidis wrote:
> > > This patch introduces a match_acc function to the fortran FE. It's
> > > almost identical to match_omp, but it passes openacc = true to
> > > > gfc_match_omp_clauses. I suppose I could have consolidated those two
> > > functions, but they are reasonably simple so I left them separate. Maybe
> > > a follow up patch can consolidate them. I was able to eliminate a lot of
> > > duplicate code with this function.
> > > 
> > > Is this ok for trunk and gcc-6?
> 
> > And here's the patch.
> 
> The patch seems to be reversed. If I regard the "-" lines as additions
> and the "+" lines as deletions, it makes sense and is in line with
> the ChangeLog and what you wrote above.
> 
> Otherwise, it looks good to me.

Yeah, patch -R + commit is ok with me.

Jakub

[Bug c++/71508] Huge memory usage on compiling with many types

2016-06-17 Thread olarupaulstelian97+bugzilla at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71508

--- Comment #2 from Paul Olaru ---
Created attachment 38716
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38716&action=edit
Preprocessed source (gzipped)

The uncompressed file is barely above 1MB. Is it normal for it to be this
large?

[committed] Further OpenMP C++ mapping of struct elements with reference to struct as base fixes

2016-06-17 Thread Jakub Jelinek

On Thu, Jun 16, 2016 at 09:05:40PM +0200, Jakub Jelinek wrote:
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, and
> tested with x86_64-intelmicemul-linux offloading on x86_64-linux, committed
> to trunk.

And the following testcase shows similar issues in the array section
handling path.
Tested on x86_64-linux and i686-linux, plus with intelmicemul offloading,
committed to trunk.

2016-06-17  Jakub Jelinek  

* semantics.c (handle_omp_array_sections_1): Don't ICE when
processing_template_decl when checking for bitfields and unions.
Look through REFERENCE_REF_P as base of COMPONENT_REF.
(finish_omp_clauses): Look through REFERENCE_REF_P even for
array sections with COMPONENT_REF bases.

* testsuite/libgomp.c++/target-21.C: New test.

--- gcc/cp/semantics.c.jj   2016-06-16 17:29:53.0 +0200
+++ gcc/cp/semantics.c  2016-06-17 14:34:21.929203440 +0200
@@ -4487,7 +4487,8 @@ handle_omp_array_sections_1 (tree c, tre
  || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM)
  && !type_dependent_expression_p (t))
{
- if (DECL_BIT_FIELD (TREE_OPERAND (t, 1)))
+ if (TREE_CODE (TREE_OPERAND (t, 1)) == FIELD_DECL
+ && DECL_BIT_FIELD (TREE_OPERAND (t, 1)))
{
  error_at (OMP_CLAUSE_LOCATION (c),
"bit-field %qE in %qs clause",
@@ -4496,7 +4497,8 @@ handle_omp_array_sections_1 (tree c, tre
}
  while (TREE_CODE (t) == COMPONENT_REF)
{
- if (TREE_CODE (TREE_TYPE (TREE_OPERAND (t, 0))) == UNION_TYPE)
+ if (TREE_TYPE (TREE_OPERAND (t, 0))
+ && TREE_CODE (TREE_TYPE (TREE_OPERAND (t, 0))) == UNION_TYPE)
{
  error_at (OMP_CLAUSE_LOCATION (c),
"%qE is a member of a union", t);
@@ -4504,6 +4506,8 @@ handle_omp_array_sections_1 (tree c, tre
}
  t = TREE_OPERAND (t, 0);
}
+ if (REFERENCE_REF_P (t))
+   t = TREE_OPERAND (t, 0);
}
   if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
{
@@ -6623,6 +6627,8 @@ finish_omp_clauses (tree clauses, enum c
{
  while (TREE_CODE (t) == COMPONENT_REF)
t = TREE_OPERAND (t, 0);
+ if (REFERENCE_REF_P (t))
+   t = TREE_OPERAND (t, 0);
  if (bitmap_bit_p (&map_field_head, DECL_UID (t)))
break;
  if (bitmap_bit_p (&map_head, DECL_UID (t)))
--- libgomp/testsuite/libgomp.c++/target-21.C.jj2016-06-17 
13:18:59.684314656 +0200
+++ libgomp/testsuite/libgomp.c++/target-21.C   2016-06-17 15:10:21.860516966 
+0200
@@ -0,0 +1,173 @@
+extern "C" void abort ();
+struct T { char t[270]; };
+struct S { int (&x)[10]; int *&y; T t; int &z; S (); ~S (); };
+
+template 
+void
+foo (S s)
+{
+  int err;
+  #pragma omp target map (s.x[0:N], s.y[0:N]) map (s.t.t[16:3]) map (from: err)
+  {
+err = s.x[2] != 28 || s.y[2] != 37 || s.t.t[17] != 81;
+s.x[2]++;
+s.y[2]++;
+s.t.t[17]++;
+  }
+  if (err || s.x[2] != 29 || s.y[2] != 38 || s.t.t[17] != 82)
+abort ();
+}
+
+template 
+void
+bar (S s)
+{
+  int err;
+  #pragma omp target map (s.x, s.z)map(from:err)
+  {
+err = s.x[2] != 29 || s.z != 6;
+s.x[2]++;
+s.z++;
+  }
+  if (err || s.x[2] != 30 || s.z != 7)
+abort ();
+}
+
+template 
+void
+foo2 (S &s)
+{
+  int err;
+  #pragma omp target map (s.x[N:10], s.y[N:10]) map (from: err) map 
(s.t.t[N+16:N+3])
+  {
+err = s.x[2] != 30 || s.y[2] != 38 || s.t.t[17] != 81;
+s.x[2]++;
+s.y[2]++;
+s.t.t[17]++;
+  }
+  if (err || s.x[2] != 31 || s.y[2] != 39 || s.t.t[17] != 82)
+abort ();
+}
+
+template 
+void
+bar2 (S &s)
+{
+  int err;
+  #pragma omp target map (s.x, s.z)map(from:err)
+  {
+err = s.x[2] != 31 || s.z != 7;
+s.x[2]++;
+s.z++;
+  }
+  if (err || s.x[2] != 32 || s.z != 8)
+abort ();
+}
+
+template 
+void
+foo3 (U s)
+{
+  int err;
+  #pragma omp target map (s.x[0:10], s.y[0:10]) map (from: err) map 
(s.t.t[16:3])
+  {
+err = s.x[2] != 32 || s.y[2] != 39 || s.t.t[17] != 82;
+s.x[2]++;
+s.y[2]++;
+s.t.t[17]++;
+  }
+  if (err || s.x[2] != 33 || s.y[2] != 40 || s.t.t[17] != 83)
+abort ();
+}
+
+template 
+void
+bar3 (U s)
+{
+  int err;
+  #pragma omp target map (s.x, s.z)map(from:err)
+  {
+err = s.x[2] != 33 || s.z != 8;
+s.x[2]++;
+s.z++;
+  }
+  if (err || s.x[2] != 34 || s.z != 9)
+abort ();
+}
+
+template 
+void
+foo4 (U &s)
+{
+  int err;
+  #pragma omp target map (s.x[0:10], s.y[0:10]) map (from: err) map 
(s.t.t[16:3])
+  {
+err = s.x[2] != 34 || s.y[2] != 40 || s.t.t[17] != 82;
+s.x[2]++;
+s.y[2]++;
+s.t.t[17]++;
+  }
+  if (err || s.x[2] != 35 || s.y[2] != 41 || s.t.t[17] != 83)
+abort ();
+}
+
+template 
+void
+bar4 (U &s)
+{
+  int err;
+  #pragma omp

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Michael Matz

Hi,

On Fri, 17 Jun 2016, Bernd Schmidt wrote:

> This is another step to flesh out -mmitigate-rop for i386 a little more. 
> The basic idea was (I think) Richard Henderson's: if we could arrange to 
> have every return preceded by a leave instruction, it would make it 
> harder to construct an attack since it takes away a certain amount of 
> control over the stack pointer. I extended this to move the leave/ret 
> pair to libgcc, preceded by a sequence of nops, so as to take away the 
> possibility of jumping into the middle of an instruction preceding the 
> leave/ret pair and thereby skipping the leave.

But does this really improve something?  Essentially you're replacing

   0xc9 0xc3 

(the end of a function containing "leave;ret") with

   0xe9  

where the four random bytes are different for each rewritten function 
return (but correlated as they differ exactly by their position 
difference).

I'm not sure why the latter sequence is better?

Ciao,
Michael.
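
To make the byte sequences above concrete, here is a small sketch (not from
the patch) of how the four bytes after 0xe9 come about.  The helper names and
the addresses in the usage note are made up for illustration.  A "jmp rel32"
stores the little-endian distance from the end of the jump to its target, so
two jumps to the same target differ exactly by their position difference:

```c
#include <stdint.h>
#include <stddef.h>

/* Encode "jmp rel32" located at address `from` and targeting `to`.
   The displacement is relative to the end of the 5-byte instruction.  */
void encode_jmp_rel32(uint32_t from, uint32_t to, uint8_t out[5])
{
    uint32_t rel = to - (from + 5u);
    out[0] = 0xe9;
    out[1] = (uint8_t)(rel & 0xff);
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
}

/* Count bytes equal to the one-byte "ret" opcode (0xc3).  */
size_t count_ret_bytes(const uint8_t *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0xc3)
            n++;
    return n;
}
```

A jump at 0x1000 to a target at 0x2000 encodes as e9 fb 0f 00 00, with no 0xc3
byte.  A jump at 0x1000 to 0x10c8 encodes as e9 c3 00 00 00, so the
displacement itself can happen to contain the ret opcode; whether that matters
in practice is not something this sketch tries to answer.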

[Bug tree-optimization/71354] [7 Regression] gcc.dg/vect/vect-23.c FAILs

2016-06-17 Thread amker at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71354

--- Comment #3 from amker at gcc dot gnu.org ---
Author: amker
Date: Fri Jun 17 13:55:06 2016
New Revision: 237555

URL: https://gcc.gnu.org/viewcvs?rev=237555&root=gcc&view=rev
Log:

PR tree-optimization/71354
* gcc.dg/vect/vect-23.c: Use vect_condition instead of vect_cond.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/vect/vect-23.c

[Bug c/71567] New: Incorrect loop optimization

2016-06-17 Thread tyoma.ariv at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71567

Bug ID: 71567
   Summary: Incorrect loop optimization
   Product: gcc
   Version: 4.8.5
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tyoma.ariv at gmail dot com
  Target Milestone: ---

#define NIN 2


struct struct1 {
char ggint[NIN][6];
char dummy;
};
/*--*/
int get_struct1cnt(struct struct1 *gr)
{
int i=0 ;
for ( i=0 ; *(gr->ggint[i]) != '\0' && i < NIN ; ++i )
;
return i ;
}
/*--*/
int main (int argc, char **argv)
{
struct struct1 grp = {0};

for ( int i=0 ; i < NIN ; i++ )
grp.ggint[i][0] = 'A'+i;
printf("%d\n", get_struct1cnt (&grp));

return 0;
}

When compiled with gcc -O -std=c99 one.c, the program displays 1.
When compiled with gcc -std=c99 one.c, the program displays 2 as expected.

If the optimization level is -Og or none, the program displays 2; if the
optimization level is -O or higher, the program displays 1.

Note: removing the dummy field from struct struct1 makes this bug not
reproducible.

[Bug fortran/71544] gfortran compiler optimization bug when dealing with c-style pointers

2016-06-17 Thread fortranbug at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71544

--- Comment #6 from fortranbug at gmail dot com ---
Thank you for the suggested workaround. This can certainly be helpful in the
short term.  However, we would not want to rely on tuning compiler optimization
switches in the future to ensure correct code operation.  (Our system is used
by a community of users and it is not possible for us to ensure that they all
use specific optimizations switches.)

Can someone clarify the status of this bug report?  Is this a recognized issue
that will be addressed at some point?

[PATCH] Fix memory leak in tree-ssa-reassoc.c

2016-06-17 Thread Martin Liška

Hi.

The following simple patch fixes a newly introduced memory leak.

Patch survives regression tests and bootstraps on x86_64-linux.

Ready for trunk?
Thanks,
Martin
>From a2e6be16d7079b744db4d383b8317226ab53ff58 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 17 Jun 2016 12:26:58 +0200
Subject: [PATCH] Fix memory leak in tree-ssa-reassoc.c

gcc/ChangeLog:

2016-06-17  Martin Liska  

	* tree-ssa-reassoc.c (transform_add_to_multiply): Use auto_vec.
---
 gcc/tree-ssa-reassoc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index e32d503..cdfe06f 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1807,7 +1807,7 @@ transform_add_to_multiply (vec *ops)
   tree op = NULL_TREE;
   int j;
   int i, start = -1, end = 0, count = 0;
-  vec<std::pair <int, int> > indxs = vNULL;
+  auto_vec<std::pair <int, int> > indxs;
   bool changed = false;
 
   if (!INTEGRAL_TYPE_P (TREE_TYPE ((*ops)[0]->op))
-- 
2.8.3

Re: [PATCH] Add port for Phoenix-RTOS on ARM platform.

2016-06-17 Thread Jakub Sejdak

> So at least in the immediate term let's get you write privileges so you can
> commit approved changes and on the path towards maintaining the Phoenix-RTOS
> configurations.

Do I have to apply for this permission somewhere? Provided page states
only, that it has to be granted by an existing maintainer.

2016-06-16 18:28 GMT+02:00 Jeff Law :
> On 06/16/2016 02:59 AM, Jakub Sejdak wrote:
>>
>> Actually, if possible, I would skip the "arm" part, because we plan to
>> port Phoenix-RTOS for other platforms. It will be easier to do it
>> once.
>
> Generally we prefer to see an ongoing commitment to the GCC project along
> with regular high quality contributions to appoint maintainers.
>
> So at least in the immediate term let's get you write privileges so you can
> commit approved changes and on the path towards maintaining the Phoenix-RTOS
> configurations.
>
> https://www.gnu.org/software/gcc/svnwrite.html
>
> jeff
>



-- 
Jakub Sejdak
Software Engineer
Phoenix Systems (www.phoesys.com)
+48 608 050 163

Re: Fix loop size estimate in tree-ssa-loop-ivcanon

2016-06-17 Thread Christophe Lyon

On 16 June 2016 at 14:56, Jan Hubicka  wrote:
> Hi,
> tree_estimate_loop_size contains one extra else that prevents it from
> determining that the induction variable comparison is going to be
> eliminated in both the peeled copies as well as the last copy.  This patch
> fixes it (it really removes one else, but needs to reformat the conditional).
>
> Bootstrapped/regtested x86_64-linux, committed.
>
> Honza
>
> * g++.dg/vect/pr36648.cc: Disable cunrolli.
> * tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Fix estimation
> of comparisons in the last iteration.

Hi,

This patch causes a new failure:
FAIL: gcc.target/arm/unsigned-extend-2.c scan-assembler ands
on arm-none-linux-gnueabi --with-cpu=cortex-a9

Christophe




> Index: testsuite/g++.dg/vect/pr36648.cc
> ===
> --- testsuite/g++.dg/vect/pr36648.cc(revision 237477)
> +++ testsuite/g++.dg/vect/pr36648.cc(working copy)
> @@ -1,4 +1,5 @@
>  /* { dg-require-effective-target vect_float } */
> +// { dg-additional-options "-fdisable-tree-cunrolli" }
>
>  struct vector
>  {
> Index: tree-ssa-loop-ivcanon.c
> ===
> --- tree-ssa-loop-ivcanon.c (revision 237477)
> +++ tree-ssa-loop-ivcanon.c (working copy)
> @@ -255,69 +255,73 @@ tree_estimate_loop_size (struct loop *lo
>
>   /* Look for reasons why we might optimize this stmt away. */
>
> - if (gimple_has_side_effects (stmt))
> -   ;
> - /* Exit conditional.  */
> - else if (exit && body[i] == exit->src
> -  && stmt == last_stmt (exit->src))
> + if (!gimple_has_side_effects (stmt))
> {
> - if (dump_file && (dump_flags & TDF_DETAILS))
> -   fprintf (dump_file, "   Exit condition will be eliminated "
> -"in peeled copies.\n");
> - likely_eliminated_peeled = true;
> -   }
> - else if (edge_to_cancel && body[i] == edge_to_cancel->src
> -  && stmt == last_stmt (edge_to_cancel->src))
> -   {
> - if (dump_file && (dump_flags & TDF_DETAILS))
> -   fprintf (dump_file, "   Exit condition will be eliminated "
> -"in last copy.\n");
> - likely_eliminated_last = true;
> -   }
> - /* Sets of IV variables  */
> - else if (gimple_code (stmt) == GIMPLE_ASSIGN
> - && constant_after_peeling (gimple_assign_lhs (stmt), stmt, 
> loop))
> -   {
> - if (dump_file && (dump_flags & TDF_DETAILS))
> -   fprintf (dump_file, "   Induction variable computation will"
> -" be folded away.\n");
> - likely_eliminated = true;
> -   }
> - /* Assignments of IV variables.  */
> - else if (gimple_code (stmt) == GIMPLE_ASSIGN
> -  && TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
> -  && constant_after_peeling (gimple_assign_rhs1 (stmt), stmt,
> - loop)
> -  && (gimple_assign_rhs_class (stmt) != GIMPLE_BINARY_RHS
> -  || constant_after_peeling (gimple_assign_rhs2 (stmt),
> - stmt, loop)))
> -   {
> - size->constant_iv = true;
> - if (dump_file && (dump_flags & TDF_DETAILS))
> -   fprintf (dump_file,
> -"   Constant expression will be folded away.\n");
> - likely_eliminated = true;
> -   }
> - /* Conditionals.  */
> - else if ((gimple_code (stmt) == GIMPLE_COND
> -   && constant_after_peeling (gimple_cond_lhs (stmt), stmt,
> -  loop)
> -   && constant_after_peeling (gimple_cond_rhs (stmt), stmt,
> -  loop)
> -   /* We don't simplify all constant compares so make sure
> -  they are not both constant already.  See PR70288.  */
> -   && (! is_gimple_min_invariant (gimple_cond_lhs (stmt))
> -   || ! is_gimple_min_invariant (gimple_cond_rhs 
> (stmt
> -  || (gimple_code (stmt) == GIMPLE_SWITCH
> -  && constant_after_peeling (gimple_switch_index (
> -   as_a  (stmt)),
> + /* Exit conditional.  */
> + if (exit && body[i] == exit->src
> + && stmt == last_stmt (exit->src))
> +   {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> +   fprintf (dump_file, "   Exit condition will be eliminated 
> "
> +"in peeled copies.\n");
> + likely_eliminated_peeled = true;
> +

Re: [Patch, avr] Fix PR 71151

2016-06-17 Thread Georg-Johann Lay


Senthil Kumar Selvaraj schrieb:

Hi,

  This patch fixes PR 71151 by eliminating the
  TARGET_ASM_FUNCTION_RODATA_SECTION hook and setting
  JUMP_TABLES_IN_TEXT_SECTION to 1.

  As described in the bugzilla entry, this hook assumed it will get
  called only for jumptable rodata for functions. This was true until
  6.1, when a commit in varasm.c started calling the hook for mergeable
  string/constant data as well.

  This resulted in string constants ending up in a section intended for
  jumptables (flash), and broke code using those constants, which
  expects them to be present in rodata (SRAM).

  Given that the original reason for placing jumptables in a section was
  fixed by Johann in PR 63323, this patch restores the original
  behavior. Reg testing on both gcc-6-branch and trunk showed no regressions.

  As pointed out by Johann, this may end up increasing code
  size if there are lots of branches that cross the jump tables. I
  intend to propose a separate patch that gives additional information
  to the target hook (SECCAT_RODATA_{STRING,JUMPTABLE}) so it can know
  what type of function rodata is coming on. Johann also suggested
  handling jump table generation ourselves - I'll experiment with that
  some more.

  If ok, could someone commit please? Could you also backport to
  gcc-6-branch?

Regards
Senthil

gcc/ChangeLog

2016-06-03  Senthil Kumar Selvaraj  



Missing PR target/71151


* config/avr/avr.c (avr_asm_function_rodata_section): Remove.
* config/avr/avr.c (TARGET_ASM_FUNCTION_RODATA_SECTION): Remove.

gcc/testsuite/ChangeLog

2016-06-03  Senthil Kumar Selvaraj  



Missing PR target/71151


* gcc/testsuite/gcc.target/avr/pr71151-1.c: New.
* gcc/testsuite/gcc.target/avr/pr71151-2.c: New.



With the PR entry in the ChangeLog / commit message it might be easier 
to find the change, and the respective bugzilla PR will get an automatic 
entry pointing to the commit.


Thanks,  Johann

[Bug c++/59861] Inconsistent error output format

2016-06-17 Thread milasudril at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59861

milasudril at gmail dot com changed:

   What|Removed |Added

   Severity|minor   |enhancement

[Bug c++/71541] destructor of condition_variable_any crashes with static linkage

2016-06-17 Thread gcc_bugzilla at haphi dot de

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71541

--- Comment #6 from Hans Philipp Annen  ---
>  -Wl,--whole-archive -lpthread -Wl,--no-whole-archive
This did help.
Without it, the destructor of condition_variable just jumps to nowhere:

> Dump of assembler code for function _ZNSt18condition_variableD2Ev:
> => 0x00402ef0 <+0>: jmpq   0x0
> End of assembler dump.

With -Wl,--whole-archive -lpthread: 
> Dump of assembler code for function _ZNSt18condition_variableD2Ev:
> => 0x0040f240 <+0>: jmpq   0x407df0 
> End of assembler dump.

[C++ Patch] One more error + error to error + inform

2016-06-17 Thread Paolo Carlini


Hi,

one more I missed. Tested x86_64-linux. Should be obvious too...

Thanks,
Paolo.

PS: I still have pending: 
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01116.html


//
/cp
2016-06-17  Paolo Carlini  

* decl.c (grokfndecl): Change pair of errors to error + inform.

/testsuite
2016-06-17  Paolo Carlini  

* g++.dg/cpp0x/defaulted31.C: Adjust for dg-message vs dg-error.
Index: cp/decl.c
===
--- cp/decl.c   (revision 237547)
+++ cp/decl.c   (working copy)
@@ -8295,7 +8295,8 @@ grokfndecl (tree ctype,
  else if (DECL_DEFAULTED_FN (old_decl))
{
  error ("definition of explicitly-defaulted %q+D", decl);
- error ("%q+#D explicitly defaulted here", old_decl);
+ inform (DECL_SOURCE_LOCATION (old_decl),
+ "%q#D explicitly defaulted here", old_decl);
  return NULL_TREE;
}
 
Index: testsuite/g++.dg/cpp0x/defaulted31.C
===
--- testsuite/g++.dg/cpp0x/defaulted31.C(revision 237547)
+++ testsuite/g++.dg/cpp0x/defaulted31.C(working copy)
@@ -4,7 +4,7 @@
 struct A
 {
   A() { }  // { dg-message "defined" }
-  ~A() = default;  // { dg-error "defaulted" }
+  ~A() = default;  // { dg-message "defaulted" }
 };
 
 A::A() = default;  // { dg-error "redefinition" }

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Bernd Schmidt


On 06/17/2016 12:37 PM, Jakub Jelinek wrote:


Do you really need to require frame pointer for this?
I mean, couldn't you instead use what you do if a function needs frame
pointer and otherwise just replace the original ret with
pushq   %rbp
movq%rsp, %rbp
jmp __rop_ret
?  Or would that defeat the purpose of the mitigation?


Yes, kind of, because then you can jump into code before this little 
sequence and the whole pushq/movq/jmp/leave/ret would just behave like a 
normal ret. This is admittedly a concern for smaller functions that look 
a lot like this; maybe we need to pad function entry points as well.



As for __rop_ret, if you are non-PLT jmp to it, I bet it must be in the same
executable or shared library as the code branching to it, so should be
.hidden.  Is libgcc.a really the best place for it though?


I declare myself agnostic.


Looking at nop; nop; 1: jmp 1b; leave; ret
if you branch into the middle of the jmp insn (0x3 below), there is:
   0:   90  nop
   1:   90  nop
   2:   eb fe   jmp0x2
   4:   c9  leaveq
   5:   c3  retq
and thus:
   3:   fe c9   dec%cl
   5:   c3  retq
and thus if you don't mind decreasing %cl, you still have retq without leave
before it.  But I very likely just don't understand the ROP threat stuff
enough.


You'd also have to find useful code before this sequence, and in any 
case it's just a single ret where we used to have many. But maybe 
there's a one-byte trap that could be used instead.



Bernd

[ARM][testsuite] Make arm_neon_fp16 depend on arm_neon_ok

2016-06-17 Thread Christophe Lyon

Hi,

As discussed some time ago with Kyrylo (on IRC IIRC), the attached
patch makes sure that arm_neon_fp16_ok and arm_neonv2_ok effective
targets imply that arm_neon_ok passes, and use the corresponding
flags.

Without this patch, the 3 effective targets have different, possibly
inconsistent conditions. For instance, arm_neon_ok makes sure that
__ARM_ARCH >= 7, but arm_neon_fp16_ok does not.

This led to failures on configurations not supporting neon, but where
arm_neon_fp16_ok passes as the test is less strict.
Rather than duplicating the same tests, I preferred to call
arm_neon_ok from the other places.

We then use the union of flags needed for arm_neon_ok and
arm_neon_fp16_ok to pass.
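
The gating and flag merging described above can be modelled outside of Tcl. The following is only a sketch: `find_flags`, `neon_fp16_ok` and the `compiles` callback are hypothetical names standing in for the effective-target procs and for check_no_compiler_messages_nocache, and `None` is used here to mean "arm_neon_ok failed". The candidate list is the one from the patch.

```python
# Candidate flag sets tried by arm_neon_fp16_ok, as listed in the patch.
FP16_CANDIDATES = [
    "",
    "-mfloat-abi=softfp",
    "-mfpu=neon-fp16",
    "-mfpu=neon-fp16 -mfloat-abi=softfp",
    "-mfp16-format=ieee",
    "-mfloat-abi=softfp -mfp16-format=ieee",
    "-mfpu=neon-fp16 -mfp16-format=ieee",
    "-mfpu=neon-fp16 -mfloat-abi=softfp -mfp16-format=ieee",
]


def find_flags(base_flags, candidates, compiles):
    """Return the first 'base + candidate' flag string accepted by compiles."""
    for extra in candidates:
        merged = " ".join(part for part in (base_flags, extra) if part)
        if compiles(merged):
            return merged
    return None


def neon_fp16_ok(neon_flags, compiles):
    """Model of the patched check: fail early unless the neon check passed."""
    if neon_flags is None:  # hypothetical marker for "arm_neon_ok failed"
        return None
    return find_flags(neon_flags, FP16_CANDIDATES, compiles)
```

With a stand-in compiler that only accepts command lines containing -mfpu=neon-fp16, a base of "-mfloat-abi=softfp" yields "-mfloat-abi=softfp -mfpu=neon-fp16", while a failed neon check yields no flags at all, which is the behaviour the patch is after.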

Tested on many arm configurations with no harm. It prevents
arm_neon_fp16 tests from passing when forcing -march=armv5t, which
seems consistent.

OK?

Christophe
gcc/testsuite/ChangeLog:

2016-06-17  Christophe Lyon  

* lib/target-supports.exp
(check_effective_target_arm_neon_fp16_ok_nocache): Call
arm_neon_ok and merge flags. Fix temporary test name.
(check_effective_target_arm_neonv2_ok_nocache): Call arm_neon_ok
and merge flags.
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index f4cb276..bbb5343 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2990,23 +2990,25 @@ proc check_effective_target_arm_crc_ok { } {
 
 proc check_effective_target_arm_neon_fp16_ok_nocache { } {
 global et_arm_neon_fp16_flags
+global et_arm_neon_flags
 set et_arm_neon_fp16_flags ""
-if { [check_effective_target_arm32] } {
+if { [check_effective_target_arm32]
+&& [check_effective_target_arm_neon_ok] } {
foreach flags {"" "-mfloat-abi=softfp" "-mfpu=neon-fp16"
   "-mfpu=neon-fp16 -mfloat-abi=softfp"
   "-mfp16-format=ieee"
   "-mfloat-abi=softfp -mfp16-format=ieee"
   "-mfpu=neon-fp16 -mfp16-format=ieee"
   "-mfpu=neon-fp16 -mfloat-abi=softfp -mfp16-format=ieee"} 
{
-   if { [check_no_compiler_messages_nocache arm_neon_fp_16_ok object {
+   if { [check_no_compiler_messages_nocache arm_neon_fp16_ok object {
#include "arm_neon.h"
float16x4_t
foo (float32x4_t arg)
{
   return vcvt_f16_f32 (arg);
}
-   } "$flags"] } {
-   set et_arm_neon_fp16_flags $flags
+   } "$et_arm_neon_flags $flags"] } {
+   set et_arm_neon_fp16_flags [concat $et_arm_neon_flags $flags]
return 1
}
}
@@ -3085,8 +3087,10 @@ proc check_effective_target_arm_v8_neon_ok { } {
 
 proc check_effective_target_arm_neonv2_ok_nocache { } {
 global et_arm_neonv2_flags
+global et_arm_neon_flags
 set et_arm_neonv2_flags ""
-if { [check_effective_target_arm32] } {
+if { [check_effective_target_arm32]
+&& [check_effective_target_arm_neon_ok] } {
foreach flags {"" "-mfloat-abi=softfp" "-mfpu=neon-vfpv4" 
"-mfpu=neon-vfpv4 -mfloat-abi=softfp"} {
if { [check_no_compiler_messages_nocache arm_neonv2_ok object {
#include "arm_neon.h"
@@ -3095,8 +3099,8 @@ proc check_effective_target_arm_neonv2_ok_nocache { } {
 {
   return vfma_f32 (a, b, c);
 }
-   } "$flags"] } {
-   set et_arm_neonv2_flags $flags
+   } "$et_arm_neon_flags $flags"] } {
+   set et_arm_neonv2_flags [concat $et_arm_neon_flags $flags]
return 1
}
}

Re: [PATCH, vec-tails 01/10] New compiler options

2016-06-17 Thread Ilya Enkovich

2016-06-16 8:06 GMT+03:00 Jeff Law :
> On 05/20/2016 05:40 AM, Ilya Enkovich wrote:
>>
>> 2016-05-20 14:17 GMT+03:00 Richard Biener :
>>>
>>> On Fri, May 20, 2016 at 11:50 AM, Ilya Enkovich 
>>> wrote:

>>>> 2016-05-20 12:26 GMT+03:00 Richard Biener :
>>>>>
>>>>> On Thu, May 19, 2016 at 9:36 PM, Ilya Enkovich
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This patch introduces new options used for loop epilogues
>>>>>> vectorization.
>>>>>
>>>>>
>>>>> Why's that?  This is a bit too much for the casual user and if it is
>>>>> really necessary to control this via options then it is not
>>>>> fine-grained enough.
>>>>>
>>>>> Why doesn't the vectorizer/backend have enough info to decide this
>>>>> itself?
>>>>
>>>> I don't expect casual user to decide which modes to choose.  These
>>>> controls are added for debugging and performance measurement
>>>> purposes.  I see now I miss -ftree-vectorize-epilogues aliased to
>>>> -ftree-vectorize-epilogues=all.  Surely I expect epilogues and short
>>>> loops vectorization be enabled by default on -O3 or by
>>>> -ftree-vectorize-loops.
>>>
>>>
>>> Can you make all these --params then?  I think to be useful to users we'd
>>> want
>>> them to be loop pragmas rather than options.
>>
>>
>> OK, I'll change it to params.  I didn't think about control via
>> pragmas but will do now.
>
> So the questions I'd like to see answered:
>
> 1. You've got 3 modes for epilogue vectorization.  Is this an artifact of
> not really having good heuristics yet for which mode to apply to a
> particular loop at this time?
>
> 2. Similarly for cost models.

All three modes are profitable in different situations.  Which mode is
profitable depends on the loop structure and target capabilities.  The ultimate
goal is to have all three modes enabled by default.  I can't claim the current
heuristics are good enough for all cases and targets, and therefore don't
enable epilogue vectorization by default for now.  This is to be measured,
analyzed and tuned in time for GCC 7.1.

I added the cost model simply to be able to force epilogue vectorization for
stability testing (force some mode of epilogue vectorization and check nothing
fails) and performance testing/tuning (try to find cases where we may benefit
from epilogue vectorization but don't due to a bad cost model).  Also, I don't
want to force epilogue vectorization for all loops for which vectorization is
forced using the unlimited cost model, because that may hurt performance for
simd loops.

>
> In the cover message you indicated you were getting expected gains of KNL,
> but not on Haswell.  Do you have any sense yet why you're not getting good
> results on Haswell yet?  For KNL are you getting those speedups with a
> generic set of options or are those with a custom set of options to set the
> mode & cost models?

Currently I have numbers collected on various suites for a KNL machine.  Masking
mode (-ftree-vectorize-epilogues=mask) shows fairly good results (dynamic cost
model, -Ofast -flto -funroll-loops): I don't see significant losses and there
are a few significant gains.  For combine and nomask modes the results are not
good enough yet; there are several significant performance losses.  My guess is
that the current threshold for combine is way too high, and that for the nomask
variant we should choose the smallest vector size for epilogues instead of the
next available one (use zmm for the body and xmm for the epilogue instead of
zmm for the body and ymm for the epilogue).
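
To make the vector-size trade-off for the nomask variant concrete, here is a small arithmetic sketch. It assumes 32-bit elements, so a zmm register holds 16 lanes, ymm 8 and xmm 4; the helper name `split_iterations` is made up for illustration and is not part of the patch.

```python
def split_iterations(n, body_vf, epilogue_vf):
    """Split n scalar iterations into vector-body iterations, vectorized
    epilogue iterations (no masking) and leftover scalar iterations."""
    body_iters, remainder = divmod(n, body_vf)
    epilogue_iters, scalar_iters = divmod(remainder, epilogue_vf)
    return body_iters, epilogue_iters, scalar_iters


# Lane counts for 32-bit elements (assumption for this example).
ZMM, YMM, XMM = 16, 8, 4

n = 2 * ZMM - 1  # 31 iterations: one full zmm iteration plus 15 left over
with_ymm = split_iterations(n, ZMM, YMM)  # (1, 1, 7)
with_xmm = split_iterations(n, ZMM, XMM)  # (1, 3, 3)
```

For 31 iterations a ymm epilogue leaves 7 scalar iterations after one vector iteration, whereas an xmm epilogue runs three vector iterations and leaves only 3 scalar ones. Whether that is actually faster is exactly what has to be measured.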

ICC shows better results in these modes, which makes me believe we can tune them
as well.  Overall, nomask mode shows worse results compared to the options with
masking, which is quite expected for KNL.

Unfortunately some big gains demonstrated by ICC are not reproducible
using GCC because we can't vectorize the required hot loops in the first place.
E.g. on 200.sixtrack GCC has nothing and ICC has ~40% for all three modes.

I don't have complete statistics for Haswell, but synthetic tests show the
situation is really different from KNL.  Even for the 'perfect' iteration count
(VF * 2 - 1) the scalar version of the epilogue shows the same result as the
masked one.  This means the ratio of vector code performance to scalar code
performance is not as high as on KNL (KNL is more vector oriented and has
weaker scalar performance; the doubled vector size also matters here) and the
masking cost is higher on Haswell.  We still focus more on AVX-512 targets
because of their rich masking capabilities and wider vectors.
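
As an illustration of why VF * 2 - 1 is the 'perfect' count for a masked epilogue, the sketch below models a single masked vector iteration over the leftover elements. It is a plain-Python model with a made-up helper name (`masked_epilogue_add`), not how the vectorizer represents masks.

```python
def masked_epilogue_add(a, b, start, vf):
    """Model one masked vector iteration covering a[start:], b[start:].

    Returns the lane mask and the sums computed by the active lanes."""
    mask = [start + lane < len(a) for lane in range(vf)]
    sums = [a[start + lane] + b[start + lane]
            for lane, active in enumerate(mask) if active]
    return mask, sums


VF = 4
a = list(range(2 * VF - 1))   # 7 elements: the body handles 0..3
b = [10] * len(a)
mask, sums = masked_epilogue_add(a, b, start=VF, vf=VF)
```

With VF = 4 and 7 elements, the epilogue mask has VF - 1 = 3 active lanes out of 4, so a single masked iteration replaces three scalar ones. That is the best case for masking, and per the measurements above it still only breaks even on Haswell.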

Thanks,
Ilya

>
> jeff

Re: i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Jakub Jelinek

On Fri, Jun 17, 2016 at 12:06:48PM +0200, Bernd Schmidt wrote:
> This is another step to flesh out -mmitigate-rop for i386 a little more. The
> basic idea was (I think) Richard Henderson's: if we could arrange to have
> every return preceded by a leave instruction, it would make it harder to
> construct an attack since it takes away a certain amount of control over the
> stack pointer. I extended this to move the leave/ret pair to libgcc,
> preceded by a sequence of nops, so as to take away the possibility of
> jumping into the middle of an instruction preceding the leave/ret pair and
> thereby skipping the leave.

Do you really need to require frame pointer for this?
I mean, couldn't you instead use what you do if a function needs frame
pointer and otherwise just replace the original ret with
pushq   %rbp
movq    %rsp, %rbp
jmp     __rop_ret
?  Or would that defeat the purpose of the mitigation?
Though, I think it is very common for many libraries to have functions that
just don't do anything, and so the pushq %rbp; movq %rsp, %rbp; jmp __rop_ret
sequence would still very likely appear somewhere.

As for __rop_ret, if you use a non-PLT jmp to it, I bet it must be in the same
executable or shared library as the code branching to it, so should be
.hidden.  Is libgcc.a really the best place for it though?  I mean, in
various cases we don't even link libgcc.a (sometimes we only link
libgcc_s.so.1).  Wouldn't it be better to emit the __rop_ret stuff into
every CU into a comdat section, like we do e.g. for the i686 PIC pads?
Is this stuff meant only for -m64, or also for 32-bit code?  If the latter,
then e.g. the i686 PIC pads are something where you also have ret without
leave before it.

Looking at nop; nop; 1: jmp 1b; leave; ret
if you branch into the middle of the jmp insn (0x3 below), there is:
   0:   90      nop
   1:   90      nop
   2:   eb fe   jmp    0x2
   4:   c9      leaveq
   5:   c3      retq
and thus:
   3:   fe c9   dec    %cl
   5:   c3      retq
and thus if you don't mind decreasing %cl, you still have retq without leave
before it.  But I very likely just don't understand the ROP threat stuff
enough.
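
The overlap can be reproduced with a toy decoder. The sketch below knows only the handful of opcodes that occur in this byte sequence (it is not a real x86 disassembler, and the function name `decode` is made up) and decodes linearly from a chosen offset without following jumps.

```python
# Bytes of: nop; nop; 1: jmp 1b; leave; ret
CODE = bytes.fromhex("9090ebfec9c3")


def decode(code, offset):
    """Linearly decode from offset until a ret or an unknown byte."""
    out = []
    pc = offset
    while pc < len(code):
        op = code[pc]
        if op == 0x90:
            out.append((pc, "nop"))
            pc += 1
        elif op == 0xEB and pc + 1 < len(code):
            # Short jump: signed 8-bit displacement relative to the next insn.
            rel = code[pc + 1] - 256 if code[pc + 1] >= 128 else code[pc + 1]
            out.append((pc, f"jmp {pc + 2 + rel:#x}"))
            pc += 2
        elif op == 0xFE and pc + 1 < len(code) and code[pc + 1] == 0xC9:
            # FE /1 with ModRM 0xC9 is dec %cl.
            out.append((pc, "dec %cl"))
            pc += 2
        elif op == 0xC9:
            out.append((pc, "leave"))
            pc += 1
        elif op == 0xC3:
            out.append((pc, "ret"))
            break
        else:
            out.append((pc, "(unknown)"))
            break
    return out
```

Decoding from offset 0 gives nop, nop, jmp 0x2, leave, ret, while decoding from offset 3 gives dec %cl followed directly by ret: the displacement byte of the jmp and the leave opcode are consumed together as one instruction.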

Jakub

i386/prologues: ROP mitigation for normal function epilogues

2016-06-17 Thread Bernd Schmidt

This is another step to flesh out -mmitigate-rop for i386 a little more. 
The basic idea was (I think) Richard Henderson's: if we could arrange to 
have every return preceded by a leave instruction, it would make it 
harder to construct an attack since it takes away a certain amount of 
control over the stack pointer. I extended this to move the leave/ret 
pair to libgcc, preceded by a sequence of nops, so as to take away the 
possibility of jumping into the middle of an instruction preceding the 
leave/ret pair and thereby skipping the leave.
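
A toy model of the two instructions may help to see what "taking away control over the stack pointer" means. The register file, memory layout and addresses below are invented for illustration; only the semantics of leave (rsp = rbp, then pop rbp) and ret (pop the return address) are taken from the architecture.

```python
def ret(regs, mem):
    """ret: pop the return address from the current stack pointer."""
    regs["rip"] = mem[regs["rsp"]]
    regs["rsp"] += 8


def leave(regs, mem):
    """leave: reset rsp to rbp, then pop the saved frame pointer."""
    regs["rsp"] = regs["rbp"]
    regs["rbp"] = mem[regs["rsp"]]
    regs["rsp"] += 8


# Hypothetical memory contents.
MEM = {
    0x1000: 0xAAAA,  # next gadget address, placed where rsp currently points
    0x2000: 0x3000,  # saved frame pointer of the legitimate frame
    0x2008: 0xBBBB,  # return address of the legitimate frame
}


def run(sequence):
    """Execute a list of instruction functions on a fresh register file."""
    regs = {"rsp": 0x1000, "rbp": 0x2000, "rip": 0}
    for insn in sequence:
        insn(regs, MEM)
    return regs


plain = run([ret])            # the target comes from wherever rsp points
guarded = run([leave, ret])   # the target comes from the frame rbp points to
```

In this model a bare ret continues at 0xAAAA, the value sitting at the current stack pointer, while leave; ret continues at 0xBBBB regardless of where rsp pointed beforehand. An attacker who also controls rbp can still redirect execution, so this only illustrates the reduced control, not a complete defence.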


Outside of the i386 changes, this adds a new optional prologue component 
that is always placed at function entry. There's already a use for this 
in the static chain on stack functionality.


This has survived a bootstrap and test both normally and with 
flag_mitigate_rop enabled by force in ix86_option_override. The former 
is completely clean. In the latter case, there are all sorts of 
scan-assembler testcases that fail, but that is to be expected. There's 
also some effect on guality, but other than that everything seems to be 
working.
These tests were with a very slightly earlier version that was missing 
the have_entry_prologue test in function.c; will retest with this one as 
well.


This has a performance impact when -mmitigate-rop is enabled; I made 
some measurements a while ago and it looks like it's about twice the 
impact of -fno-omit-frame-pointer.



Bernd
	* config/i386/i386-protos.h (ix86_expand_entry_prologue): Declare.
	* config/i386/i386.c (ix86_frame_pointer_required): True if
	flag_mitigate_rop.
	(ix86_compute_frame_layout): Determine whether to use ROP returns,
	and adjust save_regs_using_mov for it.
	(ix86_expand_entry_prologue): New function.
	(ix86_expand_prologue): Move parts from here into it.  Deal with
	the rop return variant.
	(ix86_expand_epilogue): Deal with the rop return variant.
	(ix86_expand_call): For sibcalls with flag_mitigate_rop, show a
	clobber and use of the hard frame pointer.
	(ix86_output_call_insn): For sibling calls, if using rop returns,
	emit a leave.
	(ix86_pad_returns): Skip if using rop returns.
	* config/i386/i386.h (struct machine_function): New field
	use_rop_ret.
	* config/i386/i386.md (sibcall peepholes): Disallow loads from
	memory locations involving the hard frame pointer.
	(return): Explicitly call gen_simple_return.
	(simple_return): Generate simple_return_leave_internal if
	necessary.
	(simple_return_internal): Assert we're not using rop returns.
	(simple_return_leave_internal): New pattern.
	(entry_prologue): New pattern.
	* function.c (make_entry_prologue_seq): New static function.
	(thread_prologue_and_epilogue_insns): Call it and emit the
	sequence.
	* target-insns.def (entry_prologue): Add.

libgcc/
	* config/i386/t-linux (LIB2ADD_ST): New, to add ropret.S.
	* config/i386/ropret.S: New file.

Index: gcc/config/i386/i386-protos.h
===
--- gcc/config/i386/i386-protos.h	(revision 237310)
+++ gcc/config/i386/i386-protos.h	(working copy)
@@ -33,6 +33,7 @@ extern void ix86_expand_prologue (void);
 extern void ix86_maybe_emit_epilogue_vzeroupper (void);
 extern void ix86_expand_epilogue (int);
 extern void ix86_expand_split_stack_prologue (void);
+extern void ix86_expand_entry_prologue (void);
 
 extern void ix86_output_addr_vec_elt (FILE *, int);
 extern void ix86_output_addr_diff_elt (FILE *, int, int);
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 237310)
+++ gcc/config/i386/i386.c	(working copy)
@@ -11529,6 +11530,9 @@ ix86_can_use_return_insn_p (void)
 static bool
 ix86_frame_pointer_required (void)
 {
+  if (flag_mitigate_rop)
+return true;
+
   /* If we accessed previous frames, then the generated code expects
  to be able to access the saved ebp value in our frame.  */
   if (cfun->machine->accesses_prev_frame)
@@ -12102,11 +12106,21 @@ ix86_compute_frame_layout (struct ix86_f
 	   = !expensive_function_p (count);
 }
 
-  frame->save_regs_using_mov
-= (TARGET_PROLOGUE_USING_MOVE && cfun->machine->use_fast_prologue_epilogue
-   /* If static stack checking is enabled and done with probes,
-	  the registers need to be saved before allocating the frame.  */
-   && flag_stack_check != STATIC_BUILTIN_STACK_CHECK);
+  cfun->machine->use_rop_ret = (flag_mitigate_rop
+&& !TARGET_SEH
+&& !stack_realign_drap
+&& crtl->args.pops_args == 0
+&& !crtl->calls_eh_return
+&& !ix86_static_chain_on_stack);
+
+  if (cfun->machine->use_rop_ret)
+frame->save_regs_using_mov = true;
+  else
+frame->save_regs_using_mov
+  = (TARGET_PROLOGUE_USING_MOVE && cfun->machine->use_fast_prologue_epilogue
+	 /* If static stack checking is enabled and done with probes,
+	the registers need to be saved before allocating the frame.  */
+	 && flag_stack_check != STATIC_BUILTIN_STACK_CHECK);
 
   /* Skip return

[Bug libstdc++/71562] Changing the hard coded size of _S_local_capacity in sso_string_base.h

2016-06-17 Thread developm...@faf-ltd.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71562

--- Comment #2 from Peter VARGA  ---
I disagree 100% with your comment!

I am definitely NOT a genius and I needed 5 minutes to find out where the hard
coded size is set.
Look at GNU glibc - you can crash an existing system by running
./configure with the wrong paths/settings. There is a warning in the
documentation.

You can NEVER be responsible if a programmer does NOT know what he is doing.
You can only warn them and then let it go.

I do NOT need this define - I have already set it to my own value, but because
vstring.h has this special status I thought it might also be useful for other
programmers.

Why 15 and not 30? Do you understand what I mean?
