[PATCH] [AArch64] Enable AES and cmp_branch fusion for Thunderx2t99
Hi, Please find attached the patch that adds AES and CMP_BRANCH fusion for Thunderx2t99. Bootstrapped and Regression tested on aarch64-thunderx2t99. Please review the patch and let us know if its okay? 2017-1-25 Naveen H.Sgcc * config/aarch64/aarch64.c (thunderx2t99_tunings): Improve vector initialization code gen. Thanks, Naveendiff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index f343d92..acaa975 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -780,7 +780,7 @@ static const struct tune_params thunderx2t99_tunings = _approx_modes, 4, /* memmov_cost. */ 4, /* issue_rate. */ - AARCH64_FUSE_NOTHING, /* fuseable_ops. */ + (AARCH64_FUSE_CMP_BRANCH | AARCH64_FUSE_AES_AESMC), /* fusible_ops */ 16, /* function_align. */ 8, /* jump_align. */ 16, /* loop_align. */
Go patch committed: improvements for type alias handling
This patch to the gofrontend improves the recently added type alias handling, and lets the compiler pass the tests on the dev.typealias branch of the gc compiler. We now give an error for an attempt to define a method on an imported type. We now give an error for each attempt to define a method on a builtin type. I adjusted the error messages to be closer to gc error messages. This changes the errors printed for test/fixedbugs/issue5089.go, but the change is an improvement. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 244835) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -fb609ff6d940768cf4db4ab7deb93b2ab686e45d +5c6c93f58e2aaae186bac5dcde9df1679d4896b1 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: gcc/go/gofrontend/gogo.cc === --- gcc/go/gofrontend/gogo.cc (revision 244460) +++ gcc/go/gofrontend/gogo.cc (working copy) @@ -1779,18 +1779,27 @@ Gogo::start_function(const std::string& while (rtype->named_type() != NULL && rtype->named_type()->is_alias()) - rtype = rtype->named_type()->real_type(); + rtype = rtype->named_type()->real_type()->forwarded(); if (rtype->is_error_type()) ret = Named_object::make_function(name, NULL, function); else if (rtype->named_type() != NULL) { - ret = rtype->named_type()->add_method(name, function); - if (!ret->is_function()) + if (rtype->named_type()->named_object()->package() != NULL) { - // Redefinition error. + go_error_at(type->receiver()->location(), + "may not define methods on non-local type"); ret = Named_object::make_function(name, NULL, function); } + else + { + ret = rtype->named_type()->add_method(name, function); + if (!ret->is_function()) + { + // Redefinition error. + ret = Named_object::make_function(name, NULL, function); + } + } } else if (rtype->forward_declaration_type() != NULL) { @@ -2247,8 +2256,14 @@ Gogo::define_global_names() if (global_no->is_type()) { if (no->type_declaration_value()->has_methods()) - go_error_at(no->location(), - "may not define methods for global type"); + { + for (std::vector::const_iterator p = +no->type_declaration_value()->methods()->begin(); + p != no->type_declaration_value()->methods()->end(); + p++) + go_error_at((*p)->location(), + "may not define methods on non-local type"); + } no->set_type_value(global_no->type_value()); } else Index: gcc/testsuite/go.test/test/fixedbugs/issue5089.go === --- gcc/testsuite/go.test/test/fixedbugs/issue5089.go (revision 244166) +++ gcc/testsuite/go.test/test/fixedbugs/issue5089.go (working copy) @@ -1,6 +1,6 @@ // errorcheck -// Copyright 2013 The Go Authors. All rights reserved. +// Copyright 2013 The Go Authors. All rights reserved. // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. @@ -8,7 +8,7 @@ package p -import "bufio" // GCCGO_ERROR "previous" +import "bufio" func (b *bufio.Reader) Buffered() int { // ERROR "non-local|redefinition" return -1
[PATCH] add missing overflow check to stpcpy (PR 79222)
In implementing the -Wstringop-overflow warning I missed stpcpy. The attached patch adds the required checking. Given how simple it is, does it qualify for GCC 7 despite stage 4? Martin PR middle-end/79222 - missing -Wstringop-overflow= on a stpcpy overflow gcc/ChangeLog: PR middle-end/79222 * builtins.c (expand_builtin_stpcpy): Check for buffer overflow. gcc/testsuite/ChangeLog: PR middle-end/79222 * gcc.dg/builtin-stringop-chk-4.c: Add test cases. * gcc.dg/pr79222.c: New test. Index: gcc/builtins.c === --- gcc/builtins.c (revision 244844) +++ gcc/builtins.c (working copy) @@ -3612,6 +3615,13 @@ expand_builtin_stpcpy (tree exp, rtx target, machi dst = CALL_EXPR_ARG (exp, 0); src = CALL_EXPR_ARG (exp, 1); + if (warn_stringop_overflow) +{ + tree destsize = compute_dest_size (dst, warn_stringop_overflow - 1); + check_sizes (OPT_Wstringop_overflow_, + exp, /*size=*/NULL_TREE, /*maxlen=*/NULL_TREE, src, destsize); +} + /* If return value is ignored, transform stpcpy into strcpy. */ if (target == const0_rtx && builtin_decl_implicit (BUILT_IN_STRCPY)) { Index: gcc/testsuite/gcc.dg/builtin-stringop-chk-4.c === --- gcc/testsuite/gcc.dg/builtin-stringop-chk-4.c (revision 244844) +++ gcc/testsuite/gcc.dg/builtin-stringop-chk-4.c (working copy) @@ -41,6 +41,9 @@ extern char* (strcat)(char*, const char*); #define strncat(d, s, n) (strncat ((d), (s), (n)), sink ((d))) extern char* (strncat)(char*, const char*, size_t); +#define stpcpy(d, s) (stpcpy ((d), (s)), sink ((d))) +extern char* (stpcpy)(char*, const char*); + #define strcpy(d, s) (strcpy ((d), (s)), sink ((d))) extern char* (strcpy)(char*, const char*); @@ -349,6 +352,49 @@ void test_strcpy_range (void) strcpy (buf + 17, S (0)); /* { dg-warning "writing 1 byte into a region of size 0 " } */ } +/* Same test_strcpy but for stpcpy. Verify that stpcpy with an unknown + source string doesn't cause warnings unless the destination has zero + size. */ + +void test_stpcpy (const char *src) +{ + struct A { char a[2]; char b[3]; } a; + + stpcpy (a.a, src); + stpcpy (a.a + 1, src); + + stpcpy (a.a + 2, src);/* { dg-warning "writing at least 1 byte into a region of size 0 " "stpcpy into empty substring" { xfail *-*-* } } */ + + /* This does work. */ + stpcpy (a.a + 5, src);/* { dg-warning "writing at least 1 byte into a region of size 0 " } */ + + /* As does this. */ + stpcpy (a.a + 17, src);/* { dg-warning "writing at least 1 byte into a region of size 0 " } */ +} + +/* Same test_strcpy but for stpcpy. Test stpcpy with a non-constant source + string of length in a known range. */ + +void test_stpcpy_range (void) +{ + char buf[5]; + + stpcpy (buf, S (0)); + stpcpy (buf, S (1)); + stpcpy (buf, S (2)); + stpcpy (buf, S (4)); + stpcpy (buf, S (5)); /* { dg-warning "writing 6 bytes into a region of size 5 " } */ + stpcpy (buf, S (6)); /* { dg-warning "writing 7 bytes into a region of size 5 " } */ + stpcpy (buf, S (7)); /* { dg-warning "writing 8 bytes into a region of size 5 " } */ + stpcpy (buf, S (8)); /* { dg-warning "writing 9 bytes into a region of size 5 " } */ + stpcpy (buf, S (9)); /* { dg-warning "writing 10 bytes into a region of size 5 " } */ + stpcpy (buf, S (10)); /* { dg-warning "writing 11 bytes into a region of size 5 " } */ + + stpcpy (buf + 5, S (0)); /* { dg-warning "writing 1 byte into a region of size 0 " } */ + + stpcpy (buf + 17, S (0)); /* { dg-warning "writing 1 byte into a region of size 0 " } */ +} + /* Test strncat with an argument referencing a non-constant string of lengths in a known range. */ Index: gcc/testsuite/gcc.dg/pr79222.c === --- gcc/testsuite/gcc.dg/pr79222.c (revision 0) +++ gcc/testsuite/gcc.dg/pr79222.c (working copy) @@ -0,0 +1,11 @@ +/* PR middle-end/79222 - missing -Wstringop-overflow= on a stpcpy overflow + { dg-do compile } + { dg-options "-O2" } */ + +char d[3]; + +char* f (int i) +{ + const char *s = i < 0 ? "01234567" : "9876543210"; + return __builtin_stpcpy (d, s); /* { dg-warning ".__builtin_stpcpy. writing 9 bytes into a region of size 3 overflows the destination" } */ +}
[PATCH 2/2] ARC: Better creation of __NPS400__ define
The __NPS400__ define is currently created in CPP_SPEC unlike the other target defines, which are created in arc-c.def. Further, the current __NPS400__ define is (currently) only created when -mcpu=nps400 is passed, which is fine, except that if GCC is configured using --with-cpu=nps400 then the -mcpu option is not required and the __NPS400__ define will not be created. This commit moves the __NPS400__ define into arc-c.def inline with all of the other target defines, and removes the code in CPP_SPEC that used to create the define. In order to support the creation of the define in arc-c.def, a new TARGET_NPS400 macro is created in arc.h. gcc/ChangeLog: * config/arc/arc-c.def: Add __NPS400__ definition. * config/arc/arc.h (CPP_SPEC): Don't define __NPS400__ here. (TARGET_NPS400): Define. --- gcc/ChangeLog| 6 ++ gcc/config/arc/arc-c.def | 1 + gcc/config/arc/arc.h | 9 +++-- 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/gcc/config/arc/arc-c.def b/gcc/config/arc/arc-c.def index 4903d84..8c5097e 100644 --- a/gcc/config/arc/arc-c.def +++ b/gcc/config/arc/arc-c.def @@ -20,6 +20,7 @@ ARC_C_DEF ("__ARC600__", TARGET_ARC600) ARC_C_DEF ("__ARC601__", TARGET_ARC601) ARC_C_DEF ("__ARC700__", TARGET_ARC700) +ARC_C_DEF ("__NPS400__", TARGET_NPS400) ARC_C_DEF ("__ARCEM__",TARGET_EM) ARC_C_DEF ("__ARCHS__",TARGET_HS) ARC_C_DEF ("__ARC_ATOMIC__", TARGET_ATOMIC) diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index d6d9012..e269b4d 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -92,8 +92,7 @@ along with GCC; see the file COPYING3. If not see %{mmac-d16:-D__Xxmac_d16} %{mmac-24:-D__Xxmac_24} \ %{mdsp-packa:-D__Xdsp_packa} %{mcrc:-D__Xcrc} %{mdvbf:-D__Xdvbf} \ %{mtelephony:-D__Xtelephony} %{mxy:-D__Xxy} %{mmul64: -D__Xmult32} \ -%{mlock:-D__Xlock} %{mswape:-D__Xswape} %{mrtsc:-D__Xrtsc} \ -%{mcpu=nps400:-D__NPS400__}" +%{mlock:-D__Xlock} %{mswape:-D__Xswape} %{mrtsc:-D__Xrtsc}" #define CC1_SPEC "\ %{EB:%{EL:%emay not use both -EB and -EL}} \ @@ -223,6 +222,12 @@ extern const char *arc_cpu_to_as (int argc, const char **argv); && (!TARGET_BARREL_SHIFTER)) #define TARGET_ARC700 (arc_selected_cpu->arch_info->arch_id\ == BASE_ARCH_700) +/* An NPS400 is a specialisation of ARC700, so it is correct for NPS400 + TARGET_ARC700 is true, and TARGET_NPS400 is true. */ +#define TARGET_NPS400 ((arc_selected_cpu->arch_info->arch_id \ + == BASE_ARCH_700) \ + && (arc_selected_cpu->processor \ + == PROCESSOR_nps400)) #define TARGET_EM (arc_selected_cpu->arch_info->arch_id == BASE_ARCH_em) #define TARGET_HS (arc_selected_cpu->arch_info->arch_id == BASE_ARCH_hs) #define TARGET_V2 (TARGET_EM || TARGET_HS) -- 2.4.11
[PATCH 0/2] [ARC] Define __NPS400__ properly
I noticed that the __NPS400__ define was not being created properly, when the toolchain is configured with --with-cpu=nps400 then the define will not be created (unless --mcpu=nps400 is also used). I thought the fix would be easy, __NPS400__ was being created in the CPP_SPEC, in contrast to similar defines like __ARC700__, which are created in arc-c.def, so all I wanted to do was to move the definition into arc-c.def. Turned out this was slihtly more involved than I thought, which is why this ended up as two patches. The first patch restrucures how we track the currently selected cpu, but should not make any user visible changes. The second patch uses the changes in the first patch to create the __NPS400__ properly. I've run the GCC with no regressions, and this should have no impact for any target other than arc. I don't know if you think this will be too much to slip in at this stage in the release, but it would be great if I could.. Thanks, Andrew --- Andrew Burgess (2): ARC: Make arc_selected_cpu global ARC: Better creation of __NPS400__ define gcc/ChangeLog | 31 gcc/config/arc/arc-arch.h | 50 ++--- gcc/config/arc/arc-c.def| 1 + gcc/config/arc/arc.c| 35 ++- gcc/config/arc/arc.h| 24 ++ gcc/config/arc/driver-arc.c | 2 +- 6 files changed, 88 insertions(+), 55 deletions(-) -- 2.4.11
[PATCH 1/2] ARC: Make arc_selected_cpu global
Currently we only make the base_architecture globally available, this means we can tell if we have selected arc700/archs/etc but it's not possible to tell if the user has selected a specific cpu variant, for example nps400. One problem this causes is, for example, in arc-c.def, if we want to add an __NPS400__ define then we need a flag we can check to determine if this is the right thing to do. In this commit the arc_selected_cpu variable (previously local within arc.c) has been made global. Two other variables arc_base_cpu and arc_selected_arch have been deleted, all of this information can be found within (or through) arc_selected_cpu. All uses of arc_base_cpu and arc_selected_arch have been updated. This commit does not introduce any new defines (like __NPS400__), this is just a restructuring commit. The declaration of arc_selected_cpu has moved into arc-arch.h, in contrast to the declaration of arc_base_cpu which was previously in arc.h. This avoids a compilation issue when building libgcc, as the structure and enums declared in arc-arch.h are not included for libgcc then declaring an arc_selected_cpu (a struct type) in arc.h would result in an unknown struct error. We got away with this for arc_base_cpu as that was an enum type. The declaration of arc_selected_cpu in arc.h could have been wrapped in a '#ifndef IN_LIBGCC2 ... #endif', but it felt neater to simply move the declaration into arc-arch.h. gcc/ChangeLog: * config/arc/arc-arch.h (arc_arch_t): Move unchanged to earlier in file. (arc_cpu_t): Change base_architecture field, arch, to a arc_arc_t pointer, arch_info. (arc_cpu_types): Fill the arch_info field with a pointer into the arc_arch_types table. (arc_selected_cpu): Declare. * config/arc/arc.c (arc_selected_cpu): Make global. (arc_selected_arch): Delete. (arc_base_cpu): Delete. (arc_override_options): Remove references to deleted variables, update access to arch information. (ARC_OPT): Update access to arch information. (ARC_OPTX): Likewise. * config/arc/arc.h (arc_base_cpu): Remove declaration. (TARGET_ARC600): Update access to arch information. (TARGET_ARC601): Likewise. (TARGET_ARC700): Likewise. (TARGET_EM): Likewise. (TARGET_HS): Likewise. * config/arc/driver-arc.c (arc_cpu_to_as): Update access to arch information. --- gcc/ChangeLog | 25 +++ gcc/config/arc/arc-arch.h | 50 ++--- gcc/config/arc/arc.c| 35 ++- gcc/config/arc/arc.h| 15 +++--- gcc/config/arc/driver-arc.c | 2 +- 5 files changed, 74 insertions(+), 53 deletions(-) diff --git a/gcc/config/arc/arc-arch.h b/gcc/config/arc/arc-arch.h index 832338b..2344a4b 100644 --- a/gcc/config/arc/arc-arch.h +++ b/gcc/config/arc/arc-arch.h @@ -47,6 +47,23 @@ enum base_architecture BASE_ARCH_END }; +/* Architecture specific propoerties. */ + +typedef struct +{ + /* Architecture name. */ + const char *const name; + + /* Architecture class. */ + enum base_architecture arch_id; + + /* All allowed flags for this architecture. */ + const unsigned long long flags; + + /* Default flags for this architecture. It is a subset of + FLAGS. */ + const unsigned long long dflags; +} arc_arch_t; /* Tune variants. Needs to match the attr_tune enum. */ @@ -66,7 +83,7 @@ typedef struct const char *const name; /* Architecture class. */ - enum base_architecture arch; + const arc_arch_t *arch_info; /* Specific processor type. */ enum processor_type processor; @@ -76,28 +93,8 @@ typedef struct /* Tune value. */ enum arc_tune_attr tune; -} arc_cpu_t; - - -/* Architecture specific propoerties. */ - -typedef struct -{ - /* Architecture name. */ - const char *const name; - - /* Architecture class. */ - enum base_architecture arch; - - /* All allowed flags for this architecture. */ - const unsigned long long flags; - - /* Default flags for this architecture. It is a subset of - FLAGS. */ - const unsigned long long dflags; -} arc_arch_t; - +} arc_cpu_t; const arc_arch_t arc_arch_types[] = { @@ -111,13 +108,16 @@ const arc_arch_t arc_arch_types[] = const arc_cpu_t arc_cpu_types[] = { -{"none", BASE_ARCH_NONE, PROCESSOR_NONE, 0, ARC_TUNE_NONE}, +{"none", NULL, PROCESSOR_NONE, 0, ARC_TUNE_NONE}, #define ARC_CPU(NAME, ARCH, FLAGS, TUNE) \ -{#NAME, BASE_ARCH_##ARCH, PROCESSOR_##NAME, FLAGS, ARC_TUNE_##TUNE}, +{#NAME, _arch_types [BASE_ARCH_##ARCH], PROCESSOR_##NAME, FLAGS, ARC_TUNE_##TUNE }, #include "arc-cpus.def" #undef ARC_CPU -{NULL, BASE_ARCH_END, PROCESSOR_NONE, 0, ARC_TUNE_NONE} +{NULL, NULL, PROCESSOR_NONE, 0, ARC_TUNE_NONE} }; +/* Currently selected cpu type. */ +extern const arc_cpu_t *arc_selected_cpu; + #endif
[PATCH] Add --with-gcc-major-version-only support to libhsail-rt
Hi! Apparently the configury of this library has been copied over before the PR79046 changes were done, the following patch updates it. Ok for trunk? Though, I wonder why configure.ac/Makefile.am have been based on one of the only 2 that aren't GPL licensed, there are over dozen other libraries that have very simple GPL configure.ac and Makefile.am, can't we just rewrite those based on those other files? 2017-01-24 Jakub JelinekPR other/79046 * configure.ac: Add GCC_BASE_VER. * Makefile.am (gcc_version): Use @get_gcc_base_ver@ instead of cat to get version from BASE-VER file. (ACLOCAL_AMFLAGS): Set to -I .. -I ../config . * aclocal.m4: Regenerated. * configure: Regenerated. * Makefile.in: Regenerated. --- libhsail-rt/configure.ac.jj 2017-01-24 23:29:11.0 +0100 +++ libhsail-rt/configure.ac2017-01-24 23:48:13.743605310 +0100 @@ -147,5 +147,8 @@ AC_CONFIG_HEADER(target-config.h) AC_CHECK_SIZEOF([int]) AC_CHECK_SIZEOF([void*]) +# Determine what GCC version number to use in filesystem paths. +GCC_BASE_VER + # Must be last AC_OUTPUT --- libhsail-rt/Makefile.am.jj 2017-01-24 23:29:12.0 +0100 +++ libhsail-rt/Makefile.am 2017-01-24 23:49:42.93518 +0100 @@ -44,13 +44,13 @@ AUTOMAKE_OPTIONS = foreign subdir-objects -gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER) +gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER) MAINT_CHARSET = latin1 mkinstalldirs = $(SHELL) $(toplevel_srcdir)/mkinstalldirs -ACLOCAL_AMFLAGS = -I m4 +ACLOCAL_AMFLAGS = -I .. -I ../config WARN_CFLAGS = $(WARN_FLAGS) $(WERROR) @@ -120,5 +120,3 @@ AM_MAKEFLAGS = \ "DESTDIR=$(DESTDIR)" MAKEOVERRIDES= - - --- libhsail-rt/aclocal.m4.jj 2017-01-24 23:29:12.0 +0100 +++ libhsail-rt/aclocal.m4 2017-01-24 23:49:56.352268889 +0100 @@ -1,7 +1,8 @@ -# generated automatically by aclocal 1.11.1 -*- Autoconf -*- +# generated automatically by aclocal 1.11.6 -*- Autoconf -*- # Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, -# 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. +# 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, +# Inc. # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, # with or without modifications, as long as this notice is preserved. @@ -19,12 +20,15 @@ You have another version of autoconf. I If you have problems, you may need to regenerate the build system entirely. To do so, use the procedure documented by the package, typically `autoreconf'.])]) -# Copyright (C) 2002, 2003, 2005, 2006, 2007, 2008 Free Software Foundation, Inc. +# Copyright (C) 2002, 2003, 2005, 2006, 2007, 2008, 2011 Free Software +# Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, # with or without modifications, as long as this notice is preserved. +# serial 1 + # AM_AUTOMAKE_VERSION(VERSION) # # Automake X.Y traces this macro to ensure aclocal.m4 has been @@ -57,12 +61,14 @@ _AM_AUTOCONF_VERSION(m4_defn([AC_AUTOCON # AM_AUX_DIR_EXPAND -*- Autoconf -*- -# Copyright (C) 2001, 2003, 2005 Free Software Foundation, Inc. +# Copyright (C) 2001, 2003, 2005, 2011 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, # with or without modifications, as long as this notice is preserved. +# serial 1 + # For projects using AC_CONFIG_AUX_DIR([foo]), Autoconf sets # $ac_aux_dir to `$srcdir/foo'. In other projects, it is set to # `$srcdir', `$srcdir/..', or `$srcdir/../..'. @@ -144,14 +150,14 @@ AC_CONFIG_COMMANDS_PRE( Usually this means the macro was only invoked conditionally.]]) fi])]) -# Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2009 -# Free Software Foundation, Inc. +# Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2009, +# 2010, 2011 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, # with or without modifications, as long as this notice is preserved. -# serial 10 +# serial 12 # There are a few dirty hacks below to avoid letting `AC_PROG_CC' be # written in clear, in which case automake, when reading aclocal.m4, @@ -191,6 +197,7 @@ AC_CACHE_CHECK([dependency style of $dep # instance it was reported that on HP-UX the gcc test will end up # making a dummy file named `D' -- because `-MD' means `put the output # in D'. + rm -rf conftest.dir mkdir conftest.dir # Copy depcomp to subdir because otherwise we won't find it if we're # using a relative directory. @@ -255,7 +262,7 @@ AC_CACHE_CHECK([dependency style of $dep
[C++ PATCH] Fix -Winvalid-offsetof warning (PR c++/68727)
Hi! r215070 moved -Winvalid-offsetof pedwarn from build_class_member_access_expr (where it warned even about offsetof-like hand-written constructs) to finish_offsetof. That is fine, the problem is that while build_class_member_access_expr saw the original first argument to __builtin_offsetof (the object type), finish_offsetof can see a different type and the first argument of COMPONENT_REF is a result of various folding (e.g. build_base_path (prematurely?) folds trees etc.), so it would be too hard to reliably discover the original type. Now, if the type finish_offsetof sees is e.g. a std layout type, while the original type is not, we don't emit a warning that we should. Instead of trying to discover the original type from the folded expression, the following patch passes the type (well, a cast of null pointer to pointer to that type, as that is something we have handy) to finish_offsetof and for template instantiation saves it in a new OFFSETOF_EXPR argument. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2017-01-24 Jakub JelinekPR c++/68727 * cp-tree.def (OFFSETOF_EXPR): Bump number of operands to 2. * cp-tree.h (finish_offsetof): Add OBJECT_PTR argument. * parser.c (cp_parser_builtin_offsetof): Pass result of build_static_cast of null_pointer_node to finish_offsetof. * semantics.c (finish_offsetof): Add OBJECT_PTR argument, use it for -Winvalid-offsetof pedwarn instead of trying to guess original offsetof type from EXPR. Save OBJECT_PTR as a new second operand to OFFSETOF_EXPR. * pt.c (tsubst_copy_and_build) : Adjust finish_offsetof caller, pass the second operand of OFFSETOF_EXPR as OBJECT_PTR. * g++.dg/other/offsetof8.C: Add expected error. * g++.dg/other/offsetof9.C: New test. --- gcc/cp/cp-tree.def.jj 2017-01-10 08:12:46.0 +0100 +++ gcc/cp/cp-tree.def 2017-01-24 14:27:00.907820968 +0100 @@ -333,7 +333,7 @@ DEFTREECODE (EXPR_STMT, "expr_stmt", tcc DEFTREECODE (TAG_DEFN, "tag_defn", tcc_expression, 0) /* Represents an 'offsetof' expression during template expansion. */ -DEFTREECODE (OFFSETOF_EXPR, "offsetof_expr", tcc_expression, 1) +DEFTREECODE (OFFSETOF_EXPR, "offsetof_expr", tcc_expression, 2) /* Represents an '__builtin_addressof' expression during template expansion. This is similar to ADDR_EXPR, but it doesn't invoke --- gcc/cp/cp-tree.h.jj 2017-01-21 02:26:07.0 +0100 +++ gcc/cp/cp-tree.h2017-01-24 14:14:51.132548587 +0100 @@ -6485,7 +6485,7 @@ extern tree finish_underlying_type extern tree calculate_bases (tree); extern tree finish_bases(tree, bool); extern tree calculate_direct_bases (tree); -extern tree finish_offsetof(tree, location_t); +extern tree finish_offsetof(tree, tree, location_t); extern void finish_decl_cleanup(tree, tree); extern void finish_eh_cleanup (tree); extern void emit_associated_thunks (tree); --- gcc/cp/parser.c.jj 2017-01-21 02:26:06.0 +0100 +++ gcc/cp/parser.c 2017-01-24 14:18:22.482716966 +0100 @@ -9498,11 +9498,12 @@ cp_parser_builtin_offsetof (cp_parser *p token = cp_lexer_peek_token (parser->lexer); /* Build the (type *)null that begins the traditional offsetof macro. */ - expr = build_static_cast (build_pointer_type (type), null_pointer_node, -tf_warning_or_error); + tree object_ptr += build_static_cast (build_pointer_type (type), null_pointer_node, +tf_warning_or_error); /* Parse the offsetof-member-designator. We begin as if we saw "expr->". */ - expr = cp_parser_postfix_dot_deref_expression (parser, CPP_DEREF, expr, + expr = cp_parser_postfix_dot_deref_expression (parser, CPP_DEREF, object_ptr, true, , token->location); while (true) { @@ -9554,7 +9555,7 @@ cp_parser_builtin_offsetof (cp_parser *p loc = make_location (loc, start_loc, finish_loc); /* The result will be an INTEGER_CST, so we need to explicitly preserve the location. */ - expr = cp_expr (finish_offsetof (expr, loc), loc); + expr = cp_expr (finish_offsetof (object_ptr, expr, loc), loc); failure: parser->integral_constant_expression_p = save_ice_p; --- gcc/cp/semantics.c.jj 2017-01-21 02:26:06.0 +0100 +++ gcc/cp/semantics.c 2017-01-24 14:30:05.024371882 +0100 @@ -3995,13 +3995,13 @@ finish_bases (tree type, bool direct) fold_offsetof. */ tree -finish_offsetof (tree expr, location_t loc) +finish_offsetof (tree object_ptr, tree expr, location_t loc) { /* If we're processing a template, we can't finish the semantics yet. Otherwise we can fold the entire expression now. */ if (processing_template_decl) { - expr = build1 (OFFSETOF_EXPR,
Re: [PATCH 1/4] BRIG (HSAIL) frontend: configuration file changes and misc
On Tue, 24 Jan 2017, Pekka Jääskeläinen wrote: > Hi, > > On Tue, Jan 24, 2017 at 7:26 PM, Joseph Myerswrote: > > Since front ends can't be enabled or disabled on a per-multilib basis, if > > you want to support any x86/x86_64 GNU/Linux configuration you need to > > support all of them here (and possibly disable runtime libraries for > > particular multilibs). That is, this should be *more* general and also > > allow i[[3456789]]86-*-linux* because that can have a -m64 multilib. > > OK. Patch attached. Does it look right now? That seems correct on the general principle of treating a particular ABI configuration the same however it's achieved through a combination of target triplet, configure options and multilib options. -- Joseph S. Myers jos...@codesourcery.com
Re: [libcc1] add support for C++
On Tue, Jan 24, 2017 at 3:43 PM, Alexandre Olivawrote: > On Jan 24, 2017, Jason Merrill wrote: >> On Mon, Jan 23, 2017 at 7:21 PM, Alexandre Oliva wrote: >>> On Jan 23, 2017, Jason Merrill wrote: >>> >>> + If the newly-created namespace is to be an inline namespace, after >>> + push_namespace, get the nested namespace decl with >>> + get_current_binding_level, pop back to the enclosing namespace, >>> + call using_namespace with INLINE_P, and then push to the inline >>> + namespace again. */ > >> This seems like unnecessary contortions to expect of the caller >> (i.e. GDB). Let's add a new front end function to handle this. > > Heh, it's really nothing compared with some other contortions required > to navigate binding levels, e.g. to reenter a function scope. > > I didn't want to add redundant functionality to the libcc1 API, and we'd > need the ability to separate the introduction or reentry in a namespace, > from a using directive with attribute strong. And since inline > namespaces use the same implementation as that, I figured it made sense > to use the same interface for both. >>> > Besides, push_namespace is called a lot more often than using_namespace, > so it makes sense to keep it simpler, and leave the extra parameter for > the less common operation. > > Another argument to keep things as they are is that the inline-ness of a > namespace is not a property related with entering the namespace, but > rather with how names from it are brought into the other namespace. >>> Well, it's a property of the namespace. >>> >>> Even in the case of 'strong' using directives? I didn't think so. > >> 'strong' using directives should really go away, they were a brief >> stop on the way to inline namespaces. We really shouldn't introduce >> more functionality based on them. I'll add a deprecated warning >> now... > > Ok, but where does that leave us WRT libcc1? Should we make it > impossible for GDB to convey this construct to libcc1 if it finds it in > an existing program? (if it is represented in debug information at all) It isn't represented in the debug information. > If we make it so, then a new entry point that makes the current > namespace inline should fit the bill. Then it would always make it > inline to the immediately enclosing namespace. Sounds reasonable. >> For operator functions, set GCC_CP_FLAG_SPECIAL_FUNCTION in >>> + SYM_KIND, in addition to any other applicable flags, and pass as >>> + NAME a string starting with the two-character mangling for operator >>> + name > >> Perhaps the compiler side should handle this transparently? > > Sorry, I don't understand what you're suggesting. Can you please > elaborate? >>> I mean have the compiler recognize that the name is magic and infer that it's a special function. >>> >>> I'd rather not have accidental namespace collisions: there are several >>> layers with different conventions at play, so I decided to mark such >>> things explicitly so as to be able to catch misuses. Consider that we >>> have GCC's internal identifier representation, mangled identifier >>> representation, names exported in debug information, and GDB's internal >>> representation. When we get a special name, we want to be sure there >>> was not a failure to convert between one's internal representation and >>> libcc1's conventions. That's what motivated me to mark them explicitly. > >> It sounds like you're saying that you want the user to provide the >> same information two different ways in order to make sure they're >> getting it right. That seems kind of cumbersome. > > I can't see it that way. The information is not provided in two > different ways. Consider, for example, the string used to denote > constructors. It's perfectly legitimate to name a member function "C4", > but we also use this representation to name constructors, so we need at > least an additional bit to tell which space the name belongs to. We > could make the encoding part of the string ('_' prefixes, mangling, > whatever), but I thought it was better to leave the names unmangled, and > use a separate bit flag to "escape" the names into the special functions > space. > > OF course we could have gone with other conventions. One I considered > was looking for blanks in the names: they can't appear in regular names, > and we could mandate them to be present in special function names. In > the end, I went for the separate bit. > > >>> Adopting the name mangling conventions, rather than some enumeration or >>> so, is justified by the need for some arbitrary convention, and reusing >>> an existing one (that is NOT present in debug information names) > >> Isn't it present in DW_AT_linkage_name? > > It's part of the mangled name, yes, but you'd have to
[committed] Fix ICE with C++17 decomposition of invisiref parameter (PR c++/79205)
Hi! The following testcase ICEd because genericization handled an is_invisiref_parm PARM_DECL in DECL_INITIAL of the decomposition artificial variable twice. Fixed by making sure we don't walk the INDIRECT_REFs added by convert_from_reference for is_invisiref_parm PARM_DECLs. Bootstrapped/regtested on x86_64-linux and i686-linux, approved by Jason in the PR, committed to trunk. 2017-01-24 Jakub JelinekPR c++/79205 * cp-gimplify.c (cp_genericize_r): Add result of convert_from_reference on invisiref parm to p_set. * g++.dg/cpp1z/decomp22.C: New test. * g++.dg/cpp1z/decomp23.C: New test. --- gcc/cp/cp-gimplify.c.jj 2017-01-21 02:26:06.0 +0100 +++ gcc/cp/cp-gimplify.c2017-01-24 10:44:58.395850677 +0100 @@ -1107,6 +1107,7 @@ cp_genericize_r (tree *stmt_p, int *walk if (wtd->handle_invisiref_parm_p && is_invisiref_parm (stmt)) { *stmt_p = convert_from_reference (stmt); + p_set->add (*stmt_p); *walk_subtrees = 0; return NULL; } --- gcc/testsuite/g++.dg/cpp1z/decomp22.C.jj2017-01-24 10:45:51.311156064 +0100 +++ gcc/testsuite/g++.dg/cpp1z/decomp22.C 2017-01-24 10:58:16.241237946 +0100 @@ -0,0 +1,21 @@ +// PR c++/79205 +// { dg-do compile { target c++11 } } +// { dg-options "" } + +template struct B; +template struct B { int b; }; +template struct C { B<0, E...> c; C (C &) = default; C (C &&); }; +template struct tuple_size; +template <> struct tuple_size { static constexpr int value = 1; }; +template struct tuple_element; +template +struct tuple_element<0, C > { typedef int type; }; +template +int & (C &&); + +int +foo (C t) +{ + auto[x0] = t;// { dg-warning "decomposition declaration only available with" "" { target c++14_down } } + return x0; +} --- gcc/testsuite/g++.dg/cpp1z/decomp23.C.jj2017-01-24 10:50:47.623266415 +0100 +++ gcc/testsuite/g++.dg/cpp1z/decomp23.C 2017-01-24 10:58:25.816108949 +0100 @@ -0,0 +1,12 @@ +// PR c++/79205 +// { dg-do compile { target c++11 } } +// { dg-options "" } + +#include + +int +foo (std::tuple t) +{ + auto [x0] = t; // { dg-warning "decomposition declaration only available with" "" { target c++14_down } } + return x0; +} Jakub
Re: [PATCH][testsuite] Require shared effective target for some lto.exp tests
On Jan 24, 2017, at 6:16 AM, Kyrill Tkachovwrote: > > The tests in this patch fail for me on aarch64-none-elf with: > relocation R_AARCH64_ADR_PREL_PG_HI21 against external symbol `_impure_ptr' > can not be used when making a shared object; recompile with -fPIC > > I believe since the tests pass -shared to the linker they should be gated on > the 'shared' effective target? > With this patch these tests appear as UNSUPPORTED on aarch64-none-elf rather > than FAILing. > > Ok for trunk? Ok.
Re: [libcc1] add support for C++
On Jan 24, 2017, Jason Merrillwrote: > On Mon, Jan 23, 2017 at 7:21 PM, Alexandre Oliva wrote: >> On Jan 23, 2017, Jason Merrill wrote: >> >> + If the newly-created namespace is to be an inline namespace, after >> + push_namespace, get the nested namespace decl with >> + get_current_binding_level, pop back to the enclosing namespace, >> + call using_namespace with INLINE_P, and then push to the inline >> + namespace again. */ > This seems like unnecessary contortions to expect of the caller > (i.e. GDB). Let's add a new front end function to handle this. Heh, it's really nothing compared with some other contortions required to navigate binding levels, e.g. to reenter a function scope. I didn't want to add redundant functionality to the libcc1 API, and we'd need the ability to separate the introduction or reentry in a namespace, from a using directive with attribute strong. And since inline namespaces use the same implementation as that, I figured it made sense to use the same interface for both. >> Besides, push_namespace is called a lot more often than using_namespace, so it makes sense to keep it simpler, and leave the extra parameter for the less common operation. Another argument to keep things as they are is that the inline-ness of a namespace is not a property related with entering the namespace, but rather with how names from it are brought into the other namespace. >> >>> Well, it's a property of the namespace. >> >> Even in the case of 'strong' using directives? I didn't think so. > 'strong' using directives should really go away, they were a brief > stop on the way to inline namespaces. We really shouldn't introduce > more functionality based on them. I'll add a deprecated warning > now... Ok, but where does that leave us WRT libcc1? Should we make it impossible for GDB to convey this construct to libcc1 if it finds it in an existing program? (if it is represented in debug information at all) If we make it so, then a new entry point that makes the current namespace inline should fit the bill. Then it would always make it inline to the immediately enclosing namespace. >> For operator functions, set GCC_CP_FLAG_SPECIAL_FUNCTION in >> + SYM_KIND, in addition to any other applicable flags, and pass as >> + NAME a string starting with the two-character mangling for operator >> + name > Perhaps the compiler side should handle this transparently? Sorry, I don't understand what you're suggesting. Can you please elaborate? >> >>> I mean have the compiler recognize that the name is magic and infer >>> that it's a special function. >> >> I'd rather not have accidental namespace collisions: there are several >> layers with different conventions at play, so I decided to mark such >> things explicitly so as to be able to catch misuses. Consider that we >> have GCC's internal identifier representation, mangled identifier >> representation, names exported in debug information, and GDB's internal >> representation. When we get a special name, we want to be sure there >> was not a failure to convert between one's internal representation and >> libcc1's conventions. That's what motivated me to mark them explicitly. > It sounds like you're saying that you want the user to provide the > same information two different ways in order to make sure they're > getting it right. That seems kind of cumbersome. I can't see it that way. The information is not provided in two different ways. Consider, for example, the string used to denote constructors. It's perfectly legitimate to name a member function "C4", but we also use this representation to name constructors, so we need at least an additional bit to tell which space the name belongs to. We could make the encoding part of the string ('_' prefixes, mangling, whatever), but I thought it was better to leave the names unmangled, and use a separate bit flag to "escape" the names into the special functions space. OF course we could have gone with other conventions. One I considered was looking for blanks in the names: they can't appear in regular names, and we could mandate them to be present in special function names. In the end, I went for the separate bit. >> Adopting the name mangling conventions, rather than some enumeration or >> so, is justified by the need for some arbitrary convention, and reusing >> an existing one (that is NOT present in debug information names) > Isn't it present in DW_AT_linkage_name? It's part of the mangled name, yes, but you'd have to look for it. It's not something you'd get to by accident or mistake. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin
Re: [PATCH, rs6000] Fix for entries in table of overloaded built-in functions
On Tue, Jan 24, 2017 at 10:09:23AM -0800, Carl E. Love wrote: > > Do we need a separate testcase to check for this? Or do those specific > > builtins need better testcases? Or was the bug obvious already? [ ... ] > Once the bug for the ALTIVEC_BUILTIN_VEC_PACKS built-in was found, I > wrote a perl script to scan through the entire table looking for the > issue with any other built-in functions. I think we should have the compiler itself do this check, with flag_checking enabled or similar. It is too easy to get this wrong it seems :-( > Please let me know if you want me to go ahead with the adding the vec_packs() > test cases > to the patch or not. Again, not so sure about the vec_vgbbd test cases. If you have tests ready, then sure, add them please (in a separate patch is fine). Segher
Re: [PATCH] BRIG frontend: request for a global review
On Tue, 2017-01-24 at 15:30 -0500, David Malcolm wrote: > On Tue, 2017-01-24 at 13:52 +0100, Martin Jambor wrote: > > Hi, > > > > On Mon, Jan 23, 2017 at 02:11:37PM +0100, Richard Biener wrote: > > > On Mon, Jan 23, 2017 at 1:02 PM, Martin Jambor> > > wrote: > > > > Hi, > > > > > > > > > > > > On Mon, Jan 23, 2017 at 12:56:13PM +0100, Richard Biener wrote: > > > > > On Fri, Jan 20, 2017 at 6:25 PM, Pekka Jääskeläinen < > > > > > pe...@parmance.com> wrote: > > > > > > Hi Richard, > > > > > > > > > > > > On Fri, Jan 20, 2017 at 10:26 AM, Richard Biener > > > > > > wrote: > > > > > > > So the #ifdef ENABLE_BRIG_FE shouldn't be needed anymore > > > > > > > (nor the > > > > > > > configury for it). > > > > > > > > > > > > > > Otherwise this looks ok to me then. > > > > > > > > > > > > Attached is a patch set with that unnecessary > > > > > > ENABLE_BRIG_FE > > > > > > macro > > > > > > removed. It has also been refreshed to the latest trunk. > > > > > > > > > > Are you set up to commit these to trunk yourself or do you > > > > > need > > > > > help here? > > > > > > > > > > > > > Pekka is still in the process of requesting an account at > > > > gcc.gnu.org. > > > > We agreed that I would commit the patches. > > > > > > > > Over Saturday and today morning, I have bootstrappet and tested > > > > them > > > > in various configurations (although mostly on x86_64 and only a > > > > little > > > > bit on aarch64) and (together with a "svn mv > > > > libgomp/plugin/hsa.h > > > > include/") I have not found any issue so far. > > > > > > > > I suppose I should go ahead (in any case, if there is any > > > > fallout, > > > > reach out to me too, at least until Pekka gets his account). > > > > > > Yes, sounds good to me. > > > > > > > It took me more time because I got distracted by PR 79198 and did > > not > > want to commit another huge thing while bootstrap was still broken > > for > > so many people with my previous patch. Nevertheless, I have just > > committed the BRIG FE as revision 244867. > > > > I tried to be careful (as well as quick to avoid conflicts) so I > > hope > > there are no issues. However if there are some, throw complaints > > my > > way at least unless Pekka gets an account at gcc.gnu.org. > > > > Thanks, > > A deps issue for the docs I noticed when glancing through the commit: > > diff --git a/gcc/brig/Make-lang.in b/gcc/brig/Make-lang.in > new file mode 100644 (file) > index 000..b85b1b0 > --- /dev/null > +++ b/gcc/brig/Make-lang.in > > [...snip...] > > +# Documentation. > + > +GO_TEXI_FILES = \ > + brig/gccbrig.texi \ > + $(gcc_docdir)/include/fdl.texi \ > + $(gcc_docdir)/include/gpl_v3.texi \ > + $(gcc_docdir)/include/gcc-common.texi \ > + gcc-vers.texi > > Presumably this should be BRIG_TEXI_FILES, rather than GO_TEXI_FILES? > > +# doc/gccbrig.info: $(BRIG_TEXI_FILES) > +# if test "x$(BUILD_INFO)" = xinfo; then \ > +#rm -f doc/gccbrig.info*; \ > +#$(MAKEINFO) $(MAKEINFOFLAGS) -I $(gcc_docdir) \ > +# -I $(gcc_docdir)/include -o $@ $<; \ > +# else true; fi > + > +# doc/gccbrig.dvi: $(BRIG_TEXI_FILES) > +# $(TEXI2DVI) -I $(abs_docdir) -I $(abs_docdir)/include -o $@ > $< > + > +# doc/gccbrig.pdf: $(BRIG_TEXI_FILES) > +# $(TEXI2PDF) -I $(abs_docdir) -I $(abs_docdir)/include -o $@ > $< > + > +$(build_htmldir)/brig/index.html: $(BRIG_TEXI_FILES) > + $(mkinstalldirs) $(@D) > + rm -f $(@D)/* > + $(TEXI2HTML) -I $(gcc_docdir) -I $(gcc_docdir)/include \ > + -I $(srcdir)/brig -o $(@D) $< > > ...for use in describing the deps of the above. > > Dave Also: grepping for "Copyright" in the commit: the new files mostly say: + Copyright (C) 2016 Free Software Foundation, Inc. but I see at least one: + Copyright (C) 2015-2016 Free Software Foundation, Inc. and some others. Should the copyright be updated to cover 2017?
Re: [PATCH] BRIG frontend: request for a global review
On Tue, 2017-01-24 at 13:52 +0100, Martin Jambor wrote: > Hi, > > On Mon, Jan 23, 2017 at 02:11:37PM +0100, Richard Biener wrote: > > On Mon, Jan 23, 2017 at 1:02 PM, Martin Jambor> > wrote: > > > Hi, > > > > > > > > > On Mon, Jan 23, 2017 at 12:56:13PM +0100, Richard Biener wrote: > > > > On Fri, Jan 20, 2017 at 6:25 PM, Pekka Jääskeläinen < > > > > pe...@parmance.com> wrote: > > > > > Hi Richard, > > > > > > > > > > On Fri, Jan 20, 2017 at 10:26 AM, Richard Biener > > > > > wrote: > > > > > > So the #ifdef ENABLE_BRIG_FE shouldn't be needed anymore > > > > > > (nor the > > > > > > configury for it). > > > > > > > > > > > > Otherwise this looks ok to me then. > > > > > > > > > > Attached is a patch set with that unnecessary ENABLE_BRIG_FE > > > > > macro > > > > > removed. It has also been refreshed to the latest trunk. > > > > > > > > Are you set up to commit these to trunk yourself or do you need > > > > help here? > > > > > > > > > > Pekka is still in the process of requesting an account at > > > gcc.gnu.org. > > > We agreed that I would commit the patches. > > > > > > Over Saturday and today morning, I have bootstrappet and tested > > > them > > > in various configurations (although mostly on x86_64 and only a > > > little > > > bit on aarch64) and (together with a "svn mv libgomp/plugin/hsa.h > > > include/") I have not found any issue so far. > > > > > > I suppose I should go ahead (in any case, if there is any > > > fallout, > > > reach out to me too, at least until Pekka gets his account). > > > > Yes, sounds good to me. > > > > It took me more time because I got distracted by PR 79198 and did not > want to commit another huge thing while bootstrap was still broken > for > so many people with my previous patch. Nevertheless, I have just > committed the BRIG FE as revision 244867. > > I tried to be careful (as well as quick to avoid conflicts) so I hope > there are no issues. However if there are some, throw complaints my > way at least unless Pekka gets an account at gcc.gnu.org. > > Thanks, A deps issue for the docs I noticed when glancing through the commit: diff --git a/gcc/brig/Make-lang.in b/gcc/brig/Make-lang.in new file mode 100644 (file) index 000..b85b1b0 --- /dev/null +++ b/gcc/brig/Make-lang.in [...snip...] +# Documentation. + +GO_TEXI_FILES = \ + brig/gccbrig.texi \ + $(gcc_docdir)/include/fdl.texi \ + $(gcc_docdir)/include/gpl_v3.texi \ + $(gcc_docdir)/include/gcc-common.texi \ + gcc-vers.texi Presumably this should be BRIG_TEXI_FILES, rather than GO_TEXI_FILES? +# doc/gccbrig.info: $(BRIG_TEXI_FILES) +# if test "x$(BUILD_INFO)" = xinfo; then \ +#rm -f doc/gccbrig.info*; \ +#$(MAKEINFO) $(MAKEINFOFLAGS) -I $(gcc_docdir) \ +# -I $(gcc_docdir)/include -o $@ $<; \ +# else true; fi + +# doc/gccbrig.dvi: $(BRIG_TEXI_FILES) +# $(TEXI2DVI) -I $(abs_docdir) -I $(abs_docdir)/include -o $@ $< + +# doc/gccbrig.pdf: $(BRIG_TEXI_FILES) +# $(TEXI2PDF) -I $(abs_docdir) -I $(abs_docdir)/include -o $@ $< + +$(build_htmldir)/brig/index.html: $(BRIG_TEXI_FILES) + $(mkinstalldirs) $(@D) + rm -f $(@D)/* + $(TEXI2HTML) -I $(gcc_docdir) -I $(gcc_docdir)/include \ + -I $(srcdir)/brig -o $(@D) $< ...for use in describing the deps of the above. Dave
Re: [PATCH 3/6] RISC-V Port: libgcc
On Fri, Jan 20, 2017 at 10:53 PM, Richard Hendersonwrote: > On 01/11/2017 06:30 PM, Palmer Dabbelt wrote: >> >> +__riscv_save_12: >> + addi sp, sp, -112 >> + li t1, 0 >> + sd s11, 8(sp) >> + j .Ls10 >> + >> +__riscv_save_11: >> +__riscv_save_10: >> + addi sp, sp, -112 >> + li t1, -16 > > > No unwind info? Next patch set will have unwind info. > > > r~
[wwwdocs] changes.html - document new warning options
Attached is an update documenting a number of options new or updated in GCC 7. Aldy, if/when you have a minute can you please quickly look over the -Waloca text to make sure I didn't miss something? -Walloca -Walloca-larger-than -Walloc-size-larger-than -Walloc-zero -Wformat-overflow -Wformat-truncation -Wnonnull -Wstringop-overflow Thanks Martin Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v retrieving revision 1.39 diff -u -r1.39 changes.html --- changes.html 17 Jan 2017 21:26:31 - 1.39 +++ changes.html 24 Jan 2017 19:57:27 - @@ -40,11 +40,15 @@ General Optimizer Improvements - New Languages and Language specific improvements @@ -186,6 +190,125 @@ enum operation { add, count }; ^ +GCC 7 contains a number of enhancements that help detect buffer overflow + and other forms of invalid memory accesses. + +The -Walloc-size-larger-than=size option + detects calls to standard and user-defined memory allocation + functions decorated with attribute alloc_size whose + argument exceeds the specified size + (PTRDIFF_MAX by default). The option also detects + arithmetic overflow in the computation of the size in two-argument + allocation functions like calloc where the total size + is the product of the two arguments. Since calls with an excessive + size cannot succeed they are typically the result of programing + errors. Such bugs have been known to be the source of + vulnerabilities and a target of exploits. + -Walloc-size-larger-than=PTRDIFF_MAX is included + in -Wall. + For example, the following call to malloc incorrectly tries + to avoid passing a negative argument to the function and instead ends + up unconditionally invoking it with an argument less than or equal + to zero. Since after conversion to the type of the argument of the + function (size_t) a negative argument results in a value + in excess of the maximum PTRDIFF_MAX the call is diagnosed. + +void* f (int n) +{ + return malloc (n > 0 ? 0 : n); +} + +warning: argument 1 range [2147483648, 4294967295] exceeds maximum object size 2147483647 [-Walloc-size-larger-than=] +The -Walloc-zero option detects calls to standard + and user-defined memory allocation functions decorated with attribute + alloc_size with a zero argument. -Walloc-zero + is not included in either -Wall or -Wextra + and must be explicitly enabled. +The -Walloca option detects all calls to the + alloca function in the program. -Walloca + is not included in either -Wall or -Wextra + and must be explicitly enabled. +The -Walloca-larger-than=size option detects + calls to the alloca function whose argument may exceed + the specified size. + -Walloca-larger-than is not included in either + -Wall or -Wextra and must be explicitly + enabled. + For example, compiling the following snippet with + -Walloca-larger-than=1024 results in a warning because + even though the code appears to call alloca only with + sizes of 1kb and less, since n is signed, a negative + value would result in a call to the function well in excess of + the limit. + +void f (int n) +{ + char *d; + if (n < 1025) +d = alloca (n); + else +d = malloc (n); + +} + +warning: argument to 'alloca may be too large due to conversion from 'int' to 'long unsigned int' [-Walloca-larger-than=] +The -Wformat-overflow=level option detects certain + and likely buffer overflow in calls to the sprintf family + of formatted output functions. Although the option is enabled even + without optimization it works best with -O2 and higher. + For example, in the following snippet the call to + sprintf is diagnosed because even though its + output has been constrained using the modulo operation it could + result in as many as three bytes if mday were negative. + The solution is to either allocate a larger buffer or make sure + the argument is not negative, for example by changing mday's + type to unsigned or by making the type of the second operand + of the modulo expression to unsigned: 100U. + +void* f (int mday) +{ + char *buf = malloc (3); + sprintf (buf, "%02i", mday % 100); + return buf; +} + +warning: 'sprintf may write a terminating nul past the end of the destination [-Wformat-overflow=] +note: 'sprintf' output between 3 and 4 bytes into a destination of size 3 +The -Wformat-truncation=level option detects + certain and likely output truncation in calls to the + snprintf family of formatted output functions. + -Wformat-overflow=1 is included in + -Wall and enabled without optimization but works best + with -O2 and higher. +The -Wnonnull option has been enhanced to detect + a broader set of cases of passing null pointers to functions that + expect a non-null argument (those decorated with attribute + nonnull). By taking
Re: [libcc1] add support for C++
On Mon, Jan 23, 2017 at 7:21 PM, Alexandre Olivawrote: > On Jan 23, 2017, Jason Merrill wrote: > > + If the newly-created namespace is to be an inline namespace, after > + push_namespace, get the nested namespace decl with > + get_current_binding_level, pop back to the enclosing namespace, > + call using_namespace with INLINE_P, and then push to the inline > + namespace again. */ >>> This seems like unnecessary contortions to expect of the caller (i.e. GDB). Let's add a new front end function to handle this. >>> >>> Heh, it's really nothing compared with some other contortions required >>> to navigate binding levels, e.g. to reenter a function scope. >>> >>> I didn't want to add redundant functionality to the libcc1 API, and we'd >>> need the ability to separate the introduction or reentry in a namespace, >>> from a using directive with attribute strong. And since inline >>> namespaces use the same implementation as that, I figured it made sense >>> to use the same interface for both. > >>> Besides, push_namespace is called a lot more often than using_namespace, >>> so it makes sense to keep it simpler, and leave the extra parameter for >>> the less common operation. >>> >>> Another argument to keep things as they are is that the inline-ness of a >>> namespace is not a property related with entering the namespace, but >>> rather with how names from it are brought into the other namespace. > >> Well, it's a property of the namespace. > > Even in the case of 'strong' using directives? I didn't think so. 'strong' using directives should really go away, they were a brief stop on the way to inline namespaces. We really shouldn't introduce more functionality based on them. I'll add a deprecated warning now... > +/* Pop the namespace last entered with push_namespace, or class last > + entered with push_class, or function last entered with > + push_function, restoring the binding level in effect before the > + matching push_* call. */ > + > +GCC_METHOD0 (int /* bool */, pop_namespace) >>> This should have a different name, perhaps pop_last_entered_scope? >>> >>> It was introduced very early on, long before the need for exposing >>> function scopes was realized and implemented. GDB (branch) already had >>> plenty of code using this primitive, so I didn't want to rename it, but >>> if you insist in such spelling changes, I won't mind, and I guess GDB >>> folks won't even. It's not too late yet ;-) > >> Please. > > How about just pop_binding_level, to align it with the nomenclature used > in the comments? (I'd have suggested just pop_scope otherwise) Sure. > + The TARGET decl names the qualifying scope (foo:: above) and the > + identifier (bar), but that does not mean that only TARGET will be > + brought into the current scope: all bindings of TARGET's identifier > + in the qualifying scope will be brought in. >>> This seems wrong; for namespace-scope using-declarations, only the declarations in scope at the point of the using-declaration are imported. Since DWARF represents each imported declaration individually, I would think that we would want the plugin to import them one at a time. >>> >>> What you say is true, but GCC doesn't offer functionality for that >>> AFAICT, and the caller can always arrange for only the bindings that >>> should be brought in by the using declarations to be defined at the time >>> the using declaration is introduced, so I left it at that so as to >>> minimize the changes to GCC proper. > >> Fair enough, but then why pass a decl? Passing scope and identifier >> would seem cleaner if that's what you mean. > > Oh, hysterical raisins IIRC. 'scope' was not so easy to pass. The API > makes gcc_decl and gcc_type distinct, unrelated types and, although > clients would get gcc_decls for functions and (when needed) namespaces, > they'd only get gcc_types for classes. Later on, start_class_definition > was split so as to make forward declarations possible, and clients now > necessarily get a gcc_decl for a class. (IIRC there weren't > get_current_binding_level or type_decl either at that point) > > Now that the decl for a class is easily available, it makes some sense > ot change the API to take a decl to specify the scope, but then a string > might not be enough to name a member: operators are kind of a separate > namespace, and we'd have to add a flag or some other means to denote > them. It can be done, but it feels like a lot of pain for the debatable > benefit of introducing additional possibilities of erroneous uses, such > as passing invalid contexts and whatnot (besides the minor inconvenience > of having to convert classes from decls to types, as DECL_CONTEXT > wants). > > OTOH, passing a context explicitly, even if optional, and using a decl > to convey name (sidestepping the operator name encoding
Re: [PR c++/78469] default ctor makes dtor required
OK. On Tue, Jan 24, 2017 at 2:17 PM, Nathan Sidwellwrote: > On 01/24/2017 11:36 AM, Jason Merrill wrote: > >> This is the wrong place for this; we don't know at this point whether >> we're in a new-expression or actually creating a temporary. I think we >> want to add this flag in the call to build_value_init from build_new_1. >> And look at other calls to build_value_init to see if they want it as >> well. > > > Oh yeah, need to set the flag further up the all chain. I couldn't see > other build_value_init calls that needed it set. > > nathan > > -- > Nathan Sidwell
Re: [C++ PATCH] 79118 bitfields & constexpr
OK. On Tue, Jan 24, 2017 at 1:53 PM, Nathan Sidwellwrote: > On 01/24/2017 01:41 PM, Jason Merrill wrote: >> >> On Tue, Jan 24, 2017 at 1:31 PM, Nathan Sidwell wrote: >>> >>> On 01/23/2017 04:06 PM, Jason Merrill wrote: >>> > In this particular case we've also made the base object the containing > class, not the unnamed struct member. That means we're looking in the > wrong > CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true. Whereas for > the > subobj's CONSTRUCTOR it is false. Why is it false? >>> >>> >>> >>> I thought it was because we're looking at a different level of ctor, >>> investigation shows there may be a bug there too. because in one place we >>> do: >>> if (*valp == NULL_TREE) >>> { >>> *valp = build_constructor (type, NULL); >>> CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) = no_zero_init; >>> and another we do: >>> if (vec_safe_is_empty (*vec) >>> || (*vec)->last().index != field) >>> { >>> ctor = build_constructor (TREE_TYPE (field), NULL); >>> CONSTRUCTOR_APPEND_ELT (*vec, field, ctor); >>> >>> However, further looking at this problem, I discovered we're not properly >>> checking the initialization of anonymous members. Once we do that, we >>> reject the ctor as a constexpr, because it fails to initialize the 'type' >>> member (regardless of bitfieldness). >>> >>> This patch augments cx_check_missing_mem_inits. I change the first parm >>> to >>> be the CTYPE not the FUN from whence we pull the CTYPE. That way we >>> don't >>> have to cons up an empty CONSTRUCTOR for the recursive case of >>> discovering >>> no initializer at all for the anon member. >>> >>> With this in place we don't try and evaluate the constexpr in the >>> original >>> testcase. >>> >>> ok? >> >> >> I'm not seeing the patch. > > > d'oh! > > > > -- > Nathan Sidwell
Re: [PR c++/78469] default ctor makes dtor required
On 01/24/2017 11:36 AM, Jason Merrill wrote: This is the wrong place for this; we don't know at this point whether we're in a new-expression or actually creating a temporary. I think we want to add this flag in the call to build_value_init from build_new_1. And look at other calls to build_value_init to see if they want it as well. Oh yeah, need to set the flag further up the all chain. I couldn't see other build_value_init calls that needed it set. nathan -- Nathan Sidwell 2017-01-24 Nathan SidwellPR c++/78469 - defaulted ctor and inaccessible dtor * cp-tree.h (tsubst_flags): Add tf_no_cleanup. * init.c (build_new_1): Pass tf_no_cleanup to build_value_init. * tree.c (build_target_expr): Check tf_no_cleanup. PR c++/78469 * g++.dg/cpp0x/pr78469.C: New. Index: cp/cp-tree.h === --- cp/cp-tree.h (revision 244880) +++ cp/cp-tree.h (working copy) @@ -4786,6 +4786,8 @@ enum tsubst_flags { substitution in fn_type_unification. */ tf_fndecl_type = 1 << 9, /* Substituting the type of a function declaration. */ + tf_no_cleanup = 1 << 10, /* Do not build a cleanup +(build_target_expr and friends) */ /* Convenient substitution flags combinations. */ tf_warning_or_error = tf_warning | tf_error }; Index: cp/init.c === --- cp/init.c (revision 244880) +++ cp/init.c (working copy) @@ -3262,8 +3262,10 @@ build_new_1 (vec **placemen } else if (explicit_value_init_p) { - /* Something like `new int()'. */ - tree val = build_value_init (type, complain); + /* Something like `new int()'. NO_CLEANUP is needed so + we don't try and build a (possibly ill-formed) + destructor. */ + tree val = build_value_init (type, complain | tf_no_cleanup); if (val == error_mark_node) return error_mark_node; init_expr = build2 (INIT_EXPR, type, init_expr, val); Index: cp/tree.c === --- cp/tree.c (revision 244880) +++ cp/tree.c (working copy) @@ -404,9 +404,15 @@ build_target_expr (tree decl, tree value || useless_type_conversion_p (TREE_TYPE (decl), TREE_TYPE (value))); - t = cxx_maybe_build_cleanup (decl, complain); - if (t == error_mark_node) -return error_mark_node; + if (complain & tf_no_cleanup) +/* The caller is building a new-expr and does not need a cleanup. */ +t = NULL_TREE; + else +{ + t = cxx_maybe_build_cleanup (decl, complain); + if (t == error_mark_node) + return error_mark_node; +} t = build4 (TARGET_EXPR, type, decl, value, t, NULL_TREE); if (EXPR_HAS_LOCATION (value)) SET_EXPR_LOCATION (t, EXPR_LOCATION (value)); Index: testsuite/g++.dg/cpp0x/pr78469.C === --- testsuite/g++.dg/cpp0x/pr78469.C (revision 0) +++ testsuite/g++.dg/cpp0x/pr78469.C (working copy) @@ -0,0 +1,14 @@ +// { dg-do compile { target c++11 } } +// PR78469, bogus error about inaccessible dtor. + +struct no_destr { + no_destr() = default; + +protected: + ~no_destr() = default; +}; + +void *Foo () +{ + return new no_destr (); +}
Re: [C++ PATCH] 79118 bitfields & constexpr
On 01/24/2017 01:41 PM, Jason Merrill wrote: On Tue, Jan 24, 2017 at 1:31 PM, Nathan Sidwellwrote: On 01/23/2017 04:06 PM, Jason Merrill wrote: In this particular case we've also made the base object the containing class, not the unnamed struct member. That means we're looking in the wrong CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true. Whereas for the subobj's CONSTRUCTOR it is false. Why is it false? I thought it was because we're looking at a different level of ctor, investigation shows there may be a bug there too. because in one place we do: if (*valp == NULL_TREE) { *valp = build_constructor (type, NULL); CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) = no_zero_init; and another we do: if (vec_safe_is_empty (*vec) || (*vec)->last().index != field) { ctor = build_constructor (TREE_TYPE (field), NULL); CONSTRUCTOR_APPEND_ELT (*vec, field, ctor); However, further looking at this problem, I discovered we're not properly checking the initialization of anonymous members. Once we do that, we reject the ctor as a constexpr, because it fails to initialize the 'type' member (regardless of bitfieldness). This patch augments cx_check_missing_mem_inits. I change the first parm to be the CTYPE not the FUN from whence we pull the CTYPE. That way we don't have to cons up an empty CONSTRUCTOR for the recursive case of discovering no initializer at all for the anon member. With this in place we don't try and evaluate the constexpr in the original testcase. ok? I'm not seeing the patch. d'oh! -- Nathan Sidwell 2017-01-24 Nathan Sidwell PR c++/79118 - anon-members and constexpr * constexpr.c (cx_check_missing_mem_inits): Caller passes type not ctor decl. Recursively check anonymous members. (register_constexpr_fundef): Adjust cx_check_missing_mem_inits call. (explain_invalid_constexpr_fn): Likewise. PR c++/79118 * g++.dg/cpp0x/pr79118.C: New. Index: cp/constexpr.c === --- cp/constexpr.c (revision 244874) +++ cp/constexpr.c (working copy) @@ -696,23 +696,21 @@ massage_constexpr_body (tree fun, tree b return body; } -/* FUN is a constexpr constructor with massaged body BODY. Return true - if some bases/fields are uninitialized, and complain if COMPLAIN. */ +/* CTYPE is a type constructed from BODY. Return true if some + bases/fields are uninitialized, and complain if COMPLAIN. */ static bool -cx_check_missing_mem_inits (tree fun, tree body, bool complain) +cx_check_missing_mem_inits (tree ctype, tree body, bool complain) { - bool bad; - tree field; - unsigned i, nelts; - tree ctype; - - if (TREE_CODE (body) != CONSTRUCTOR) -return false; - - nelts = CONSTRUCTOR_NELTS (body); - ctype = DECL_CONTEXT (fun); - field = TYPE_FIELDS (ctype); + unsigned nelts = 0; + + if (body) +{ + if (TREE_CODE (body) != CONSTRUCTOR) + return false; + nelts = CONSTRUCTOR_NELTS (body); +} + tree field = TYPE_FIELDS (ctype); if (TREE_CODE (ctype) == UNION_TYPE) { @@ -726,13 +724,13 @@ cx_check_missing_mem_inits (tree fun, tr return false; } - bad = false; - for (i = 0; i <= nelts; ++i) + /* Iterate over the CONSTRUCTOR, checking any missing fields don't + need an explicit initialization. */ + bool bad = false; + for (unsigned i = 0; i <= nelts; ++i) { - tree index; - if (i == nelts) - index = NULL_TREE; - else + tree index = NULL_TREE; + if (i < nelts) { index = CONSTRUCTOR_ELT (body, i)->index; /* Skip base and vtable inits. */ @@ -740,13 +738,25 @@ cx_check_missing_mem_inits (tree fun, tr || DECL_ARTIFICIAL (index)) continue; } + for (; field != index; field = DECL_CHAIN (field)) { tree ftype; - if (TREE_CODE (field) != FIELD_DECL - || (DECL_C_BIT_FIELD (field) && !DECL_NAME (field)) - || DECL_ARTIFICIAL (field)) + if (TREE_CODE (field) != FIELD_DECL) continue; + if (DECL_C_BIT_FIELD (field) && !DECL_NAME (field)) + continue; + if (DECL_ARTIFICIAL (field)) + continue; + if (ANON_AGGR_TYPE_P (TREE_TYPE (field))) + { + /* Recurse to check the anonummous aggregate member. */ + bad |= cx_check_missing_mem_inits + (TREE_TYPE (field), NULL_TREE, complain); + if (bad && !complain) + return true; + continue; + } ftype = strip_array_types (TREE_TYPE (field)); if (type_has_constexpr_default_constructor (ftype)) { @@ -766,6 +776,15 @@ cx_check_missing_mem_inits (tree fun, tr } if (field == NULL_TREE) break; + + if (ANON_AGGR_TYPE_P (TREE_TYPE (index))) + { + /* Check the anonymous aggregate initializer is valid. */ + bad |= cx_check_missing_mem_inits + (TREE_TYPE (index), CONSTRUCTOR_ELT (body, i)->value, complain); + if (bad && !complain) + return true; + } field =
Re: C++ Modules branch
Woo! On Tue, Jan 24, 2017 at 8:05 AM, Nathan Sidwellwrote: > As some have already noticed, I created a c++-modules branch yesterday. > Don't get too excited, that doesn't mean I have an implementation to commit > there. > > I expect several false starts, and don't intend to post patches here. > > Wiki page is https://gcc.gnu.org/wiki/cxx-modules > > nathan > -- > Nathan Sidwell
Re: [C++ PATCH] 79118 bitfields & constexpr
On Tue, Jan 24, 2017 at 1:31 PM, Nathan Sidwellwrote: > On 01/23/2017 04:06 PM, Jason Merrill wrote: > >>> In this particular case we've also made the base object the containing >>> class, not the unnamed struct member. That means we're looking in the >>> wrong >>> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true. Whereas for >>> the >>> subobj's CONSTRUCTOR it is false. >> >> >> Why is it false? > > > I thought it was because we're looking at a different level of ctor, > investigation shows there may be a bug there too. because in one place we > do: > if (*valp == NULL_TREE) > { > *valp = build_constructor (type, NULL); > CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) = no_zero_init; > and another we do: > if (vec_safe_is_empty (*vec) > || (*vec)->last().index != field) > { > ctor = build_constructor (TREE_TYPE (field), NULL); > CONSTRUCTOR_APPEND_ELT (*vec, field, ctor); > > However, further looking at this problem, I discovered we're not properly > checking the initialization of anonymous members. Once we do that, we > reject the ctor as a constexpr, because it fails to initialize the 'type' > member (regardless of bitfieldness). > > This patch augments cx_check_missing_mem_inits. I change the first parm to > be the CTYPE not the FUN from whence we pull the CTYPE. That way we don't > have to cons up an empty CONSTRUCTOR for the recursive case of discovering > no initializer at all for the anon member. > > With this in place we don't try and evaluate the constexpr in the original > testcase. > > ok? I'm not seeing the patch. Jason
[PATCH] PR libstdc++/79190 add fallback aligned_alloc implementation
This provides a definition of aligned_alloc for targets which have none of aligned_alloc, posix_memalign, memalign or _aligned_alloc. The code is based on gcc/config/i386/gmm_malloc.h but modified to only use sizeof(void*) not 2*sizeof(void*), because I don't understand why that's used in gmm_malloc.h, or what the comment means. libstdc++-v3: PR libstdc++/79190 * libsupc++/del_opa.cc (operator delete(void*, std::align_val_t)) [!_GLIBCXX_HAVE_ALIGNED_ALLOC && !_GLIBCXX_HAVE_POSIX_MEMALIGN && !_GLIBCXX_HAVE_MEMALIGN && !_GLIBCXX_HAVE__ALIGNED_MALLOC]: Retrieve original pointer value allocated by malloc. * libsupc++/new_opa.cc [!_GLIBCXX_HAVE_ALIGNED_ALLOC && !_GLIBCXX_HAVE_POSIX_MEMALIGN && !_GLIBCXX_HAVE_MEMALIGN && !_GLIBCXX_HAVE__ALIGNED_MALLOC] (aligned_alloc(size_t, size_t)): Define, adjusting pointer value allocated by malloc and storing for retrieval by operator delete. gcc/testsuite: PR libstdc++/79190 * g++.dg/cpp1z/aligned-new3.C: Replace operator new so behaviour matches replaced operator delete. Tested powerpc64le-linux, and also tested after tweaking the preprocessor checks so the new code was used. If this works for HPUX I plan to commit it to trunk. commit 2f17c7769b6917b6ab6884c87c4496e32d4cc057 Author: Jonathan WakelyDate: Tue Jan 24 16:17:15 2017 + PR libstdc++/79190 add fallback aligned_alloc implementation libstdc++-v3: PR libstdc++/79190 * libsupc++/del_opa.cc (operator delete(void*, std::align_val_t)) [!_GLIBCXX_HAVE_ALIGNED_ALLOC && !_GLIBCXX_HAVE_POSIX_MEMALIGN && !_GLIBCXX_HAVE_MEMALIGN && !_GLIBCXX_HAVE__ALIGNED_MALLOC]: Retrieve original pointer value allocated by malloc. * libsupc++/new_opa.cc [!_GLIBCXX_HAVE_ALIGNED_ALLOC && !_GLIBCXX_HAVE_POSIX_MEMALIGN && !_GLIBCXX_HAVE_MEMALIGN && !_GLIBCXX_HAVE__ALIGNED_MALLOC] (aligned_alloc(size_t, size_t)): Define, adjusting pointer value allocated by malloc and storing for retrieval by operator delete. gcc/testsuite: PR libstdc++/79190 * g++.dg/cpp1z/aligned-new3.C: Replace operator new so behaviour matches replaced operator delete. diff --git a/gcc/testsuite/g++.dg/cpp1z/aligned-new3.C b/gcc/testsuite/g++.dg/cpp1z/aligned-new3.C index 73e3343..e50e62c 100644 --- a/gcc/testsuite/g++.dg/cpp1z/aligned-new3.C +++ b/gcc/testsuite/g++.dg/cpp1z/aligned-new3.C @@ -7,6 +7,11 @@ struct alignas(64) A { int i; }; +void* operator new (std::size_t n, std::align_val_t) +{ + return operator new (n); +} + bool deleted = false; void operator delete (void *p, std::size_t, std::align_val_t) { diff --git a/libstdc++-v3/libsupc++/del_opa.cc b/libstdc++-v3/libsupc++/del_opa.cc index c9afb46..f7bf7a4 100644 --- a/libstdc++-v3/libsupc++/del_opa.cc +++ b/libstdc++-v3/libsupc++/del_opa.cc @@ -46,9 +46,13 @@ _GLIBCXX_END_NAMESPACE_VERSION _GLIBCXX_WEAK_DEFINITION void operator delete(void* ptr, std::align_val_t) _GLIBCXX_USE_NOEXCEPT { -#if !_GLIBCXX_HAVE_ALIGNED_ALLOC && _GLIBCXX_HAVE__ALIGNED_MALLOC +#if _GLIBCXX_HAVE_ALIGNED_ALLOC || _GLIBCXX_HAVE_POSIX_MEMALIGN \ +|| _GLIBCXX_HAVE_MEMALIGN + std::free (ptr); +#elif _GLIBCXX_HAVE__ALIGNED_MALLOC _aligned_free (ptr); #else - std::free(ptr); + if (ptr) +std::free (((void **) ptr)[-1]); // See aligned_alloc in new_opa.cc #endif } diff --git a/libstdc++-v3/libsupc++/new_opa.cc b/libstdc++-v3/libsupc++/new_opa.cc index 0bc2b5f..5d6c185 100644 --- a/libstdc++-v3/libsupc++/new_opa.cc +++ b/libstdc++-v3/libsupc++/new_opa.cc @@ -55,9 +55,30 @@ extern "C" void *memalign(std::size_t boundary, std::size_t size); #endif #define aligned_alloc memalign #else -// The C library doesn't provide any aligned allocation functions, declare -// aligned_alloc and get a link failure if aligned new is used. -extern "C" void *aligned_alloc(std::size_t, std::size_t); +// This is a modified version of code from gcc/config/i386/gmm_malloc.h +static inline void* +aligned_alloc (std::size_t al, std::size_t sz) +{ + // Alignment must be a power of two. + if (al & (al - 1)) +return nullptr; + else if (!sz) +return nullptr; + + // We need extra bytes to store the original value returned by malloc. + if (al < sizeof(void*)) +al = sizeof(void*); + void* const malloc_ptr = malloc(sz + al); + if (!malloc_ptr) +return nullptr; + // Align to the requested value, leaving room for the original malloc value. + void* const aligned_ptr = (void *) (((size_t) malloc_ptr + al) & -al); + + // Store the original malloc value where it can be found by operator delete. + ((void **) aligned_ptr)[-1] = malloc_ptr; + + return aligned_ptr; +} #endif #endif
Re: [PATCH 1/4] BRIG (HSAIL) frontend: configuration file changes and misc
Hi, On Tue, Jan 24, 2017 at 7:26 PM, Joseph Myerswrote: > Since front ends can't be enabled or disabled on a per-multilib basis, if > you want to support any x86/x86_64 GNU/Linux configuration you need to > support all of them here (and possibly disable runtime libraries for > particular multilibs). That is, this should be *more* general and also > allow i[[3456789]]86-*-linux* because that can have a -m64 multilib. OK. Patch attached. Does it look right now? BR, Pekka diff --git a/configure b/configure index d757369..b62d811 100755 --- a/configure +++ b/configure @@ -3487,6 +3487,8 @@ fi # broken systems. Currently it has been tested only on x86_64 Linux # of the upstream gcc targets. More targets shall be added after testing. case "${target}" in + i[3456789]86-*-linux*) +;; x86_64-*-linux*) ;; *) diff --git a/configure.ac b/configure.ac index 5818332..2df8e34 100644 --- a/configure.ac +++ b/configure.ac @@ -817,6 +817,8 @@ fi # broken systems. Currently it has been tested only on x86_64 Linux # of the upstream gcc targets. More targets shall be added after testing. case "${target}" in + i[[3456789]]86-*-linux*) +;; x86_64-*-linux*) ;; *)
Re: [C++ PATCH] 79118 bitfields & constexpr
On 01/23/2017 04:06 PM, Jason Merrill wrote: In this particular case we've also made the base object the containing class, not the unnamed struct member. That means we're looking in the wrong CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true. Whereas for the subobj's CONSTRUCTOR it is false. Why is it false? I thought it was because we're looking at a different level of ctor, investigation shows there may be a bug there too. because in one place we do: if (*valp == NULL_TREE) { *valp = build_constructor (type, NULL); CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) = no_zero_init; and another we do: if (vec_safe_is_empty (*vec) || (*vec)->last().index != field) { ctor = build_constructor (TREE_TYPE (field), NULL); CONSTRUCTOR_APPEND_ELT (*vec, field, ctor); However, further looking at this problem, I discovered we're not properly checking the initialization of anonymous members. Once we do that, we reject the ctor as a constexpr, because it fails to initialize the 'type' member (regardless of bitfieldness). This patch augments cx_check_missing_mem_inits. I change the first parm to be the CTYPE not the FUN from whence we pull the CTYPE. That way we don't have to cons up an empty CONSTRUCTOR for the recursive case of discovering no initializer at all for the anon member. With this in place we don't try and evaluate the constexpr in the original testcase. ok? nathan -- Nathan Sidwell
Re: [PATCH, rs6000] Fix for entries in table of overloaded built-in functions
On Tue, 2017-01-24 at 11:08 -0600, Segher Boessenkool wrote: > On Tue, Jan 24, 2017 at 08:28:37AM -0800, Carl E. Love wrote: > > The following patch fixes an issue with the entries in the table of > > built-in functions. All of the entries for a given built-in, must occur > > in the table as a single block of entries. Otherwise the code that > > searches the table for a given built-in definition will stop looking > > once it reaches the end of the initial block of definitions for that > > built-in function and subsequent definitions for that built-in will > > never be checked. This issue currently occurs with the > > ALTIVEC_BUILTIN_VEC_PACKS and P8V_BUILTIN_VEC_VGBBD built-ins. The > > patch simply moves the existing entries so the definition for a given > > built-in are all together in the same block of entries. > > Do we need a separate testcase to check for this? Or do those specific > builtins need better testcases? Or was the bug obvious already? I have a list of built-ins that need to have support and test cases added. I found the issue with the ALTIVEC_BUILTIN_VEC_PACKS when I tried to add support for the built-ins: vector signed int vec_packs (vector signed long long x, vector signed long long y); vector unsigned int vec_packs (vector unsigned long long x, vector unsigned long long y); which were in my to do list. What I found was the support for vec_packs is all there but I don't find any test cases for these built-ins. At this point, I do plan to add the vec_pack test cases as part of my work to add the support for the other built-ins on my list. I have the patch in my patch series with the others that need adding. Currently holding off on posting patches since we are only supposed to be posting bug fixes at the moment. Once the bug for the ALTIVEC_BUILTIN_VEC_PACKS built-in was found, I wrote a perl script to scan through the entire table looking for the issue with any other built-in functions. The script found the issue with the P8V_BUILTIN_VEC_VGBBD built-in. My list of built-ins to add doesn't include anything for vec_vgbbd. It would be easy for my to add the test cases for the vec_packs() built-ins to this patch if you would like? I just took a look at the vec_vgbbd() built-in. I grep'd for vgbbd and found the followint two testcases in gcc/testsuite/gcc.target/powerpc/p8vector-builtin-4.c: typedef vector signed charvc_sign; typedef vector unsigned char vc_uns; vc_sign vc_gbb_2 (vc_sign a) { return vec_vgbbd (a); } vc_uns vc_gbb_3 (vc_uns a) { return vec_vgbbd (a); } which correspond to the built-in entries in rs6000-c.c which I didn't move { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 }, { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 }, I don't see any tests for the two built-in entries in rs6000-c.c which the patch moves, i.e. { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, RS6000_BTI_V16QI, 0, 0, 0 }, { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, RS6000_BTI_unsigned_V16QI, 0, 0, 0 }, I tried a quick test of adding the following to the test file p8vector-builtin-4.c for these entries: vc_sign vc_gbb_4 (void) { return vec_vgbbd (); }
Re: [PATCH v2] aarch64: Add split-stack initial support
On 03/01/2017 13:13, Wilco Dijkstra wrote: > Adhemerval Zanella wrote: > > Sorry for the late reply - but I think it's getting there. A few more > comments: No worries. > > + /* If function uses stacked arguments save the old stack value so morestack > + can return it. */ > + reg11 = gen_rtx_REG (Pmode, R11_REGNUM); > + if (cfun->machine->frame.saved_regs_size > + || cfun->machine->frame.saved_varargs_size) > +emit_move_insn (reg11, stack_pointer_rtx); > > This doesn't look right - we could have many arguments even without varargs or > saved regs. This would need to check varargs as well as ctrl->args.size (I > believe > that is the size of the arguments on the stack). It's fine to omit this > optimization > in the first version - we already emit 2-3 extra instructions for the check > anyway. I will check for a better solution. > > > +void > +aarch64_split_stack_space_check (rtx size, rtx label) > { > + rtx mem, ssvalue, cc, cmp, jump, temp; > + rtx requested = gen_reg_rtx (Pmode); > + /* Offset from thread pointer to __private_ss. */ > + int psso = 0x10; > + > + /* Load __private_ss from TCB. */ > + ssvalue = gen_rtx_REG (Pmode, R9_REGNUM); > > ssvalue doesn't need to be a hardcoded register. Indeed, and it seems that this was not being triggered. I have fixed it in this version. > > + emit_insn (gen_aarch64_load_tp_hard (ssvalue)); > + mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, ssvalue, psso)); > + emit_move_insn (ssvalue, mem); > + > + temp = gen_rtx_REG (Pmode, R10_REGNUM); > + > + /* And compare it with frame pointer plus required stack. */ > + size = force_reg (Pmode, size); > + emit_move_insn (requested, gen_rtx_MINUS (Pmode, stack_pointer_rtx, size)); > + > + /* Jump to __morestack call if current __private_ss is not suffice. */ > + cc = aarch64_gen_compare_reg (LT, temp, ssvalue); > > This uses X10, but where is it set??? I fixed it on this version. > > + cmp = gen_rtx_fmt_ee (GEU, VOIDmode, cc, const0_rtx); > + jump = emit_jump_insn (gen_condjump (cmp, cc, label)); > + JUMP_LABEL (jump) = label; > +} > > So neither X10 nor X12 are set before potentially calling __morestack, so I > don't > think it will work. Could this be causing the crash you mentioned? I do not think so, the issue in with the runtime/pprof libgo test that fails with [Switching to LWP 18926] 0x0050c358 in runtime.sigtrampgo (sig=sig@entry=27, info=info@entry=0x7fb63d5da0, ctx=ctx@entry=0x7fb63d5e20) at ../../../gcc-git/libgo/go/runtime/signal_unix.go:221 221 setg(g.m.gsignal) Where g.m is null. Trying to obtain a stackstrace I am not seeing: (gdb) bt #0 0x0050c358 in runtime.sigtrampgo (sig=sig@entry=27, info=info@entry=0x7fb63d5da0, ctx=ctx@entry=0x7fb63d5e20) at ../../../gcc-git/libgo/go/runtime/signal_unix.go:221 #1 0x0056acb4 in runtime.sigtramp (sig=27, info=0x7fb63d5da0, context=0x7fb63d5e20) at ../../../gcc-git/libgo/runtime/go-signal.c:131 #2 #3 pprof_test.cpuHog1 () at pprof_test.go:52 #4 0x0040c814 in pprof_test.cpuHogger (f=f@entry=0x57c560, dur=) at pprof_test.go:37 #5 0x0040c9f8 in pprof_test.$nested1 (dur=) at pprof_test.go:75 #6 0x0040d038 in pprof_test.testCPUProfile (t=t@entry=0x420804e680, need=..., f=f@entry=0x57c600 ) at pprof_test.go:144 #7 0x0040c9a8 in runtime_pprof_test.TestCPUProfile (t=0x420804e680) at pprof_test.go:74 #8 0x00543bec in testing.tRunner (param=, fn=) at ../../../gcc-git/libgo/go/testing/testing.go:656 #9 0x00543c84 in testing.$thunk24 (__go_thunk_parameter=) at ../../../gcc-git/libgo/go/testing/testing.go:693 #10 0x0041a7dc in kickoff () at ../../../gcc-git/libgo/runtime/proc.c:258 /build/buildd/gdb-7.9/gdb/dwarf2-frame.c:1732: internal-error: add_cie: Assertion `n < 1 || cie_table->entries[n - 1]->cie_pointer < cie->cie_pointer' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) Which maybe the case that morestack.S unwind info not really correct. It could be a case for issue in gdb as well (I will check with a newer gdb). > > Wilco > >From 09b51fb706a8b15ecaea4ec4b8a80d0a7903053d Mon Sep 17 00:00:00 2001 From: Adhemerval Zanella Date: Wed, 4 May 2016 21:13:39 + Subject: [PATCH] aarch64: Add split-stack initial support This patch adds the split-stack support on aarch64 (PR #67877). As for other ports this patch should be used along with glibc and gold support. The support is done similar to other architectures: a __private_ss field is added on TCB in glibc, a target-specific __morestack implementation and helper functions are added in libgcc and compiler supported in adjusted (split-stack prologue, va_start for argument handling). I also plan to send the gold support to adjust stack
Re: [committed][PATCH 9/9] Add "__RTL" to cc1 (v9)
On Tue, 24 Jan 2017, David Malcolm wrote: > /* { dg-do run { x86_64-*-* } } */ That's never correct, since x86_64-*-* can have -m32 multilibs and i?86-*-* can have 64-bit multilibs. You always need to cover all relevant triplets and then restrict with effective-target selectors to relevant ABIs. > +/* { dg-do compile { target aarch64-*-* } } */ Should be aarch64*-*-* to cover aarch64_be triplets (and then again restrict by ABI as necessary; any triplet might have both endiannesses as multilibs, and both LP64 and ILP32). -- Joseph S. Myers jos...@codesourcery.com
[committed][PATCH 9/9] Add "__RTL" to cc1 (v9)
On Fri, 2017-01-20 at 16:16 +0100, Richard Biener wrote: [...] > > Richi: if that patch is approved, are you OK with patch 9 in early > > stage 4? > > Yes. Thanks. All of the various parts of patch 9 have now been approved. I've rebased it, and merged the fixes identified in review. Testing on i686-pc-linux-gnu identified some tests in gcc.dg/rtl/x86_64 marked with: /* { dg-do run { target i?86-*-* x86_64-*-* } } */ that compile, but fail to run properly on i686. It seems unrealistic to expect RTL expanded for x86_64 to run successfully on i686, so I changed those lines to: /* { dg-do run { x86_64-*-* } } */ Bootstrapped on x86_64-pc-linux-gnu. Verified build of stage1 and successfully tested rtl.exp on powerpc-ibm-aix7.1.3.0 (gcc111) and on i686-pc-linux-gnu. Committed to trunk (r244878). gcc/c-family/ChangeLog: * c-common.c (c_common_reswords): Add "__RTL". * c-common.h (enum rid): Add RID_RTL. gcc/c/ChangeLog: * c-parser.c: Include "read-rtl-function.h" and "run-rtl-passes.h". (c_parser_declaration_or_fndef): Rename "gimple-pass-list" in grammar to gimple-or-rtl-pass-list. Add rtl-function-definition production. Update for renaming of field "gimple_pass" to "gimple_or_rtl_pass". If __RTL was seen, call c_parser_parse_rtl_body. Convert a timevar_push/pop pair to an auto_timevar, to cope with early exit. (c_parser_declspecs): Update RID_GIMPLE handling for renaming of field "gimple_pass" to "gimple_or_rtl_pass", and for renaming of c_parser_gimple_pass_list to c_parser_gimple_or_rtl_pass_list. Handle RID_RTL. (c_parser_parse_rtl_body): New function. * c-tree.h (enum c_declspec_word): Add cdw_rtl. (struct c_declspecs): Rename field "gimple_pass" to "gimple_or_rtl_pass". Add field "rtl_p". * gimple-parser.c (c_parser_gimple_pass_list): Rename to... (c_parser_gimple_or_rtl_pass_list): ...this, updating accordingly. * gimple-parser.h (c_parser_gimple_pass_list): Rename to... (c_parser_gimple_or_rtl_pass_list): ...this. gcc/ChangeLog: * cfg.c (original_copy_tables_initialized_p): New function. * cfg.h (original_copy_tables_initialized_p): New decl. * cfgrtl.c (relink_block_chain): Guard the call to free_original_copy_tables with a call to original_copy_tables_initialized_p. * cgraph.h (symtab_node::native_rtl_p): New decl. * cgraphunit.c (symtab_node::native_rtl_p): New function. (symtab_node::needed_p): Don't assert for early assembly output for __RTL functions. (cgraph_node::finalize_function): Set "force_output" for __RTL functions. (cgraph_node::analyze): Bail out early for __RTL functions. (analyze_functions): Update assertion to support __RTL functions. (cgraph_node::expand): Bail out early for __RTL functions. * final.c (rest_of_clean_state): Don't call delete_tree_ssa for __RTL functions. * function.h (struct function): Update comment for field "pass_startwith". * gimple-expr.c: Include "tree-pass.h". (gimple_has_body_p): Return false for __RTL functions. * Makefile.in (OBJS): Add run-rtl-passes.o. * pass_manager.h (gcc::pass_manager::get_rest_of_compilation): New accessor. (gcc::pass_manager::get_clean_slate): New accessor. * passes.c: Include "insn-addr.h". (should_skip_pass_p): Add logging. Update logic for running "expand" to be compatible with both __GIMPLE and __RTL. Guard property-provider override so it is only done for gimple passes. Don't skip dfinit. (skip_pass): New function. (execute_one_pass): Call skip_pass when skipping passes. * read-md.c (md_reader::read_char): Support filtering the input to a subset of line numbers. (md_reader::md_reader): Initialize fields m_first_line and m_last_line. (md_reader::read_file_fragment): New function. * read-md.h (md_reader::read_file_fragment): New decl. (md_reader::m_first_line): New field. (md_reader::m_last_line): New field. * read-rtl-function.c (function_reader::create_function): Only create cfun if it doesn't already exist. Set PROP_rtl on cfun's curr_properties. Set DECL_INITIAL to a dummy block. (read_rtl_function_body_from_file_range): New function. * read-rtl-function.h (read_rtl_function_body_from_file_range): New decl. * run-rtl-passes.c: New file. * run-rtl-passes.h: New file. gcc/testsuite/ChangeLog: * gcc.dg/rtl/aarch64/asr_div1.c: New test case. * gcc.dg/rtl/aarch64/pr71779.c: New test case. * gcc.dg/rtl/rtl.exp: New file. * gcc.dg/rtl/test.c: New file. * gcc.dg/rtl/truncated-rtl-file.c: New test case. *
[PATCH], PR 79212: Fix ICE when compiling fortran test with openmp
Hi, I have a patch to fix the following openmp issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79212 Writing openmp directives in a certain way in fortran programs can lead to the following assert: internal compiler error: in maybe_lookup_decl_in_outer_ctx, at omp-low.c:4134 0xa941e6 maybe_lookup_decl_in_outer_ctx /work/davshe01/oban-work-shoji/src/gcc/gcc/omp-low.c:4134 0xa9cadc scan_sharing_clauses /work/davshe01/oban-work-shoji/src/gcc/gcc/omp-low.c:1975 Tested: aarch64 - No regressions in gcc/testsuite/fortran.dg, gcc/testsuite/gcc.dg, gcc/testsuite/g++.dg or libgomp/testsuite Will do a full test run before submitting. Good to go? David Sherwood. ChangeLog: 2017-01-24 David SherwoodPR middle-end/79212 gcc/ * gimplify.c (omp_notice_variable): Add GOVD_SEEN flag to variables in all contexts. gcc/testsuite/ * gfortran.dg/gomp/sharing-4.f90: New test. pr79212.patch Description: pr79212.patch
Re: [PATCH 1/4] BRIG (HSAIL) frontend: configuration file changes and misc
On Tue, 24 Jan 2017, Matthias Klose wrote: > the toplevel configure.ac has: > > # Disable the BRIG frontend and libhsail-rt on untested or known > # broken systems. Currently it has been tested only on x86_64 Linux > # of the upstream gcc targets. More targets shall be added after testing. > case "${target}" in > x86_64-*-linux*) > > This matches the x86_64-linux-gnux32 target as well. Is this intended? Since front ends can't be enabled or disabled on a per-multilib basis, if you want to support any x86/x86_64 GNU/Linux configuration you need to support all of them here (and possibly disable runtime libraries for particular multilibs). That is, this should be *more* general and also allow i[[3456789]]86-*-linux* because that can have a -m64 multilib. -- Joseph S. Myers jos...@codesourcery.com
[PATCH TEST]Remove xfail for gcc.dg/vect/vect-24.c on ARM targets
Hi, Test gcc.dg/vect/vect-24.c starts passing after my vectorizer changes, but not on all targets. For x86_64, looks like other passes still mess up the IR and prevent it from being vectorized. This patch removes xfail for ARM targets. Test result checked. Is it OK? Thanks, bin gcc/testsuite/ChangeLog 2017-01-23 Bin Cheng* gcc.dg/vect/vect-24.c: Remove xfail on ARM targets.diff --git a/gcc/testsuite/gcc.dg/vect/vect-24.c b/gcc/testsuite/gcc.dg/vect/vect-24.c index 09a6d7e..0511f7b 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-24.c +++ b/gcc/testsuite/gcc.dg/vect/vect-24.c @@ -122,6 +122,5 @@ int main (void) return main1 (); } - -/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { xfail { { ! aarch64*-*-* } && { ! arm-*-* } } } } } */ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
Re: Improve things for PR71724, in combine/if_then_else_cond
On 01/24/2017 06:03 PM, Christophe Lyon wrote: Ha... the regression occurred between r 244818 and r 244816, and I read r 244816 ChangeLog too quickly and did not notice it was modifying ifcvt.c in addition to x86-only files. So it's likely that it's your other patch for pr78634 that caused the regression I mentioned. Does it make more sense? That's possible. That added a missing cost check, so the question becomes - is the change in generated assembly sensible, given the selected CPU type? Bernd
Re: [PATCH, rs6000] Fix for entries in table of overloaded built-in functions
On Tue, Jan 24, 2017 at 08:28:37AM -0800, Carl E. Love wrote: > The following patch fixes an issue with the entries in the table of > built-in functions. All of the entries for a given built-in, must occur > in the table as a single block of entries. Otherwise the code that > searches the table for a given built-in definition will stop looking > once it reaches the end of the initial block of definitions for that > built-in function and subsequent definitions for that built-in will > never be checked. This issue currently occurs with the > ALTIVEC_BUILTIN_VEC_PACKS and P8V_BUILTIN_VEC_VGBBD built-ins. The > patch simply moves the existing entries so the definition for a given > built-in are all together in the same block of entries. Do we need a separate testcase to check for this? Or do those specific builtins need better testcases? Or was the bug obvious already? > Note this issue also exists with the GCC-5 and GCC-6 branches. > > The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE) > with no regressions. > > Is the patch OK for trunk? Yes, thanks! > Assuming this patch is OK, would it be acceptable to post a back port of > the patch for GCC 5 and GCC 6 branches after the patch is in mainline as > long as no issues are seen with this version in the mainline code base? Right; let it simmer on mainline for a while, and then it is approved for backport (if it is the same patch; otherwise please post the version of the patch you checked in to the branches). Do send an email noting you backported it to which branches. Segher > 2017-01-23 Carl Love> > * config/rs6000/rs6000-c (altivec_overloaded_builtins): Fix order > of entries for ALTIVEC_BUILTIN_VEC_PACKS and P8V_BUILTIN_VEC_VGBBD.
Re: Improve things for PR71724, in combine/if_then_else_cond
On 24 January 2017 at 17:55, Bernd Schmidtwrote: > On 01/24/2017 05:50 PM, Kyrill Tkachov wrote: >> >> >> Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S >> -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get >> the test failing before and after the patch. The code generated is >> vcmp.f64d0, d1 >> vmrsAPSR_nzcv, FPSCR >> vmovvs.f64 d0, d1 >> bx lr >> >> whereas the desired (e.g. with -mcpu=cortex-a57) is: >> vcmp.f64d0, d1 >> vmrsAPSR_nzcv, FPSCR >> vselvs.f64 d0, d1, d0 >> bx lr > > > Yes, I've seen both of these generated with different options, but the patch > did not make a difference here either. > > For the moment I'll assume this was a false alarm, i.e. Christophe > misidentified the patch and something else went wrong. > Ha... the regression occurred between r 244818 and r 244816, and I read r 244816 ChangeLog too quickly and did not notice it was modifying ifcvt.c in addition to x86-only files. So it's likely that it's your other patch for pr78634 that caused the regression I mentioned. Does it make more sense? Sorry for the probably wrong identification. It always takes some time for me to reproduce regressions manually because I can only keep logs/results of validations, the toolchains actually built are deleted once validation completes. Christophe > > Bernd
Re: Improve things for PR71724, in combine/if_then_else_cond
On 01/24/2017 05:50 PM, Kyrill Tkachov wrote: Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get the test failing before and after the patch. The code generated is vcmp.f64d0, d1 vmrsAPSR_nzcv, FPSCR vmovvs.f64 d0, d1 bx lr whereas the desired (e.g. with -mcpu=cortex-a57) is: vcmp.f64d0, d1 vmrsAPSR_nzcv, FPSCR vselvs.f64 d0, d1, d0 bx lr Yes, I've seen both of these generated with different options, but the patch did not make a difference here either. For the moment I'll assume this was a false alarm, i.e. Christophe misidentified the patch and something else went wrong. Bernd
Re: Improve things for PR71724, in combine/if_then_else_cond
On 24/01/17 16:36, Bernd Schmidt wrote: On 01/24/2017 05:30 PM, Kyrill Tkachov wrote: The -mfpu is overridden in the testcase to add the ARMv8 instructions. So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or something equivalent rather than vfpv3-d16-fp16. Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of switch combinations, but I've not been able to find a way to show where the patch makes a difference. I was just off-hand trying to give the options that would be expected to be exercised in this testcase. Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get the test failing before and after the patch. The code generated is vcmp.f64d0, d1 vmrsAPSR_nzcv, FPSCR vmovvs.f64 d0, d1 bx lr whereas the desired (e.g. with -mcpu=cortex-a57) is: vcmp.f64d0, d1 vmrsAPSR_nzcv, FPSCR vselvs.f64 d0, d1, d0 bx lr Given that VSEL is an ARMv8-A instruction and Cortex-A5 is an ARMv7-A cpu it doesn't make much sense to try getting it to generate that VSEL. So maybe we should just include an explicit -mtune=cortex-a57 to the testcases. Thanks, Kyrill Bernd
Re: [PATCH] Fix -Wimplicit-fallthrough in soft-fp (Re: implicit-fallthrough warnings in powerpc64le-linux GCC build)
On Tue, 24 Jan 2017, Jakub Jelinek wrote: > The following patch fixes the warnings (all the 5 fallthrus are intentional) > for me on x86_64-linux. But I assume it needs to go into glibc first, > right? > > 2017-01-24 Jakub Jelinek> > * soft-fp/op-common.h (_FP_MUL, _FP_FMA, _FP_DIV): Add > /* FALLTHRU */ comments. OK for glibc. -- Joseph S. Myers jos...@codesourcery.com
Re: [PR c++/78469] default ctor makes dtor required
On 01/24/2017 09:13 AM, Nathan Sidwell wrote: On 01/23/2017 05:01 PM, Jason Merrill wrote: I suppose adding a tsubst flag isn't too horrible. But then we also need to audit other uses of build_value_init to decide whether they should build a cleanup or not. @@ -8055,7 +8055,8 @@ build_over_call (struct z_candidate *can else if (default_ctor_p (fn)) { if (is_dummy_object (argarray[0])) - return force_target_expr (DECL_CONTEXT (fn), void_node, complain); + return force_target_expr (DECL_CONTEXT (fn), void_node, + complain | tf_no_cleanup); This is the wrong place for this; we don't know at this point whether we're in a new-expression or actually creating a temporary. I think we want to add this flag in the call to build_value_init from build_new_1. And look at other calls to build_value_init to see if they want it as well. Jason
[PATCH, rs6000] Fix for entries in table of overloaded built-in functions
GCC maintainers: The following patch fixes an issue with the entries in the table of built-in functions. All of the entries for a given built-in, must occur in the table as a single block of entries. Otherwise the code that searches the table for a given built-in definition will stop looking once it reaches the end of the initial block of definitions for that built-in function and subsequent definitions for that built-in will never be checked. This issue currently occurs with the ALTIVEC_BUILTIN_VEC_PACKS and P8V_BUILTIN_VEC_VGBBD built-ins. The patch simply moves the existing entries so the definition for a given built-in are all together in the same block of entries. Note this issue also exists with the GCC-5 and GCC-6 branches. The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE) with no regressions. Is the patch OK for trunk? Assuming this patch is OK, would it be acceptable to post a back port of the patch for GCC 5 and GCC 6 branches after the patch is in mainline as long as no issues are seen with this version in the mainline code base? Carl Love gcc/ChangeLog: 2017-01-23 Carl Love* config/rs6000/rs6000-c (altivec_overloaded_builtins): Fix order of entries for ALTIVEC_BUILTIN_VEC_PACKS and P8V_BUILTIN_VEC_VGBBD. --- gcc/config/rs6000/rs6000-c.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 8b87a0a..92e9849 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -2154,14 +2154,14 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_PACKS, ALTIVEC_BUILTIN_VPKSWSS, RS6000_BTI_V8HI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, - { ALTIVEC_BUILTIN_VEC_VPKSWSS, ALTIVEC_BUILTIN_VPKSWSS, -RS6000_BTI_V8HI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, - { ALTIVEC_BUILTIN_VEC_VPKUWUS, ALTIVEC_BUILTIN_VPKUWUS, -RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_PACKS, P8V_BUILTIN_VPKUDUS, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 }, { ALTIVEC_BUILTIN_VEC_PACKS, P8V_BUILTIN_VPKSDSS, RS6000_BTI_V4SI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 }, + { ALTIVEC_BUILTIN_VEC_VPKSWSS, ALTIVEC_BUILTIN_VPKSWSS, +RS6000_BTI_V8HI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_VPKUWUS, ALTIVEC_BUILTIN_VPKUWUS, +RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VPKSHSS, ALTIVEC_BUILTIN_VPKSHSS, RS6000_BTI_V16QI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VPKUHUS, ALTIVEC_BUILTIN_VPKUHUS, @@ -4777,6 +4777,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 }, { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 }, + { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, +RS6000_BTI_V16QI, 0, 0, 0 }, + { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, +RS6000_BTI_unsigned_V16QI, 0, 0, 0 }, { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B, RS6000_BTI_V16QI, RS6000_BTI_V4SI, @@ -5038,11 +5042,6 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { P8V_BUILTIN_VEC_VUPKLSW, P8V_BUILTIN_VUPKLSW, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V4SI, 0, 0 }, - { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, -RS6000_BTI_V16QI, 0, 0, 0 }, - { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD, -RS6000_BTI_unsigned_V16QI, 0, 0, 0 }, - { P9V_BUILTIN_VEC_VSLV, P9V_BUILTIN_VSLV, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, -- 1.9.1
Re: Improve things for PR71724, in combine/if_then_else_cond
On 01/24/2017 05:30 PM, Kyrill Tkachov wrote: The -mfpu is overridden in the testcase to add the ARMv8 instructions. So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or something equivalent rather than vfpv3-d16-fp16. Exact steps please. No one who's not well-versed in all the ARM variants will be able to figure this out. I've been able to generate identical before/after code, both with and without vselvs.f64 instructions, after trying out a number of switch combinations, but I've not been able to find a way to show where the patch makes a difference. Bernd
Re: Improve things for PR71724, in combine/if_then_else_cond
On 24/01/17 16:28, Christophe Lyon wrote: On 24 January 2017 at 17:02, Kyrill Tkachovwrote: On 24/01/17 15:21, Segher Boessenkool wrote: On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: On 01/24/2017 09:38 AM, Christophe Lyon wrote: It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 Maybe the upcoming patch from Segher intends to address this? Not really. Not at all, even, first I hear of this. Please open a PR. How exactly do I reproduce this - can you give me a command line and expected assembly output (I'm having some trouble figuring out the right set of switches)? Same here. I haven't looked at these in detail but these are the ARMv8 FP conditional select tests and they would be affected by if_then_else generation. So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 would reproduce this. Maybe the Cortex-A5 BRANCH_COST affects things... Thanks Kyrill, and sorry for not replying earlier. Yes something like that. The configuration in question is configured with: -target=arm-none-linux-gnueabihf --with-float=hard --enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16 If you can't easily rebuild the whole toolchain, you can use an arm-linux-gnueabihf toolchain and use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard. The -mfpu is overridden in the testcase to add the ARMv8 instructions. So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or something equivalent rather than vfpv3-d16-fp16. Thanks, Kyrill It's not the first time that Cortex-A5's BRANCH_COST implies changes on a testcase. Christophe Kyrill Segher
Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]
On January 24, 2017 5:02:39 PM GMT+01:00, Marc Glissewrote: >On Tue, 24 Jan 2017, Jeff Law wrote: > >> But that would assume that match.pd is relying on range information >and could >> thus use the improved range information. *If* match.pd is using the >range >> information generated by VRP, it's not terribly pervasive. > >Oh, I thought we already had some explicit calls to get_range_info in >there, but apparently not. We can use VRP info through >get_nonzero_bits, >expr_not_equal_to and maybe one or two more like tree_expr_nonzero_p or We have the latter already. >tree_expr_nonnegative_p. If we called into match.pd from VRP, I assume >several of the simplify_*_using_ranges could be moved to match.pd. Yes, that will hopefully happen at some point but it also requires folding more stmts to not regress (I'd like to remove the stmt folding callback from the SSA propagator) Richard. But >I >agree that most transformations do not look at ranges at all.
Re: Improve things for PR71724, in combine/if_then_else_cond
On 24 January 2017 at 17:02, Kyrill Tkachovwrote: > > On 24/01/17 15:21, Segher Boessenkool wrote: >> >> On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: >>> >>> On 01/24/2017 09:38 AM, Christophe Lyon wrote: It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 Maybe the upcoming patch from Segher intends to address this? >>> >>> Not really. >> >> Not at all, even, first I hear of this. Please open a PR. >> >>> How exactly do I reproduce this - can you give me a command >>> line and expected assembly output (I'm having some trouble figuring out >>> the right set of switches)? >> >> Same here. > > > I haven't looked at these in detail but these are the ARMv8 FP conditional > select tests > and they would be affected by if_then_else generation. > > So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 > would reproduce this. > Maybe the Cortex-A5 BRANCH_COST affects things... > Thanks Kyrill, and sorry for not replying earlier. Yes something like that. The configuration in question is configured with: -target=arm-none-linux-gnueabihf --with-float=hard --enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16 If you can't easily rebuild the whole toolchain, you can use an arm-linux-gnueabihf toolchain and use -mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16 -mfloat-abi=hard. It's not the first time that Cortex-A5's BRANCH_COST implies changes on a testcase. Christophe > Kyrill > > > >> >> Segher > >
Re: Improve things for PR71724, in combine/if_then_else_cond
On 24/01/17 15:21, Segher Boessenkool wrote: On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: On 01/24/2017 09:38 AM, Christophe Lyon wrote: It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 Maybe the upcoming patch from Segher intends to address this? Not really. Not at all, even, first I hear of this. Please open a PR. How exactly do I reproduce this - can you give me a command line and expected assembly output (I'm having some trouble figuring out the right set of switches)? Same here. I haven't looked at these in detail but these are the ARMv8 FP conditional select tests and they would be affected by if_then_else generation. So something like -mfpu=neon-fp-armv8 -mfloat-abi=hard -O2 -mcpu=cortex-a5 would reproduce this. Maybe the Cortex-A5 BRANCH_COST affects things... Kyrill Segher
Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]
On Tue, 24 Jan 2017, Jeff Law wrote: But that would assume that match.pd is relying on range information and could thus use the improved range information. *If* match.pd is using the range information generated by VRP, it's not terribly pervasive. Oh, I thought we already had some explicit calls to get_range_info in there, but apparently not. We can use VRP info through get_nonzero_bits, expr_not_equal_to and maybe one or two more like tree_expr_nonzero_p or tree_expr_nonnegative_p. If we called into match.pd from VRP, I assume several of the simplify_*_using_ranges could be moved to match.pd. But I agree that most transformations do not look at ranges at all. -- Marc Glisse
Re: [PATCH committed] Fix build failure with MPFR 2.4.x (gimple-ssa-sprintf.c)
On 01/24/2017 02:37 AM, Markus Trippelsdorf wrote: MPFR_RNDx was introduced in MPFR 3.0.0. Since the minimal version that gcc checks for is 2.4.0, this leads to a build failure. The fix is straightforward. Tested on x86_64-pc-linux-gnu. Committed to trunk as obvious. * gimple-ssa-sprintf.c (format_floating): Change MPFR_RNDx to GMP_RNDx for compatibility. This was changed once before for this reason (in r240350). I forgot all about it and put it back in my latest patch for some reason. I don't remember why exactly but I suspect I might have been trying to overcome some MPFR oddity. It might help to #undef the MPFR_RNDx macros after including to avoid this in the future. In any event, thanks for the fix. Martin
Fix microblaze-elf sprintf buffer overflow
This was found building config-list.mk last night using the trunk compiler. The buffer can clearly overflow for large label numbers. Confirmed that the microblaze target builds, and installed. Jeff commit 5b9382fcd400c8587667d7125f240ebec4ae2c81 Author: lawDate: Tue Jan 24 15:49:32 2017 + * config/microblaze/microblaze.h (ASM_FORMAT_PRIVATE_NAME): Increase buffer size. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244877 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/ChangeLog b/gcc/ChangeLog index f21bd83..1deec60 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2017-01-24 Jeff Law + + * config/microblaze/microblaze.h (ASM_FORMAT_PRIVATE_NAME): Increase + buffer size. + 2017-01-24 Bin Cheng PR tree-optimization/79159 diff --git a/gcc/config/microblaze/microblaze.h b/gcc/config/microblaze/microblaze.h index 8fdadbf..66e4ef5 100644 --- a/gcc/config/microblaze/microblaze.h +++ b/gcc/config/microblaze/microblaze.h @@ -757,7 +757,7 @@ do { \ an assembler-name for a local static variable named NAME. LABELNO is an integer which is different for each call. */ #define ASM_FORMAT_PRIVATE_NAME(OUTPUT, NAME, LABELNO) \ -( (OUTPUT) = (char *) alloca (strlen ((NAME)) + 10), \ +( (OUTPUT) = (char *) alloca (strlen ((NAME)) + 13), \ sprintf ((OUTPUT), "%s.%lu", (NAME), (unsigned long)(LABELNO))) /* How to start an assembler comment.
Re: [PATCH] PR libstdc++/79206 check string_view sizes in operator==
On 24 January 2017 at 16:46, Ville Voutilainenwrote: > On 24 January 2017 at 16:16, Jonathan Wakely wrote: >>> We do want it for basic_string as well, I think. And while I doubt >>> your interpretation >>> of the standard is pedantically correct, I also think that the >>> standard is broken if it >>> doesn't allow this optimization, and the standard should be fixed in that >>> case. >> But I think fixing this for std::basic_string should wait for stage 1. >> The basic_string_view fix I committed only touches TS & C++17 stuff. > > Release-management-wise I can't disagree with postponing the > basic_string change. However, I must > say that an implementation that doesn't check the sizes before > potentially comparing megabytes+10 bytes > of data just to then say that the strings are not equal because it > found a difference at megabytes+1 > is seriously problematic, and we should seriously entertain the idea > of just changing it and fixing the > standard later. Well, ok, as established in offline discussions, this problem affects strings that employ custom traits or a custom allocator. The fix can wait, there's no rush as such.
Re: [PR middle-end/79123] cast false positive in -Walloca-larger-than=
On 01/24/2017 03:15 AM, Richard Biener wrote: The more I look at our SCCVN implementation, the more I want to explore trying to re-use that infrastructure to simplify DOM. Certainly having a single way to hash/record stmts/expressions on GIMPLE would be nice. Not sure if the SCCVN one is perfect (enhancing the memory part further is on my TODO list). It doesn't have to be perfect. From my wanderings in its code I think it's probably sufficient to replace 90% of the non-path specific DOM optimizations. The idea is to do value numbering as an independent step (SCCVN, RPO walk, whatever). Then use a scheduler similar to Click's 95 work to place statements where they belong. The result is far simpler than what DOM does and re-uses the VN framework. In that world DOM morphs away from generic CSE/simplifications and focuses on path specific stuff. And *that* should be reimplemented as backwards walk with value numbering. It's hard to tease that out right now given how we depend on DOM for generic CSE/simplifications. Jeff
Re: Improve things for PR71724, in combine/if_then_else_cond
On Tue, Jan 24, 2017 at 01:40:46PM +0100, Bernd Schmidt wrote: > On 01/24/2017 09:38 AM, Christophe Lyon wrote: > >It seems that Bernd's patch causes regressions on arm-linux-gnueabihf > >--with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: > > > > gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 > > gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 > > gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 > > gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 > > > >Maybe the upcoming patch from Segher intends to address this? > > Not really. Not at all, even, first I hear of this. Please open a PR. > How exactly do I reproduce this - can you give me a command > line and expected assembly output (I'm having some trouble figuring out > the right set of switches)? Same here. Segher
Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]
On 01/24/2017 07:29 AM, Marc Glisse wrote: On Tue, 24 Jan 2017, Richard Biener wrote: That was my thought as well, but AFAICT we only call into match.pd from VRP if we changed the insn. Yes - there was thoughts to change that (but it comes at an expense). Basically we'd like to re-fold stmts that indirectly use stmts we changed. We certainly don't want to re-fold everything all the time. VRP is kind of a special case, every variable for which it finds a new/improved range could be considered changed, since it may trigger some extra transformation in match.pd (same for CCP and the nonzero mask). But that would assume that match.pd is relying on range information and could thus use the improved range information. *If* match.pd is using the range information generated by VRP, it's not terribly pervasive. But waiting until forwprop3 means we're leaving a ton of useless blocks and statements in the IL for this testcase, and likely other code using std::vec. Perhaps rather than open-coding a fix in VRP I could have VRP call into match.pd slightly more aggressively (say just for gimple_cond). That may be enough to capture the effects much earlier in the pipeline without trying to fold *everything*. Jeff
Re: [PATCH] PR libstdc++/79206 check string_view sizes in operator==
On Tue, 24 Jan 2017, Jonathan Wakely wrote: I've just committed this, and then noticed that we don't do the same optimization for basic_string unless the char_type is char. Note the "see also" field in that PR ;-) -- Marc Glisse
Re: [PATCH] PR libstdc++/79206 check string_view sizes in operator==
On 24 January 2017 at 16:16, Jonathan Wakelywrote: >> We do want it for basic_string as well, I think. And while I doubt >> your interpretation >> of the standard is pedantically correct, I also think that the >> standard is broken if it >> doesn't allow this optimization, and the standard should be fixed in that >> case. > But I think fixing this for std::basic_string should wait for stage 1. > The basic_string_view fix I committed only touches TS & C++17 stuff. Release-management-wise I can't disagree with postponing the basic_string change. However, I must say that an implementation that doesn't check the sizes before potentially comparing megabytes+10 bytes of data just to then say that the strings are not equal because it found a difference at megabytes+1 is seriously problematic, and we should seriously entertain the idea of just changing it and fixing the standard later.
Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]
On Tue, 24 Jan 2017, Richard Biener wrote: That was my thought as well, but AFAICT we only call into match.pd from VRP if we changed the insn. Yes - there was thoughts to change that (but it comes at an expense). Basically we'd like to re-fold stmts that indirectly use stmts we changed. We certainly don't want to re-fold everything all the time. VRP is kind of a special case, every variable for which it finds a new/improved range could be considered changed, since it may trigger some extra transformation in match.pd (same for CCP and the nonzero mask). -- Marc Glisse
Re: [PR 79108] Put ipa_node_params to GC memory
Hi Martin, >> this broke bootstrap on i386-pc-solaris2.12, sparc-sun-solaris2.12, and >> i686-pc-linux-gnu (probably every 32-bit host), as confirmed by an >> i386-pc-solaris2.10 reghunt. >> >> E.g. in the Linux/i686 bootstrap, compiling tree-ssa-math-opts.o fails >> with >> >> virtual memory exhausted: Cannot allocate memory >> >> In the Solaris cases, various libstdc++ source files fail to compile in >> the same way. > > This is most likely PR 79198 for which I have just committed a fix. it was: was both sparc-sun-solaris2.12 and i386-pc-solaris2.12 bootstraps are back to normal again. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] PR libstdc++/79206 check string_view sizes in operator==
On 24/01/17 14:54 +0200, Ville Voutilainen wrote: On 24 January 2017 at 14:05, Jonathan Wakelywrote: I've just committed this, and then noticed that we don't do the same optimization for basic_string unless the char_type is char. Presumably this is so that we do call basic_string::compare() and so call any user-defined traits_type::eq() function (which is observable). I don't think that's necessary, because the standard says operator== on strings returns the result of lhs.compare(rhs) == 0, not that it's "Equivalent to" such a call. As long as we give the right answer, I don't think we need to make an explicit call to basic_string::compare(). So maybe we want this for basic_string too. We do want it for basic_string as well, I think. And while I doubt your interpretation of the standard is pedantically correct, I also think that the standard is broken if it doesn't allow this optimization, and the standard should be fixed in that case. I agree. The new overload that only applies to character types was added to fix PR 39207, which makes the same observation as the string_view PR I just fixed. Paolo's patch fixing it confirms my presumption: https://gcc.gnu.org/ml/libstdc++/2007-07/msg00043.html "Per the standard, in general the equality operator is required to simply return string::compare." Again, I think if the standard means operator== must be equivalent to calling string::compare (including all visible side effects, algorithmic complexity etc.) then it should use the magic words "Equivalent to". It doesn't use those words, so my reading is that what it's required to return is the same boolean value as you would get from lhs.compare(rhs) == 0, by whatever means. I'll raise a defect to clarify this. If LWG get the answer wrong and say it does require a call to compare then I'll file another defect saying it should be changed to: "Equivalent to: lhs.size() == rhs.size() && lhs.compare(rhs) == 0;" But I think fixing this for std::basic_string should wait for stage 1. The basic_string_view fix I committed only touches TS & C++17 stuff.
[PATCH][testsuite] Require shared effective target for some lto.exp tests
Hi all, The tests in this patch fail for me on aarch64-none-elf with: relocation R_AARCH64_ADR_PREL_PG_HI21 against external symbol `_impure_ptr' can not be used when making a shared object; recompile with -fPIC I believe since the tests pass -shared to the linker they should be gated on the 'shared' effective target? With this patch these tests appear as UNSUPPORTED on aarch64-none-elf rather than FAILing. Ok for trunk? Thanks, Kyrill 2016-01-24 Kyrylo Tkachov* gcc.dg/lto/pr54709_0.c: Require 'shared' effective target. * gcc.dg/lto/pr61526_0.c: Likewise. * gcc.dg/lto/pr64415_0.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/lto/pr54709_0.c b/gcc/testsuite/gcc.dg/lto/pr54709_0.c index f3db5dc..69697d8 100644 --- a/gcc/testsuite/gcc.dg/lto/pr54709_0.c +++ b/gcc/testsuite/gcc.dg/lto/pr54709_0.c @@ -1,6 +1,7 @@ /* { dg-lto-do link } */ /* { dg-require-visibility "hidden" } */ /* { dg-require-effective-target fpic } */ +/* { dg-require-effective-target shared } */ /* { dg-extra-ld-options { -shared } } */ /* { dg-lto-options { { -fPIC -fvisibility=hidden -flto } } } */ diff --git a/gcc/testsuite/gcc.dg/lto/pr61526_0.c b/gcc/testsuite/gcc.dg/lto/pr61526_0.c index 8a631f0..d3e2c80 100644 --- a/gcc/testsuite/gcc.dg/lto/pr61526_0.c +++ b/gcc/testsuite/gcc.dg/lto/pr61526_0.c @@ -1,4 +1,5 @@ /* { dg-require-effective-target fpic } */ +/* { dg-require-effective-target shared } */ /* { dg-lto-do link } */ /* { dg-lto-options { { -fPIC -flto -flto-partition=1to1 } } } */ /* { dg-extra-ld-options { -shared } } */ diff --git a/gcc/testsuite/gcc.dg/lto/pr64415_0.c b/gcc/testsuite/gcc.dg/lto/pr64415_0.c index 4faab2b..11218e0 100644 --- a/gcc/testsuite/gcc.dg/lto/pr64415_0.c +++ b/gcc/testsuite/gcc.dg/lto/pr64415_0.c @@ -1,5 +1,6 @@ /* { dg-lto-do link } */ /* { dg-require-effective-target fpic } */ +/* { dg-require-effective-target shared } */ /* { dg-lto-options { { -O -flto -fpic } } } */ /* { dg-extra-ld-options { -shared } } */ /* { dg-extra-ld-options "-Wl,-undefined,dynamic_lookup" { target *-*-darwin* } } */
Re: [PR c++/78469] default ctor makes dtor required
On 01/23/2017 05:01 PM, Jason Merrill wrote: I suppose adding a tsubst flag isn't too horrible. But then we also need to audit other uses of build_value_init to decide whether they should build a cleanup or not. Not too ugly, I guess. Looking at the other calls that end up at build_target_expr, the only one I found that looked like it might need it is also in call.c: if (is_dummy_object (fa)) { if (TREE_CODE (arg) == TARGET_EXPR) return arg; else if (trivial) return force_target_expr (DECL_CONTEXT (fn), arg, complain); but I convinced myself it didn't. ok? nathan -- Nathan Sidwell 2017-01-24 Nathan SidwellPR c++/78469 - defaulted ctor and inaccessible dtor * cp-tree.h (tsubst_flags): Add tf_no_cleanup. * call.c (build-over_call): Pass tf_no_cleanup for ctor in new-expr. * tree.c (build_target_expr): Check tf_no_cleanup. PR c++/78469 * g++.dg/cpp0x/pr78469.C: New. Index: cp/call.c === --- cp/call.c (revision 244869) +++ cp/call.c (working copy) @@ -8055,7 +8055,8 @@ build_over_call (struct z_candidate *can else if (default_ctor_p (fn)) { if (is_dummy_object (argarray[0])) - return force_target_expr (DECL_CONTEXT (fn), void_node, complain); + return force_target_expr (DECL_CONTEXT (fn), void_node, + complain | tf_no_cleanup); else return cp_build_indirect_ref (argarray[0], RO_NULL, complain); } Index: cp/cp-tree.h === --- cp/cp-tree.h (revision 244869) +++ cp/cp-tree.h (working copy) @@ -4786,6 +4786,8 @@ enum tsubst_flags { substitution in fn_type_unification. */ tf_fndecl_type = 1 << 9, /* Substituting the type of a function declaration. */ + tf_no_cleanup = 1 << 10, /* Do not build a cleanup +(build_target_expr and friends) */ /* Convenient substitution flags combinations. */ tf_warning_or_error = tf_warning | tf_error }; Index: cp/tree.c === --- cp/tree.c (revision 244869) +++ cp/tree.c (working copy) @@ -404,9 +404,15 @@ build_target_expr (tree decl, tree value || useless_type_conversion_p (TREE_TYPE (decl), TREE_TYPE (value))); - t = cxx_maybe_build_cleanup (decl, complain); - if (t == error_mark_node) -return error_mark_node; + if (complain & tf_no_cleanup) +/* The caller is building a new-expr and does not need a cleanup. */ +t = NULL_TREE; + else +{ + t = cxx_maybe_build_cleanup (decl, complain); + if (t == error_mark_node) + return error_mark_node; +} t = build4 (TARGET_EXPR, type, decl, value, t, NULL_TREE); if (EXPR_HAS_LOCATION (value)) SET_EXPR_LOCATION (t, EXPR_LOCATION (value)); Index: testsuite/g++.dg/cpp0x/pr78469.C === --- testsuite/g++.dg/cpp0x/pr78469.C (revision 0) +++ testsuite/g++.dg/cpp0x/pr78469.C (working copy) @@ -0,0 +1,14 @@ +// { dg-do compile { target c++11 } } +// PR78469, bogus error about inaccessible dtor. + +struct no_destr { + no_destr() = default; + +protected: + ~no_destr() = default; +}; + +void *Foo () +{ + return new no_destr (); +}
Re: [PATCH 1/4] BRIG (HSAIL) frontend: configuration file changes and misc
On 16.05.2016 19:25, Pekka Jääskeläinen wrote: > The configuration file changes and misc. updates required > by the BRIG frontend. > > Also, added include/hsa-interface.h which is hsa.h taken from libgomp > and will be shared by it (agreed with Martin Liška / SUSE). > the toplevel configure.ac has: # Disable the BRIG frontend and libhsail-rt on untested or known # broken systems. Currently it has been tested only on x86_64 Linux # of the upstream gcc targets. More targets shall be added after testing. case "${target}" in x86_64-*-linux*) This matches the x86_64-linux-gnux32 target as well. Is this intended?
Re: [PATCH][wwwdocs] Mention new store merging pass for GCC 7
On 23/01/17 16:45, Gerald Pfeifer wrote: > Hi Kyrill, > > On Mon, 23 Jan 2017, Kyrill Tkachov wrote: >> This patch adds a short entry for the store merging pass in GCC 7 to the >> "General Optimizer Improvements" section. > > + A new store merging pass has been added. It will attempt to merge > + constant stores to adjacent memory locations into fewer wider stores. > + It can be enabled by using the -fstore-merging option > and is > + enabled by default at the -O2 optimization level or > + higher. I also think you should either use 'fewer, wider, stores' (with commas) or, if you don't like the commas: 'a smaller number of wider stores'. R. > > Here I'd say "it attempts to merge" or, better yet, let's just say > "it merges". > > Let's not be too shy. :-) (This still does not claim that it always > succeeds or anything like that, mind.) > > Okay, with that note taken into consideration. > > Thanks, > Gerald
C++ Modules branch
As some have already noticed, I created a c++-modules branch yesterday. Don't get too excited, that doesn't mean I have an implementation to commit there. I expect several false starts, and don't intend to post patches here. Wiki page is https://gcc.gnu.org/wiki/cxx-modules nathan -- Nathan Sidwell
Re: [PATCH] PR libstdc++/79206 check string_view sizes in operator==
On 24 January 2017 at 14:05, Jonathan Wakelywrote: > I've just committed this, and then noticed that we don't do the same > optimization for basic_string unless the char_type is char. Presumably > this is so that we do call basic_string::compare() and so call any > user-defined traits_type::eq() function (which is observable). I don't > think that's necessary, because the standard says operator== on > strings returns the result of lhs.compare(rhs) == 0, not that it's > "Equivalent to" such a call. As long as we give the right answer, I > don't think we need to make an explicit call to > basic_string::compare(). So maybe we want this for basic_string too. We do want it for basic_string as well, I think. And while I doubt your interpretation of the standard is pedantically correct, I also think that the standard is broken if it doesn't allow this optimization, and the standard should be fixed in that case.
Re: [PATCH] BRIG frontend: request for a global review
Hi, On Mon, Jan 23, 2017 at 02:11:37PM +0100, Richard Biener wrote: > On Mon, Jan 23, 2017 at 1:02 PM, Martin Jamborwrote: > > Hi, > > > > > > On Mon, Jan 23, 2017 at 12:56:13PM +0100, Richard Biener wrote: > >> On Fri, Jan 20, 2017 at 6:25 PM, Pekka Jääskeläinen > >> wrote: > >> > Hi Richard, > >> > > >> > On Fri, Jan 20, 2017 at 10:26 AM, Richard Biener > >> > wrote: > >> >> So the #ifdef ENABLE_BRIG_FE shouldn't be needed anymore (nor the > >> >> configury for it). > >> >> > >> >> Otherwise this looks ok to me then. > >> > > >> > Attached is a patch set with that unnecessary ENABLE_BRIG_FE macro > >> > removed. It has also been refreshed to the latest trunk. > >> > >> Are you set up to commit these to trunk yourself or do you need help here? > >> > > > > Pekka is still in the process of requesting an account at gcc.gnu.org. > > We agreed that I would commit the patches. > > > > Over Saturday and today morning, I have bootstrappet and tested them > > in various configurations (although mostly on x86_64 and only a little > > bit on aarch64) and (together with a "svn mv libgomp/plugin/hsa.h > > include/") I have not found any issue so far. > > > > I suppose I should go ahead (in any case, if there is any fallout, > > reach out to me too, at least until Pekka gets his account). > > Yes, sounds good to me. > It took me more time because I got distracted by PR 79198 and did not want to commit another huge thing while bootstrap was still broken for so many people with my previous patch. Nevertheless, I have just committed the BRIG FE as revision 244867. I tried to be careful (as well as quick to avoid conflicts) so I hope there are no issues. However if there are some, throw complaints my way at least unless Pekka gets an account at gcc.gnu.org. Thanks, Martin
Re: Improve things for PR71724, in combine/if_then_else_cond
On 01/24/2017 09:38 AM, Christophe Lyon wrote: It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 Maybe the upcoming patch from Segher intends to address this? Not really. How exactly do I reproduce this - can you give me a command line and expected assembly output (I'm having some trouble figuring out the right set of switches)? Bernd
[PATCH] Fix PR79208
Committed as obvious. Richard. 2017-01-24 Richard BienerPR translation/79208 * ipa-devirt.c (odr_types_equivalent_p): Fix typo in diagnostic. Index: gcc/ipa-devirt.c === --- gcc/ipa-devirt.c(revision 244865) +++ gcc/ipa-devirt.c(working copy) @@ -1628,7 +1628,7 @@ odr_types_equivalent_p (tree t1, tree t2 if (DECL_VIRTUAL_P (f1) != DECL_VIRTUAL_P (f2)) { warn_odr (t1, t2, f1, f2, warn, warned, - G_("s definition that differs by virtual " + G_("a definition that differs by virtual " "keyword in another translation unit")); return false; }
Re: [C++ PATCH] 79118 bitfields & constexpr
On Tue, Jan 24, 2017 at 07:01:45AM -0500, Nathan Sidwell wrote: > On 01/24/2017 05:49 AM, Jakub Jelinek wrote: > > > ((BIT_FIELD_REF ^ BIT_FIELD_REF ) & 110) == 0 > > out of that. So unless we DTRT (i.e. save constexpr bodies before > > cp_fold for constexpr evaluation purposes), the workaround would need > > to handle this properly (basically pattern recognize whatever the > > for avoidance of doubt, I'm arguing that such folding is premature in the > face of contexpr. I'm arguing that pretty much all folding is premature in the face of constexpr. We e.g. accept: constexpr int a[2] = { 1, 2 }; constexpr int foo (const int *x, int y) { return x[y] & 0; } constexpr int b = foo (a, 1); constexpr int c = foo (a, 2); // { dg-error "" } constexpr int d = foo (a, 3); // { dg-error "" } because we do constexpr evaluation on folded trees, while clang++ properly rejects it for c and d. Jakub
Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undef SSA_NAMEs (take 2)
On Mon, Jan 23, 2017 at 6:26 PM, Aldy Hernandezwrote: > On 01/18/2017 10:10 AM, Richard Biener wrote: >> >> On Fri, Jan 13, 2017 at 7:48 PM, Aldy Hernandez wrote: > > > Hi Richard. > > I'd be happy to provide a patch, but could you please elaborate, as I'm > afraid I'm not following. > >>> We could go back to my original, no caching version (with the other >>> suggestions). That solves the problem quite simply, with no regressions. >> >> >> So let's go with a unswitching-local solution then. Based on >> tree_may_unswitch_on: > > > What do you mean by unswitching-local? Like your original patch, not adding new generic infrastructure outside of tree-ssa-unswitch.c. >> >> /* Condition must be invariant. */ >> FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE) >> { >> /* Unswitching on undefined values would introduce undefined >> behavior that the original program might never exercise. */ >> if (ssa_undefined_value_p (use, true)) >> return NULL_TREE; >> def = SSA_NAME_DEF_STMT (use); >> def_bb = gimple_bb (def); >> if (def_bb >> && flow_bb_inside_loop_p (loop, def_bb)) >> return NULL_TREE; >> >> we only have to look for uses in blocks dominating the loop header block >> (or in blocks post-dominating that loop header, but we can probably >> implement >> that by simply including the loop header itself with a FIXME comment). > > > Look for *uses*?? Do you mean definitions? I mean, we're trying to figure > out whether we are going to unswitch on a use that is inside a the loop, not > before or after. So perhaps we only care about *definitions* > (SSA_NAME_DEF_STMT) dominating the loop header. We're looking for stmts using the 'use's in the above loop on a path that is always executed when the loop is entered. So we're not introducing a use of a possibly undefined value. Of course we can also prove that 'use' is in fact not undefined looking at its defs which are obviously always dominating the loop header if the condition satisfies tree_may_unswitch_on (non-dominating defs will have uses in PHIs dominating the header which we have to treat conservatively of course). > >> >> Note that we only need to know whether a BB will be always executed when >> the loop is entered -- there's "infrastructure" elsewhere that computes >> this >> w/o the need of post-dominance. For example see fill_always_executed_in_1 >> tree-ssa-loop-im.c (we can't use that 1:1 I think because we already use >> ->aux >> via the original copy tables, but we could simplify it as we're only >> interested in >> the loop which preheader we put the unswitch condition on so we can use >> a bitmap to record whether a block of the loop is always executed or not). > > > I don't see any use of ->aux within loop unswitching, so perhaps no > adjustment is needed? I verified this by successfully bootstrapping with: > > diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c > index 92599fb..774d6bf 100644 > --- a/gcc/tree-ssa-loop-unswitch.c > +++ b/gcc/tree-ssa-loop-unswitch.c > @@ -94,6 +94,14 @@ tree_ssa_unswitch_loops (void) >struct loop *loop; >bool changed = false; > > + basic_block bb; > + FOR_ALL_BB_FN (bb, cfun) > +if (bb->aux) > + { > + gcc_unreachable (); > + } > > Furthermore, you say that we have this "infrastructure" without the need for > post-dominance. But we still need dominance info. The function > fill_always_execute_in_1 uses CDI_DOMINATORS both inside said function, and > throughout its dependencies. I thought the point of pre-calculating > dominance info (or post-dominance info) originally was because we couldn't > depend on dominance info being kept up to date between executions of > tree_unswitch_single_loop(), which BTW, runs recursively. dominators are kept up-to-date within cfg manipulation routines and unswitching already uses them. So if you just try to prove 'use' is defined you don't even need dominators. But that misses cases like int x; foo () { for (;;) { if (x == 5) ...; } } where unswitching is valid because x is always used when the loop is entered. Similar int x, a[10]; foo (int c) { foo (x); for (i=0;i > Can you send a patch that does this maybe? > > > Sure, when I understand what you suggest ;-). > > Again, thanks for your input here. > Aldy >
[PATCH] PR libstdc++/79206 check string_view sizes in operator==
I've just committed this, and then noticed that we don't do the same optimization for basic_string unless the char_type is char. Presumably this is so that we do call basic_string::compare() and so call any user-defined traits_type::eq() function (which is observable). I don't think that's necessary, because the standard says operator== on strings returns the result of lhs.compare(rhs) == 0, not that it's "Equivalent to" such a call. As long as we give the right answer, I don't think we need to make an explicit call to basic_string::compare(). So maybe we want this for basic_string too. PR libstdc++/79206 * include/experimental/string_view (operator==): Check sizes first. * include/std/string_view (operator==): Likewise. Tested powerpc64le-linux, committed to trunk. commit 131743b521289b143ef1e85a309f43d742e55481 Author: Jonathan WakelyDate: Tue Jan 24 11:48:56 2017 + PR libstdc++/79206 check string_view sizes in operator== PR libstdc++/79206 * include/experimental/string_view (operator==): Check sizes first. * include/std/string_view (operator==): Likewise. diff --git a/libstdc++-v3/include/experimental/string_view b/libstdc++-v3/include/experimental/string_view index eaff0cc..2a2364c 100644 --- a/libstdc++-v3/include/experimental/string_view +++ b/libstdc++-v3/include/experimental/string_view @@ -456,19 +456,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION inline bool operator==(basic_string_view<_CharT, _Traits> __x, basic_string_view<_CharT, _Traits> __y) noexcept -{ return __x.compare(__y) == 0; } +{ return __x.size() == __y.size() && __x.compare(__y) == 0; } template inline bool operator==(basic_string_view<_CharT, _Traits> __x, __detail::__idt > __y) noexcept -{ return __x.compare(__y) == 0; } +{ return __x.size() == __y.size() && __x.compare(__y) == 0; } template inline bool operator==(__detail::__idt > __x, basic_string_view<_CharT, _Traits> __y) noexcept -{ return __x.compare(__y) == 0; } +{ return __x.size() == __y.size() && __x.compare(__y) == 0; } template inline bool diff --git a/libstdc++-v3/include/std/string_view b/libstdc++-v3/include/std/string_view index 9eee528..a719185 100644 --- a/libstdc++-v3/include/std/string_view +++ b/libstdc++-v3/include/std/string_view @@ -453,19 +453,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION inline bool operator==(basic_string_view<_CharT, _Traits> __x, basic_string_view<_CharT, _Traits> __y) noexcept -{ return __x.compare(__y) == 0; } +{ return __x.size() == __y.size() && __x.compare(__y) == 0; } template inline bool operator==(basic_string_view<_CharT, _Traits> __x, __detail::__idt > __y) noexcept -{ return __x.compare(__y) == 0; } +{ return __x.size() == __y.size() && __x.compare(__y) == 0; } template inline bool operator==(__detail::__idt > __x, basic_string_view<_CharT, _Traits> __y) noexcept -{ return __x.compare(__y) == 0; } +{ return __x.size() == __y.size() && __x.compare(__y) == 0; } template inline bool
Re: [C++ PATCH] 79118 bitfields & constexpr
On 01/24/2017 05:49 AM, Jakub Jelinek wrote: ((BIT_FIELD_REF ^ BIT_FIELD_REF ) & 110) == 0 out of that. So unless we DTRT (i.e. save constexpr bodies before cp_fold for constexpr evaluation purposes), the workaround would need to handle this properly (basically pattern recognize whatever the for avoidance of doubt, I'm arguing that such folding is premature in the face of contexpr. nathan -- Nathan Sidwell
[arm-embedded][committed] Update Coprocessor Intrinsics code after mainline changes
Hi, I committed this patch to the embedded-6-branch to update this branch's version of the Coprocessor Intrinsics implementation. The code committed earlier to implement the Coprocessor Intrinsics was based on a version of the mainline patch that had not been upstreamed yet and that patch changed since then, this patch makes the necessary changes such that they are equivalent. gcc/ChangeLog.arm: 2017-01-24 Andre Vieira* config/arm/arm.md (*ldcstc): Split into ... (*ldc): ... this and ... (*stc): ... this. (ldcstc): Split into ... (ldc): ... this and ... (stc): ... this. (cdp,*ldc,*stc,mrc,mcr,mrrc,mcrr): Add operand constraints. (mrc, mrrc): Add source mode to coprocessor pattern SETs. * config/arm/arm.c (arm_coproc_builtin_available): Put function name on new line and fix availability of MCRR2 and MRRC2 builtins. (arm_coproc_ldc_stc_legitimate_address): Put function name on new line. * config/arm/arm-builtins.c (arm_type_qualifiers): Style fix. * config/arm/arm_acle.h: Fix availability of __arm_mcrr2 and __arm_mrrc2 intrinsics. * config/arm/constraints.md (Uz): Finish sentence explaining the constraint. * config/arm/iterators.md (LDCSTCI,LDCSTC,ldcstc): Split into ... (LDCI,LDC,ldc): ... this and ... (STCI,STC,stc): ... this. * gcc/doc/sourcebuild.texi (arm_coproc2_ok,arm_coproc3_ok): Fix language. (arm_coproc4_ok): New. gcc/testsuite/ChangeLog.arm 2017-01-24 Andre Vieira * lib/target-supports.exp (arm_coproc2_ok,arm_coproc3_ok): Fix language in comments. (arm_coproc4_ok): New. diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index f569dd98ac7092248aa91d3ad2aee9921d3d0859..ca622519b7de95a2585caa0db6e5591dba30b73e 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -49,7 +49,7 @@ enum arm_type_qualifiers qualifier_const = 0x2, /* 1 << 1 */ /* T *foo. */ qualifier_pointer = 0x4, /* 1 << 2 */ - /* const T * foo */ + /* const T * foo. */ qualifier_const_pointer = 0x6, /* Used when expanding arguments if an operand could be an immediate. */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index a76c950c53dba315ea051cd451a64173025b89d9..418f1eabfc4f057f33dfc941e99b8292a7f3fd5e 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -31506,7 +31506,8 @@ arm_elf_section_type_flags (tree decl, const char *name, int reloc) false otherwise. If a BUILTIN is passed for which this function has not been implemented it will cause an exception. */ -bool arm_coproc_builtin_available (enum unspecv builtin) +bool +arm_coproc_builtin_available (enum unspecv builtin) { /* None of these builtins are available in Thumb mode if the target only supports Thumb-1. */ @@ -31538,14 +31539,17 @@ bool arm_coproc_builtin_available (enum unspecv builtin) return true; break; case VUNSPEC_MCRR: - case VUNSPEC_MCRR2: case VUNSPEC_MRRC: - case VUNSPEC_MRRC2: /* Only present in ARMv5TE, ARMv6 (but not ARMv6-M), ARMv7* and ARMv8-{A,M}. */ if (arm_arch6 || arm_arch5te) return true; break; + case VUNSPEC_MCRR2: + case VUNSPEC_MRRC2: + if (arm_arch6) + return true; + break; default: gcc_unreachable (); } @@ -31555,7 +31559,8 @@ bool arm_coproc_builtin_available (enum unspecv builtin) /* This function returns true if OP is a valid memory operand for the ldc and stc coprocessor instructions and false otherwise. */ -bool arm_coproc_ldc_stc_legitimate_address (rtx op) +bool +arm_coproc_ldc_stc_legitimate_address (rtx op) { int range; /* Has to be a memory operand. */ @@ -31585,7 +31590,7 @@ bool arm_coproc_ldc_stc_legitimate_address (rtx op) range = INTVAL (op); /* Within the range of [-1020,1020]. */ - if (range < -1020 || range > 1020) + if (!IN_RANGE (range, -1020, 1020)) return false; /* And a multiple of 4. */ diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 28f063c7a6bad7554b969518540b2869334ac7f8..3ff77f2bf09734f8a113ce969efa959f185ec443 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -11496,12 +11496,12 @@ }) (define_insn "" - [(unspec_volatile [(match_operand:SI 0 "immediate_operand") -(match_operand:SI 1 "immediate_operand") -(match_operand:SI 2 "immediate_operand") -(match_operand:SI 3 "immediate_operand") -(match_operand:SI 4 "immediate_operand") -(match_operand:SI 5 "immediate_operand")] CDPI)] + [(unspec_volatile [(match_operand:SI 0 "immediate_operand" "n") +
Re: [C++ PATCH] 79118 bitfields & constexpr
On Tue, Jan 24, 2017 at 11:54:09AM +0100, Richard Biener wrote: > > Please note that say for: > > struct S { int : 1; int a : 3; int : 1; int b : 2; }; > > > > int > > foo (S a, S b) > > { > > return a.a == b.a && a.b == b.b; > > } > > > > (haven't tried to turn that into a constexpr testcase, but shouldn't be > > hard), the folding creates > > ((BIT_FIELD_REF ^ BIT_FIELD_REF ) & 110) == 0 > > out of that. So unless we DTRT (i.e. save constexpr bodies before > > cp_fold for constexpr evaluation purposes), the workaround would need > > to handle this properly (basically pattern recognize whatever the > > folding may create for the bitfield tests and undo it or track bitwise which > > bits are known to be constexpr and which bits are undefined and during > > the BIT_AND_EXPR testing verify it isn't asking for any bits that aren't > > constexpr). And that there can be multiple bitfields covered by the same > > BIT_FIELD_REF. > > Just chiming in to say (again) this folding is bad and/or premature. A way > to avoid it for C++ would be to move it to gimplification? It won't > ever trigger > on GIMPLE. Not that I like gimplification specific "simplifications" too > much. Sure, it won't hurt to move this optimizations later if it will not be much harder, on the other side, there are many other reasons why constexpr evaluation should be done on unfolded bodies (e.g., if there is an out of bound array access or some other undefined behavior somewhere where it is folded away (say (i = 5; j = arr[i] & 0;) on int arr[3];), C++ should still error out during constexpr evaluation of it, while with folded trees it just won't see it. Jakub
Re: [C++ PATCH] 79118 bitfields & constexpr
On Tue, Jan 24, 2017 at 11:49 AM, Jakub Jelinekwrote: > On Mon, Jan 23, 2017 at 08:49:43AM -0500, Nathan Sidwell wrote: >> This patch addresses 79118, where we ICE on a constexpr involving bitfields >> in an unnamed struct (unnamed struct members are a g++ extension). >> >> This is really a band-aid, because our internal representation BITFIELD_REF >> and the (premature) optimizations it encourages is incompatible with >> constexpr requirements. There's already signs of that with Jakub's code to >> deal with fold turning things like: >> int foo: 5; >> int baz: 3; >> ... >> ptr->baz == cst >> turns into the moral equivalent of (little endian example) >> ((*(unsigned char *)ptr & 0xe0) == (cst << 5) >> >> In this particular case we've also made the base object the containing >> class, not the unnamed struct member. That means we're looking in the wrong >> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true. Whereas for the >> subobj's CONSTRUCTOR it is false. With this patch we end up thinking this >> is an OK constexpr of value zero, whereas it's actually an uninitialized >> bitfield read. >> >> But I think we could already make that mistake given the above example & >> fold's behaviour. If 'ptr->foo' has been initialized, we'll construct a >> value using just that field and think we have a valid initialization of >> ptr->baz too. >> >> The equivalent testcase using non-bitfields has a base object of the unnamed >> struct member, and behaves correctly (given this is an extension anyway). >> >> The right solution is to fix the IR. In the C++ FE have BITFIELD_REF (or a >> new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I >> suspect lots of places think ADDR (COMPONENT_REF (...)) is legit). And >> lower it to the middle-end generic representation in cp_genericize. I don't >> think this is the right time for a change of that magnitude though. >> >> Perhaps for now we should just always err on the side of safety, and behave >> as-if uninitialized if we fall of the end of looking for a bitfield? > > Please note that say for: > struct S { int : 1; int a : 3; int : 1; int b : 2; }; > > int > foo (S a, S b) > { > return a.a == b.a && a.b == b.b; > } > > (haven't tried to turn that into a constexpr testcase, but shouldn't be > hard), the folding creates > ((BIT_FIELD_REF ^ BIT_FIELD_REF ) & 110) == 0 > out of that. So unless we DTRT (i.e. save constexpr bodies before > cp_fold for constexpr evaluation purposes), the workaround would need > to handle this properly (basically pattern recognize whatever the > folding may create for the bitfield tests and undo it or track bitwise which > bits are known to be constexpr and which bits are undefined and during > the BIT_AND_EXPR testing verify it isn't asking for any bits that aren't > constexpr). And that there can be multiple bitfields covered by the same > BIT_FIELD_REF. Just chiming in to say (again) this folding is bad and/or premature. A way to avoid it for C++ would be to move it to gimplification? It won't ever trigger on GIMPLE. Not that I like gimplification specific "simplifications" too much. Richard. > > Jakub
Re: [PATCH PR79159]Fix spurious array bound warning.
On Tue, Jan 24, 2017 at 11:12 AM, Bin Chengwrote: > Hi, > Given test as reported in PR79159: > > void foo(float tmpCorr[9][9]); > float bar; > > void finalDigits(int& n) > { > float tmpCorr[9][9] = {{0}}; > > foo(tmpCorr); > for (int i = 0; i < n; i++) { > for (int j = i+1; j < n; j++) { > bar = tmpCorr[i][j]; > } > } > } > Pass cunrolli unrolls the inner loop with unrolling number 9 which is > inferred from bound of local array definition: "tmpCorr[9][9]". In fact, it > only needs to be unrolled by 8 times because the starting value of "j" is 1. > However, loop niter analyzer fails to compute the accurate niter bound > because cunrolli is before vrp pass and it doesn't know anything about outer > loop's induction variable in inner loop handling. This patch computes init > value of induction variable and uses that to improve boundary analysis. > Bootstrap and test on x86_64 and AArch64. Is it OK? Ok. Thanks, Richard. > Thanks, > bin > > 2017-01-23 Bin Cheng > > PR tree-optimization/79159 > * tree-ssa-loop-niter.c (get_cst_init_from_scev): New function. > (record_nonwrapping_iv): Imporve boundary using above function if no > value range information. > > gcc/testsuite/ChangeLog > 2017-01-23 Bin Cheng > > PR tree-optimization/79159 > * g++.dg/tree-ssa/pr79159.C: New test.
Re: [C++ PATCH] 79118 bitfields & constexpr
On Mon, Jan 23, 2017 at 08:49:43AM -0500, Nathan Sidwell wrote: > This patch addresses 79118, where we ICE on a constexpr involving bitfields > in an unnamed struct (unnamed struct members are a g++ extension). > > This is really a band-aid, because our internal representation BITFIELD_REF > and the (premature) optimizations it encourages is incompatible with > constexpr requirements. There's already signs of that with Jakub's code to > deal with fold turning things like: > int foo: 5; > int baz: 3; > ... > ptr->baz == cst > turns into the moral equivalent of (little endian example) > ((*(unsigned char *)ptr & 0xe0) == (cst << 5) > > In this particular case we've also made the base object the containing > class, not the unnamed struct member. That means we're looking in the wrong > CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true. Whereas for the > subobj's CONSTRUCTOR it is false. With this patch we end up thinking this > is an OK constexpr of value zero, whereas it's actually an uninitialized > bitfield read. > > But I think we could already make that mistake given the above example & > fold's behaviour. If 'ptr->foo' has been initialized, we'll construct a > value using just that field and think we have a valid initialization of > ptr->baz too. > > The equivalent testcase using non-bitfields has a base object of the unnamed > struct member, and behaves correctly (given this is an extension anyway). > > The right solution is to fix the IR. In the C++ FE have BITFIELD_REF (or a > new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I > suspect lots of places think ADDR (COMPONENT_REF (...)) is legit). And > lower it to the middle-end generic representation in cp_genericize. I don't > think this is the right time for a change of that magnitude though. > > Perhaps for now we should just always err on the side of safety, and behave > as-if uninitialized if we fall of the end of looking for a bitfield? Please note that say for: struct S { int : 1; int a : 3; int : 1; int b : 2; }; int foo (S a, S b) { return a.a == b.a && a.b == b.b; } (haven't tried to turn that into a constexpr testcase, but shouldn't be hard), the folding creates ((BIT_FIELD_REF ^ BIT_FIELD_REF ) & 110) == 0 out of that. So unless we DTRT (i.e. save constexpr bodies before cp_fold for constexpr evaluation purposes), the workaround would need to handle this properly (basically pattern recognize whatever the folding may create for the bitfield tests and undo it or track bitwise which bits are known to be constexpr and which bits are undefined and during the BIT_AND_EXPR testing verify it isn't asking for any bits that aren't constexpr). And that there can be multiple bitfields covered by the same BIT_FIELD_REF. Jakub
Re: A + B CMP A -> A CMP' CST' match.pd patterns [was [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)]
On Tue, Jan 24, 2017 at 1:03 AM, Jeff Lawwrote: > On 01/21/2017 01:00 AM, Marc Glisse wrote: >> >> On Fri, 20 Jan 2017, Jeff Law wrote: >> >>> On 01/20/2017 04:30 PM, Jeff Law wrote: Going to work from the self-contained test... > > Here's a test case that's closer to the one from the bug. It also > ends up with the out of bounds memset even at -O1, during PRE. > > typedef __SIZE_TYPE__ size_t; > > struct S > int *p0, *p1, *p2; > > size_t size () const { return p1 - p0; } > > void f (size_t n) { > if (n > size ()) // can't happen because > foo (n - size ()); // n is in [1, MIN(size() - 1, 3)] > else if (n < size ()) > bar (p0 + n); > } > > void foo (size_t n) > { > size_t left = (size_t)(p2 - p1); > if (left >= n) > __builtin_memset (p2, 0, n * sizeof *p2); > } > > void bar (int*); > }; > > void f (S ) > { > size_t n = s.size (); > if (n > 1 && n < 5) > s.f (n - 1); > } Looking at the .vrp1 dump for this test we find: ;; basic block 3, loop depth 0, count 0, freq 3664, maybe hot ;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) ;;pred: 2 [36.6%] (TRUE_VALUE,EXECUTABLE) _18 = ASSERT_EXPR <_13, _13 + 18446744073709551614 <= 2>; _2 = _18 + 18446744073709551615; if (_2 > _18) goto ; [50.00%] else goto ; [50.00%] _2 > _18 is the same as (_18 - 1) > _18 Which is only true if _18 is zero. And since _18 has a computed range of [2..4], that can never happen. VRP doesn't currently handle this case, though we do have some code to turn relational comparisons into equality comparisons. If I manually turn the relational into _18 == 0 at that point, then the next VRP pass is able to optimize the conditional away which in turn eliminates the problematical memset call & warning. So one possible approach would be to treat simplify_cond_using_ranges to simplify that kind of condition. I don't know if that would fix the reported testcase as I haven't looked at it directly under the debugger yet. >>> >>> In fact, match.pd already has a pattern to handle a relational like >>> (X + -1) > X >>> >>> The pattern doesn't fire because of a single_use guard. This was >>> discussed back in 2015 with Jakub advocating for a single_use guard, >>> but Marc leaning the other direction, but not enough to argue too much >>> about it. >>> >>> In early 2016 Marc actually submitted the match.pd patch with the >>> single use guard per Jakub's preference. >>> >>> The single use guard checks the SSA_NAME holding the (X + CST) >>> expression for single use. It's not entirely clear why we check for >>> single use. If it's because of object lifetimes, then that's a total >>> red herring here. >>> >>> Given something like: >>> x = a + c >>> if (a > x) >>> >>> The transformation into >>> >>> x = a + c; >>> if (a < c') >>> >>> Where x has multiple uses, the transformation will not inherently >>> change the lifetime of "x" worse (if the conditional was the last use, >>> then the lifetime of x may shorten somewhat). If there were uses >>> after the condition, the lifetime of "x" remains unchanged. >> >> >> I don't remember the conversation and I don't have experience with >> register pressure. My assumption would be that the issue is with the new >> constant c'. In the first case, during the comparison, only a and x are >> alive, while in the second case, c' is as well, which may use an extra >> register. Does that make sense? >> >> (I don't know if that's bad enough to inhibit the simplification) >> >> Or are we talking about the transformation that has this comment in >> match.pd: >> >>Currently restricted to single use so as not to interfere too much with >>ADD_OVERFLOW detection in tree-ssa-math-opts.c. >> >> That detection logic needs to be improved anyway (see also a discussion >> with Vincent Lefevre yesterday on gcc-help, users do write the >> simplified form directly), the single_use is supposedly only there until >> that is done (really depends on how well the improved logic works...). >> >>> "a" was already live from the assignment to at least the conditional. >>> But again, we haven't inherently changed its lifetime. >>> >>> However, what can happen is after transformation, the assignment might >>> sink to a later use point in the CFG, which might lengthen the >>> lifetime of "a", but it's also shortening the lifetime of "x" by the >>> same amount. >>> >>> So, ultimately I don't buy the argument that we should inhibit this >>> based on register lifetime issues. >>> >>> Jakub, perhaps you remember why you wanted this restricted to cases >>> where "x" has a single use? > > So I went and
[PATCH] Fix -Wimplicit-fallthrough in soft-fp (Re: implicit-fallthrough warnings in powerpc64le-linux GCC build)
Hi! On Tue, Jan 24, 2017 at 10:59:38AM +0100, Sebastian Huber wrote: ... > mulkf3-sw.c:48:3: note: in expansion of macro ‘FP_MUL_Q’ >FP_MUL_Q (R, A, B); >^~~~ > /home/sh/gcc-git/libgcc/soft-fp/op-common.h:913:10: warning: this statement > may fall through [-Wimplicit-fallthrough=] > R##_s = Y##_s; \ > /home/sh/gcc-git/libgcc/soft-fp/quad.h:308:29: note: in expansion of macro > ‘_FP_MUL’ > # define FP_MUL_Q(R, X, Y) _FP_MUL (Q, 2, R, X, Y) > ^~~ > mulkf3-sw.c:48:3: note: in expansion of macro ‘FP_MUL_Q’ >FP_MUL_Q (R, A, B); >^~~~ > /home/sh/gcc-git/libgcc/soft-fp/op-common.h:915:2: note: here > case _FP_CLS_COMBINE (FP_CLS_NORMAL, FP_CLS_INF): \ > ^ > /home/sh/gcc-git/libgcc/soft-fp/quad.h:308:29: note: in expansion of macro > ‘_FP_MUL’ > # define FP_MUL_Q(R, X, Y) _FP_MUL (Q, 2, R, X, Y) > ^~~ > mulkf3-sw.c:48:3: note: in expansion of macro ‘FP_MUL_Q’ >FP_MUL_Q (R, A, B); >^~~ > > I don't know this code enough to fix them. The following patch fixes the warnings (all the 5 fallthrus are intentional) for me on x86_64-linux. But I assume it needs to go into glibc first, right? 2017-01-24 Jakub Jelinek* soft-fp/op-common.h (_FP_MUL, _FP_FMA, _FP_DIV): Add /* FALLTHRU */ comments. --- libgcc/soft-fp/op-common.h.jj 2016-08-17 10:22:22.0 +0200 +++ libgcc/soft-fp/op-common.h 2017-01-24 11:18:40.284703176 +0100 @@ -1,5 +1,5 @@ /* Software floating-point emulation. Common operations. - Copyright (C) 1997-2016 Free Software Foundation, Inc. + Copyright (C) 1997-2017 Free Software Foundation, Inc. This file is part of the GNU C Library. Contributed by Richard Henderson (r...@cygnus.com), Jakub Jelinek (j...@ultra.linux.cz), @@ -898,6 +898,7 @@ case _FP_CLS_COMBINE (FP_CLS_NAN, FP_CLS_INF): \ case _FP_CLS_COMBINE (FP_CLS_NAN, FP_CLS_ZERO): \ R##_s = X##_s;\ + /* FALLTHRU */\ \ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_INF): \ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_NORMAL): \ @@ -911,6 +912,7 @@ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_NAN): \ case _FP_CLS_COMBINE (FP_CLS_ZERO, FP_CLS_NAN): \ R##_s = Y##_s;\ + /* FALLTHRU */\ \ case _FP_CLS_COMBINE (FP_CLS_NORMAL, FP_CLS_INF): \ case _FP_CLS_COMBINE (FP_CLS_NORMAL, FP_CLS_ZERO): \ @@ -1063,6 +1065,7 @@ case _FP_CLS_COMBINE (FP_CLS_NAN, FP_CLS_INF): \ case _FP_CLS_COMBINE (FP_CLS_NAN, FP_CLS_ZERO): \ _FP_FMA_T##_s = X##_s;\ + /* FALLTHRU */\ \ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_INF): \ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_NORMAL): \ @@ -1076,6 +1079,7 @@ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_NAN): \ case _FP_CLS_COMBINE (FP_CLS_ZERO, FP_CLS_NAN): \ _FP_FMA_T##_s = Y##_s;\ + /* FALLTHRU */\ \ case _FP_CLS_COMBINE (FP_CLS_NORMAL, FP_CLS_INF): \ case _FP_CLS_COMBINE (FP_CLS_NORMAL, FP_CLS_ZERO): \ @@ -1198,6 +1202,7 @@ \ case _FP_CLS_COMBINE (FP_CLS_NORMAL, FP_CLS_ZERO): \ FP_SET_EXCEPTION (FP_EX_DIVZERO); \ + /* FALLTHRU */\ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_ZERO): \ case _FP_CLS_COMBINE (FP_CLS_INF, FP_CLS_NORMAL): \ R##_c = FP_CLS_INF; \ Jakub
Re: [PATCH v2][PR lto/79061] Fix LTO plus ASAN fails with "AddressSanitizer: initialization-order-fiasco".
On Tue, Jan 24, 2017 at 01:10:46PM +0300, Maxim Ostapenko wrote: > gcc/fortran/ChangeLog: > > 2017-01-24 Maxim Ostapenko> > * f95-lang.c (gfc_create_decls): Include stringpool.h. > Pass main_input_filename to build_translation_unit_decl. > > gcc/ada/ChangeLog: > > 2017-01-24 Maxim Ostapenko > > * gcc-interface/utils.c (get_global_context): pass main_input_filename > to build_translation_unit_decl. s/pass/Pass/, after ): ChangeLog entries start with capital letter. > gcc/c/ChangeLog: > > 2017-01-24 Maxim Ostapenko > > * c-decl.c (pop_scope): pass main_input_filename to > build_translation_unit_decl. > > gcc/cp/ChangeLog: > > 2017-01-24 Maxim Ostapenko > > * decl.c (cxx_init_decl_processing): pass main_input_filename > to build_translation_unit_decl. Likewise 2x. > > gcc/ChangeLog: > > 2017-01-24 Maxim Ostapenko > > * asan.c (get_translation_unit_decl): New function. > (asan_add_global): Extract modules file name from globals > TRANSLATION_UNIT_DECL name. Ok for trunk with that. Jakub
Re: [PATCH v2][PR lto/79061] Fix LTO plus ASAN fails with "AddressSanitizer: initialization-order-fiasco".
On Tue, 24 Jan 2017, Maxim Ostapenko wrote: > Hi, > > as Richard pointed out in previous mail, v1 patch regresses compile-time by > 100% on some fortran cases (PR 79165). > Doing the linemap operation in build_translation_unit_decl confuses the > linemap code too much, thus we propagate main_input_filename through DECL_NAME > of TRANSLATION_UNIT_DECL instead of creating new source location. > OK to install if regtest and bootstrap complete successfully? I've also > checked that new patch doesn't introduce compile-time regression on aermod.f90 > testcase. + const_tree translation_unit_decl = get_translation_unit_decl (decl); + if (translation_unit_decl) + filename = IDENTIFIER_POINTER (DECL_NAME (translation_unit_decl)); please also check DECL_NAME is not NULL. Ok with that change. Thanks, Richard. > Thanks, > -Maxim > -- Richard BienerSUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PR middle-end/79123] cast false positive in -Walloca-larger-than=
On Mon, Jan 23, 2017 at 5:07 PM, Jeff Lawwrote: > On 01/23/2017 02:50 AM, Richard Biener wrote: So ideally DOM and EVRP would merge and CCP, copyprop and VRP would merge. It's probably not possible (or wise) to merge their lattices (maybe to some extent) >>> >>> >>> DOM's equivalency tables could be easily superceded by other >>> implementations. It's just two datastructures. A trivial const/copy >>> table >>> and a stack to allow unwinding the state of the const/copy table. >>> Throwing >>> away the basic const/copy table and replacing it with something built by >>> copyprop ought to be trivial. >>> >>> The big thing that would be left would be the scoped hash table for >>> tracking >>> redundant expressions. But I don't think that we'd necessarily have to >>> rip >>> that out to merge DOM with one of hte other passes. >>> >>> Hell we know DOM can fit well in any dominator based walk -- we used to >>> do >>> it as part of the into-ssa transformation. >> >> >> Sure. >> >> The question is whether we want to make warning "passes" more expensive >> by essentially doing a [E]VRP/DOM pass (but not doing any IL transform). >> >> I believe doing that is more reasonable than doing their own hacky >> analysis. >> >> Having less passes to choose to copy from for such static analysis (and >> the >> ability to tame compile-time by some switches) would be a good thing to >> have. > > Very true. But your comment seemed to be about merging DOM and EVRP or > other passes -- which is certainly possible and possibly even desirable. > > The more I look at our SCCVN implementation, the more I want to explore > trying to re-use that infrastructure to simplify DOM. Certainly having a single way to hash/record stmts/expressions on GIMPLE would be nice. Not sure if the SCCVN one is perfect (enhancing the memory part further is on my TODO list). But I also want to merge SCCVN and DOM in the end. Well, rip out the 'SCC' part of SCCVN and replace it with a RPO style VN which makes it very similar to DOM. Ideally we'd have one "VN" we can configure (like enable a VRP lattice, disable memory handling to get a pure CCP, disable optimistic VN and thus require no iteration, etc.). It may make sense to keep (a similarly single!) SSA-propagator based VN (ccp/copyprop/VRP). Richard. > > Jeff
[PATCH PR79159]Fix spurious array bound warning.
Hi, Given test as reported in PR79159: void foo(float tmpCorr[9][9]); float bar; void finalDigits(int& n) { float tmpCorr[9][9] = {{0}}; foo(tmpCorr); for (int i = 0; i < n; i++) { for (int j = i+1; j < n; j++) { bar = tmpCorr[i][j]; } } } Pass cunrolli unrolls the inner loop with unrolling number 9 which is inferred from bound of local array definition: "tmpCorr[9][9]". In fact, it only needs to be unrolled by 8 times because the starting value of "j" is 1. However, loop niter analyzer fails to compute the accurate niter bound because cunrolli is before vrp pass and it doesn't know anything about outer loop's induction variable in inner loop handling. This patch computes init value of induction variable and uses that to improve boundary analysis. Bootstrap and test on x86_64 and AArch64. Is it OK? Thanks, bin 2017-01-23 Bin ChengPR tree-optimization/79159 * tree-ssa-loop-niter.c (get_cst_init_from_scev): New function. (record_nonwrapping_iv): Imporve boundary using above function if no value range information. gcc/testsuite/ChangeLog 2017-01-23 Bin Cheng PR tree-optimization/79159 * g++.dg/tree-ssa/pr79159.C: New test.diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr79159.C b/gcc/testsuite/g++.dg/tree-ssa/pr79159.C new file mode 100644 index 000..e15e117 --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr79159.C @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -Wall" } */ + +void foo(float tmpCorr[9][9]); +float bar; + +void finalDigits(int& n) +{ + float tmpCorr[9][9] = {{0}}; + + foo(tmpCorr); + for (int i = 0; i < n; i++) { +for (int j = i+1; j < n; j++) { + bar = tmpCorr[i][j]; +} + } +} + diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 8328625..efcf3ed 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -3059,6 +3059,43 @@ record_control_iv (struct loop *loop, struct tree_niter_desc *niter) return; } +/* This function returns TRUE if below conditions are satisfied: + 1) VAR is SSA variable. + 2) VAR is an IV:{base, step} in its defining loop. + 3) IV doesn't overflow. + 4) Both base and step are integer constants. + 5) Base is the MIN/MAX value depends on IS_MIN. + Store value of base to INIT correspondingly. */ + +static bool +get_cst_init_from_scev (tree var, wide_int *init, bool is_min) +{ + if (TREE_CODE (var) != SSA_NAME) +return false; + + gimple *def_stmt = SSA_NAME_DEF_STMT (var); + struct loop *loop = loop_containing_stmt (def_stmt); + + if (loop == NULL) +return false; + + affine_iv iv; + if (!simple_iv (loop, loop, var, , false)) +return false; + + if (!iv.no_overflow) +return false; + + if (TREE_CODE (iv.base) != INTEGER_CST || TREE_CODE (iv.step) != INTEGER_CST) +return false; + + if (is_min == tree_int_cst_sign_bit (iv.step)) +return false; + + *init = iv.base; + return true; +} + /* Record the estimate on number of iterations of LOOP based on the fact that the induction variable BASE + STEP * i evaluated in STMT does not wrap and its values belong to the range . REALISTIC is true if the @@ -3100,7 +3137,8 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple *stmt, if (TREE_CODE (orig_base) == SSA_NAME && TREE_CODE (high) == INTEGER_CST && INTEGRAL_TYPE_P (TREE_TYPE (orig_base)) - && get_range_info (orig_base, , ) == VR_RANGE + && (get_range_info (orig_base, , ) == VR_RANGE + || get_cst_init_from_scev (orig_base, , false)) && wi::gts_p (high, max)) base = wide_int_to_tree (unsigned_type, max); else if (TREE_CODE (base) != INTEGER_CST @@ -3117,7 +3155,8 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple *stmt, if (TREE_CODE (orig_base) == SSA_NAME && TREE_CODE (low) == INTEGER_CST && INTEGRAL_TYPE_P (TREE_TYPE (orig_base)) - && get_range_info (orig_base, , ) == VR_RANGE + && (get_range_info (orig_base, , ) == VR_RANGE + || get_cst_init_from_scev (orig_base, , true)) && wi::gts_p (min, low)) base = wide_int_to_tree (unsigned_type, min); else if (TREE_CODE (base) != INTEGER_CST
Re: [PR middle-end/79123] cast false positive in -Walloca-larger-than=
On Mon, Jan 23, 2017 at 4:55 PM, Martin Seborwrote: >>> Yes, I see that. But when I change size_t to unsigned int (in LP64) >>> like so: >>> >>> void g (int *p, int *q) >>> { >>> unsigned n = (unsigned)(p - q); >>> >>> if (n < 10) >>> f (__builtin_alloca (n)); >>> } >>> >>> -Walloca-larger-than=100 still complains: >>> >>> warning: argument to ‘alloca’ may be too large >>> note: limit is 100 bytes, but argument may be as large as 4294967295 >>> >>> and the VRP dump shows >>> >>> _5: [0, 4294967295] >>> _14: [0, 4294967295] >>> ... >>> _3 = p.0_1 - q.1_2; >>> _4 = _3 /[ex] 4; >>> n_10 = (unsigned int) _4; >>> if (n_10 <= 9) >>> goto ; [36.64%] >>> else >>> goto ; [63.36%] >>> >>> [36.64%]: >>> _14 = _4 & 4294967295; >>> _5 = (long unsigned int) _14; >>> _6 = __builtin_alloca (_5); >>> >>> so it seems that even in VRP itself the range information isn't >>> quite good enough to avoid false positives. (Perhaps that's one >>> the missed cases you mention below?) >> >> >> Well, you see that there's no obvious way to use the n_10 <= 9 conditional >> here. VRP would need to register a [0, 9] range for the lower half of >> n_10 >> and then figure after seeing >> >>> _14 = _4 & 4294967295; >> >> >> that _14 is now [0, 9]. >> >> That's a neat trick VRP cannot handle at the moment (there's no way the >> lattice can encode a range for a sub-set of all bits of a SSA name). > > > Sure. My point is just that it looks like there will be some false > positives whether the warnings are implemented as part of the VRP > pass or elsewhere. (I admit I haven't studied the implementation > of the VRP pass to understand whether these false positives are > avoidable and I'm happy to take your word if if you say they can.) > >> Your source is bogus in the way that (unsigned)(p - q) might truncate >> the pointer difference. > > > The problem is independent of the pointer difference and can be > reproduced by any conversion from a wider type to a narrower > unsigned type (even the safe ones), as the test case in bug 79191 > I opened for it shows: > > void f (unsigned long n) > { > if (n > 3) > __builtin_abort (); > } > > void h (unsigned long m) > { > unsigned n = m; > > if (n < 3) > f (n); > } Sure, but that's essentially the same issue: [100.00%]: n_2 = (unsigned int) m_1(D); if (n_2 <= 2) goto ; [54.00%] else goto ; [46.00%] [54.00%]: m_6 = ASSERT_EXPR ; _5 = m_6 & 4294967295; if (_5 > 3) we can't express that m_6 has a known range for bits 0..31. But of course this shows that the canonicalization of truncation to bit-and is bad or happening too early. w/o it we have [0.00%]: n_4 = (unsigned int) m_3(D); if (n_4 <= 2) goto ; [0.00%] else goto ; [0.00%] [100.00%]: _1 = (long unsigned int) n_4; if (_1 > 3) which we can perfectly optimize (a much loved single-use test could have us here). Index: gcc/match.pd === --- gcc/match.pd(revision 244859) +++ gcc/match.pd(working copy) @@ -1840,7 +1840,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && final_int && inter_int && inside_int && final_prec == inside_prec && final_prec > inter_prec -&& inter_unsignedp) +&& inter_unsignedp +&& single_use (@1)) (convert (bit_and @0 { wide_int_to_tree (inside_type, wi::mask (inter_prec, false, Richard. > Martin >
[PATCH v2][PR lto/79061] Fix LTO plus ASAN fails with "AddressSanitizer: initialization-order-fiasco".
Hi, as Richard pointed out in previous mail, v1 patch regresses compile-time by 100% on some fortran cases (PR 79165). Doing the linemap operation in build_translation_unit_decl confuses the linemap code too much, thus we propagate main_input_filename through DECL_NAME of TRANSLATION_UNIT_DECL instead of creating new source location. OK to install if regtest and bootstrap complete successfully? I've also checked that new patch doesn't introduce compile-time regression on aermod.f90 testcase. Thanks, -Maxim gcc/fortran/ChangeLog: 2017-01-24 Maxim Ostapenko* f95-lang.c (gfc_create_decls): Include stringpool.h. Pass main_input_filename to build_translation_unit_decl. gcc/ada/ChangeLog: 2017-01-24 Maxim Ostapenko * gcc-interface/utils.c (get_global_context): pass main_input_filename to build_translation_unit_decl. gcc/c/ChangeLog: 2017-01-24 Maxim Ostapenko * c-decl.c (pop_scope): pass main_input_filename to build_translation_unit_decl. gcc/cp/ChangeLog: 2017-01-24 Maxim Ostapenko * decl.c (cxx_init_decl_processing): pass main_input_filename to build_translation_unit_decl. gcc/ChangeLog: 2017-01-24 Maxim Ostapenko * asan.c (get_translation_unit_decl): New function. (asan_add_global): Extract modules file name from globals TRANSLATION_UNIT_DECL name. diff --git a/gcc/ada/gcc-interface/utils.c b/gcc/ada/gcc-interface/utils.c index 0ae381f..3cda631 100644 --- a/gcc/ada/gcc-interface/utils.c +++ b/gcc/ada/gcc-interface/utils.c @@ -666,7 +666,8 @@ get_global_context (void) { if (!global_context) { - global_context = build_translation_unit_decl (NULL_TREE); + global_context + = build_translation_unit_decl (get_identifier (main_input_filename)); debug_hooks->register_main_translation_unit (global_context); } diff --git a/gcc/asan.c b/gcc/asan.c index 486ebfd..e1475bd 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -2373,6 +2373,22 @@ asan_needs_odr_indicator_p (tree decl) && TREE_PUBLIC (decl)); } +/* For given DECL return its corresponding TRANSLATION_UNIT_DECL. */ + +static const_tree +get_translation_unit_decl (tree decl) +{ + const_tree context = decl; + while (context && TREE_CODE (context) != TRANSLATION_UNIT_DECL) +{ + if (TREE_CODE (context) == BLOCK) + context = BLOCK_SUPERCONTEXT (context); + else + context = get_containing_scope (context); +} + return context; +} + /* Append description of a single global DECL into vector V. TYPE is __asan_global struct type as returned by asan_global_struct. */ @@ -2392,7 +2408,14 @@ asan_add_global (tree decl, tree type, vec *v) pp_string (_pp, ""); str_cst = asan_pp_string (_pp); - pp_string (_name_pp, main_input_filename); + const char *filename = main_input_filename; + if (in_lto_p) +{ + const_tree translation_unit_decl = get_translation_unit_decl (decl); + if (translation_unit_decl) + filename = IDENTIFIER_POINTER (DECL_NAME (translation_unit_decl)); +} + pp_string (_name_pp, filename); module_name_cst = asan_pp_string (_name_pp); if (asan_needs_local_alias (decl)) diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c index 2f91e70..32edacc 100644 --- a/gcc/c/c-decl.c +++ b/gcc/c/c-decl.c @@ -1177,7 +1177,8 @@ pop_scope (void) context = current_function_decl; else if (scope == file_scope) { - tree file_decl = build_translation_unit_decl (NULL_TREE); + tree file_decl + = build_translation_unit_decl (get_identifier (main_input_filename)); context = file_decl; debug_hooks->register_main_translation_unit (file_decl); } diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 792ebcc..af74dcd 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -4071,7 +4071,8 @@ cxx_init_decl_processing (void) gcc_assert (global_namespace == NULL_TREE); global_namespace = build_lang_decl (NAMESPACE_DECL, global_scope_name, void_type_node); - DECL_CONTEXT (global_namespace) = build_translation_unit_decl (NULL_TREE); + DECL_CONTEXT (global_namespace) += build_translation_unit_decl (get_identifier (main_input_filename)); debug_hooks->register_main_translation_unit (DECL_CONTEXT (global_namespace)); TREE_PUBLIC (global_namespace) = 1; diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c index 98ef837..44bd8dc 100644 --- a/gcc/fortran/f95-lang.c +++ b/gcc/fortran/f95-lang.c @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3. If not see #include "tree.h" #include "gfortran.h" #include "trans.h" +#include "stringpool.h" #include "diagnostic.h" /* For errorcount/warningcount */ #include "langhooks.h" #include "langhooks-def.h" @@ -190,7 +191,8 @@ gfc_create_decls (void) gfc_init_constants (); /* Build our translation-unit decl. */ - current_translation_unit = build_translation_unit_decl (NULL_TREE); +
Re: [PR 79108] Put ipa_node_params to GC memory
Hi, On Mon, Jan 23, 2017 at 11:05:03PM +0100, Rainer Orth wrote: > Hi Martin, > > > when I fixed PR 78365 by streaming types of parameters that might not > > have been anywhere else, I forgot that I was holding them in non-GC > > memory and so I caused PR 79108. The following patch fixes it by > > putting ipa_param_descriptor and ipa_node_params structures into GC > > memory, together with summary holding class ipa_node_params_t which is > > now a GC root. In the process, I have turned the destructor of the > > latter into a remove and insert hooks of the summary holder. > > > > Bootstrapped and tested on x86_64-linux, also LTO-bootstrapped and > > tested, although with only C,C++ and Fortran and with sanitizers and > > multilib disabled. (It is now a part of an undergoing LTO bootstrap > > with everything). OK for trunk? > > this broke bootstrap on i386-pc-solaris2.12, sparc-sun-solaris2.12, and > i686-pc-linux-gnu (probably every 32-bit host), as confirmed by an > i386-pc-solaris2.10 reghunt. > > E.g. in the Linux/i686 bootstrap, compiling tree-ssa-math-opts.o fails > with > > virtual memory exhausted: Cannot allocate memory > > In the Solaris cases, various libstdc++ source files fail to compile in > the same way. > > Rainer > This is most likely PR 79198 for which I have just committed a fix. Please let me know if any issues persist. Thanks, Martin
[PR 79198] Call ipa-prop func summary destructor
Hi, the first of the following two patches fixes PR 79198. The issue is that early inliner creates ipa_node_params_sum then and calls ipa_free_all_node_params to release it many times (I guess for each function) and because I forgot to call the destructor in the function, we ended up with multitudes of simultaneously active duplication cgraph hooks and each invocation doubled the vectors it was written to copy. Calling the destructor avoids this because it de-registers the hooks. In the next stage1 we should figure out why early inliner needs the summary and avoid it. One thing that troubles me is that if I attempt to ggc_free ipa_node_params_sum after calling the destructor, I end up with random segfault later on in the compilation. I will try to investigate a bit more but I do not have a very good comprehension of how GC and C++ classes mix. The second patch is not necessary to restore profiledbootstrap and bootstrap on many platforms, but when I looked at the code, I was afraid the two fields might not be initialized correctly, so I'd like to commit it to trunk too. Both patches have survived bootstrap, lto bootstrap and testing on x86_64 and bootstrap on i686. On IRC, Markus and David confirmed the first patches fixes the issue for them too. Because of that and because the fix is rather simple, I am going to commit both patches as obvious. Thanks and sorry for the breakage, Martin [PR 79198] Call ipa-prop func summary destructor 2017-01-23 Martin JamborPR bootstrap/79198 * ipa-prop.c (ipa_free_all_node_params): Call summary destructor. --- gcc/ipa-prop.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 512bcbed0cb..22cd61f3b18 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -3574,6 +3574,7 @@ ipa_free_all_edge_args (void) void ipa_free_all_node_params (void) { + ipa_node_params_sum->~ipa_node_params_t (); ipa_node_params_sum = NULL; } -- 2.11.0 [PR 79198] Init vectors in ipa_node_params_t::insert 2017-01-23 Martin Jambor PR bootstrap/79198 * ipa-prop.c (insert): Initialize fields known_csts and known_contexts. --- gcc/ipa-prop.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 22cd61f3b18..834c27d100e 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -3743,6 +3743,8 @@ ipa_node_params_t::insert (cgraph_node *, ipa_node_params *info) { info->lattices = NULL; info->ipcp_orig_node = NULL; + info->known_csts = vNULL; + info->known_contexts = vNULL; info->analysis_done = 0; info->node_enqueued = 0; info->do_clone_for_all_contexts = 0; -- 2.11.0
[PATCH committed] Fix build failure with MPFR 2.4.x (gimple-ssa-sprintf.c)
MPFR_RNDx was introduced in MPFR 3.0.0. Since the minimal version that gcc checks for is 2.4.0, this leads to a build failure. The fix is straightforward. Tested on x86_64-pc-linux-gnu. Committed to trunk as obvious. * gimple-ssa-sprintf.c (format_floating): Change MPFR_RNDx to GMP_RNDx for compatibility. diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c index 66edc3ec1cc9..283b6251d743 100644 --- a/gcc/gimple-ssa-sprintf.c +++ b/gcc/gimple-ssa-sprintf.c @@ -1560,7 +1560,7 @@ format_floating (const directive , tree arg) rounding in either direction can result in longer output. */ mpfr_t mpfrval; mpfr_init2 (mpfrval, rfmt->p); - mpfr_from_real (mpfrval, rvp, i ? MPFR_RNDU : MPFR_RNDD); + mpfr_from_real (mpfrval, rvp, i ? GMP_RNDU : GMP_RNDD); /* Use the MPFR rounding specifier to round down in the first iteration and then up. In most but not all cases this will -- Markus
Re: Improve things for PR71724, in combine/if_then_else_cond
On 20 January 2017 at 20:24, Segher Boessenkoolwrote: > Hi Bernd, > > On Fri, Jan 20, 2017 at 01:33:59PM +0100, Bernd Schmidt wrote: >> So, when looking for situations where we have only one condition, we can >> try to undo the conversion of a plain REG into a condition, on the >> grounds that this is probably less helpful. >> >> This seems to cure the testcase, but Segher also has a patch in the PR >> that looks like a good and more direct approach. IMO both should be >> applied. This one was bootstrapped and tested on x86_64-linux. Ok? > > My patch does not cure all problems, it simply simplifies things a bit > better; but the same is true for your patch if I read it correctly. > > Okay for trunk, and I'll do my half as well. Thanks, > > Hi, It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 gcc.target/arm/vselvsdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvssf.c scan-assembler-times vselvs.f32\ts[0-9]+ 1 Maybe the upcoming patch from Segher intends to address this? Thanks, Christophe > Segher > > >> PR rtl-optimization/71724 >> * combine.c (if_then_else_cond): Look for situations where it is >> beneficial to undo the work of one of the recursive calls.
[committed] Enable g++.dg/asan/asan_test.C test also on non-x86 targets
Hi! The target requirement in the test was meant to prevent running the test on x86 without sse2 support, but unfortunately disabled the test also on non-x86. The following test fixes that, regtested on {aarch64,x86_64,powerpc64,powerpc64le,i686,s390x}-linux, committed to trunk as obvious. The test only fails (on all arches, but on most targets just SimpleStackTest with kSize + 31) when testing with --target_board=unix/-fstack-protector-strong, without SSP it succeeds everywhere (including s390x with the asan enablement I've posted). 2017-01-24 Jakub Jelinek* g++.dg/asan/asan_test.C: Enable on all *-*-linux* targets that support asan, only on i?86/x86_64 require sse2_runtime. --- gcc/testsuite/g++.dg/asan/asan_test.C.jj2016-11-10 18:03:28.0 +0100 +++ gcc/testsuite/g++.dg/asan/asan_test.C 2017-01-23 21:47:04.941131367 +0100 @@ -1,4 +1,4 @@ -// { dg-do run { target { { i?86-*-linux* x86_64-*-linux* } && sse2_runtime } } } +// { dg-do run { target { { *-*-linux* } && { { ! { i?86-*-linux* x86_64-*-linux* } } || sse2_runtime } } } } // { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } // { dg-skip-if "" { *-*-* } { "-flto" } { "" } } // { dg-additional-sources "asan_globals_test-wrapper.cc" } Jakub