Re: [C++ PATCH] Improve location of CALL_EXPRs (PR c++/60862)

2014-09-12 Thread Jason Merrill

On 09/12/2014 04:52 PM, Marek Polacek wrote:

+   protected_set_expr_location (postfix_expression, loc);


Let's use the location of the (, which should just be token->location at 
this point.  So column 17 instead of 13 in the new test.  OK with that 
change.


In some cases postfix_expression won't be a CALL_EXPR at this point; it 
might be a TARGET_EXPR or an INDIRECT_REF.  But I suppose setting the 
location on whatever it happens to be will work well enough for now.


Jason



Re: [PATCH] Work around miscompilation with 4.8.1

2014-09-12 Thread H.J. Lu
On Fri, Sep 12, 2014 at 3:27 PM, Andi Kleen  wrote:
> From: Andi Kleen 
>
> When compiling on opensuse 13.1, with a 4.8.1 based host compiler
> and --disable-bootstrap, the generated compiler always ICEs while
> compiling __builtin_cpu_supports in the cilk runtime library.
>
> The problem is fixed with later 4.8 releases, but at least still
> happens with the opensuse compiler.
>
> The cilk library already had a fallback path for this function
> for other compilers. Just use the fallback path when
> __SSE2_MATH__ is set. This makes it work on x86_64 systems
> with the buggy 4.8.1 at least, if multilib is forced to
> SSE.
>
> libcilkrts/:
>
> 2014-09-12  Andi Kleen  
>
> PR bootstrap/63235
> * runtime/config/x86/os-unix-sysdep.c (__builtin_cpu_supports):
> Use fallback when __SSE2_MATH__ is set.
> ---
>  libcilkrts/runtime/config/x86/os-unix-sysdep.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libcilkrts/runtime/config/x86/os-unix-sysdep.c b/libcilkrts/runtime/config/x86/os-unix-sysdep.c
> index b505ddf..344c31a 100644
> --- a/libcilkrts/runtime/config/x86/os-unix-sysdep.c
> +++ b/libcilkrts/runtime/config/x86/os-unix-sysdep.c
> @@ -96,7 +96,7 @@ COMMON_SYSDEP int __cilkrts_xchg(volatile int *ptr, int x)
>   * This declaration should generate an error when the Intel compiler adds
>   * supprt for the intrinsic.
>   */
> -#ifdef __INTEL_COMPILER
> +#if defined(__INTEL_COMPILER) || defined(__SSE2_MATH__)
>  static inline int __builtin_cpu_supports(const char *feature)
>  {
>  return 1;

So GCC 4.8.1 miscompiles GCC.  Can we trust such a GCC?
Should we put a workaround in GCC itself to avoid the
bad GCC?

BTW, I ran into another miscompilation bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63252

It only shows up when GCC is compiled with -O2 using GCC 5
on x32.

-- 
H.J.


Re: [PATCH] gcc parallel make check

2014-09-12 Thread Jakub Jelinek
On Fri, Sep 12, 2014 at 04:42:25PM -0700, Mike Stump wrote:
> curious, when I run atomic.exp=stdatom\*.c:
> 
>   gcc.dg/atomic/atomic.exp completed in 30 seconds.
> 
> atomic.exp=c\*.c takes 522 seconds with 3, 2, 5 and 4 being the worst 
> offenders.

That's the
@if [ -z "$(filter-out --target_board=%,$(filter-out --extra_opts%,$(RUNTESTFLAGS)))" ] \
&& [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
i.e. if you specify anything in RUNTESTFLAGS other than --target_board= or
--extra_opts, it is not parallelized.  This was done previously because
parallelization required setting the flags to something different (manually
created *.exp list).  The first [] could perhaps be removed now; if one uses e.g.
RUNTESTFLAGS=atomic.exp etc. with sufficiently many tests, parallelization
will still be worth it.  I've been worried about the quick cases where
parallelization is not beneficial, like make check-gcc \
RUNTESTFLAGS=dg.exp=pr60123.c or similar, but one doesn't usually pass -jN
in that case.  So yes, the
[ -z "$(filter-out --target_board=%,$(filter-out --extra_opts%,$(RUNTESTFLAGS)))" ]
can be dropped (not in libstdc++ though, there are abi.exp and
prettyprinters.exp still run serially, though even that could be handled the
struct-layout-1.exp way, of running it by the first instance to encounter
those with small changes in those *.exp files).
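
The condition discussed above can be modelled outside of make.  This is only a rough sketch of the logic, not the Makefile fragment itself: the function name is made up for illustration, words are split on whitespace, and the % patterns are treated as simple prefix matches.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// True if WORD begins with PREFIX (models a make pattern such as "--extra_opts%").
static bool starts_with (const std::string &word, const std::string &prefix)
{
  return word.compare (0, prefix.size (), prefix) == 0;
}

// Hypothetical model of the current check: parallelize only when
// RUNTESTFLAGS holds nothing but --target_board=... / --extra_opts...
// words, and MFLAGS contains the word "-j".
bool sketch_should_parallelize (const std::string &runtestflags,
                                const std::string &mflags)
{
  std::istringstream flags (runtestflags);
  std::string word;
  while (flags >> word)
    if (!starts_with (word, "--target_board=")
        && !starts_with (word, "--extra_opts"))
      return false;   // anything else (e.g. atomic.exp=...) disables it

  std::istringstream mf (mflags);
  while (mf >> word)
    if (word == "-j")
      return true;
  return false;
}
```

Under this model RUNTESTFLAGS=atomic.exp=c\*.c never parallelizes, which matches the 522 second run quoted above; dropping the first test would leave only the -j check.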

Jakub


Re: [PATCH] gcc parallel make check

2014-09-12 Thread Mike Stump
On Sep 12, 2014, at 9:32 AM, Jakub Jelinek  wrote:
> Here is my latest version of the patch.

I did a timing test:

Before:

real    0m57.198s
user    1m24.736s
sys     0m19.816s

after:

real    0m28.224s
user    1m27.823s
sys     0m22.374s

This is a -j70 run on a 64 core power7 of check-objc, I picked an obscure test 
case that I had no reason to believe was other than ignored and certainly not 
engineered for and kinda small to ensure the overhead would penalize it…  
50.66% faster.  There is still room for improvement:

$ vmstat 1
procs -------------memory------------- ---swap-- -----io---- --system--- ------cpu------
 r  b   swpd     free    buff    cache   si   so    bi    bo    in    cs us sy  id wa st
 7  0      0 99046848 8515072 16748672    0    0     0     0     1     1  0  0 100  0  0
 7  0      0 99050432 8515072 16748736    0    0     0     0  7501  9022 13  3  84  0  0
 7  0      0 99029376 8515072 16749248    0    0     0     0  7320  8777 10  2  88  0  0
 7  0      0 99070656 8515072 16749440    0    0     0  1524  7162  8156  9  2  88  1  0
 7  0      0 99034560 8515072 16749824    0    0     0     0  8096 10363  7  2  91  0  0
 7  0      0 99030080 8515072 16750720    0    0     0     0  8798 11673  8  3  90  0  0
 9  0      0 99037376 8515072 16750080    0    0     0     0  9151 12598  9  3  87  0  0
 7  0      0 99024128 8515136 16750656    0    0     0     0  9078 13168  7  3  90  0  0
10  0      0 99034496 8515136 16751488    0    0     0  1800  8633 11675  8  3  88  1  0
 8  0      0 98986304 8515136 16751296    0    0     0     0 10159 14553  7  3  90  0  0
 7  0      0 99010112 8515520 16765824    0    0     0     0  8814 12036 10  3  87  0  0
 4  0      0 99014016 8515648 16773568    0    0     0     0  8091 10445  8  3  90  0  0
 4  0      0 99064832 8515712 16773120    0    0     0     0  5416  5071  9  2  89  0  0
 3  0      0 99118976 8515712 16773184    0    0     0 12716  4743  3533  4  1  92  2  0
 3  0      0 99077504 8515840 16773248    0    0     0     0  4525  3988  3  1  96  0  0
 2  0      0 99121152 8515840 16773824    0    0     0     0  4687  3757  3  1  97  0  0
 2  0      0 99117056 8515840 16773632    0    0     0     0  4334  3156  3  1  96  0  0
 2  0      0 99105728 8515840 16774336    0    0     0     0  4355  3246  3  1  96  0  0
 3  0      0 99069120 8515904 16773632    0    0     0   648  4902  4037  2  1  97  0  0
 1  0      0 99153664 8515968 16774592    0    0     0     0  3776  2711  2  1  97  0  0
 1  0      0 99151232 8515968 16774400    0    0     0     0   877   205  4  0  96  0  0
 1  0      0 99151424 8516032 16774528    0    0     0   236   774   466  2  0  97  0  0
 2  0      0 99148032 8516032 16774656    0    0     0     0   853   350  2  0  98  0  0
 2  0      0 99146176 8516032 16774656    0    0     0  1208  1630  1363  1  0  99  0  0
 1  0      0 99156032 8516352 16777152    0    0     0     0  1919  2104  1  0  99  0  0
 0  0      0 99189376 8516416 16776512    0    0     0     0  1181   799  2  0  98  0  0
 0  0      0 99189312 8516416 16776512    0    0     0     0   118    18  0  0 100  0  0
 0  0      0 99189312 8516416 16776512    0    0     0     0    90    18  0  0 100  0  0
 0  0      0 99187968 8516416 16776512    0    0     0  5468   196    42  0  0 100  0  0
 0  0      0 99187968 8516416 16776512    0    0     0     0    92    24  0  0 100  0  0
 0  0      0 99188032 8516416 16776512    0    0     0     0   146    37  0  0 100  0  0
 0  0      0 99188160 8516416 16776512    0    0     0   128    91    36  0  0 100  0  0
 1  0      0 99188160 8516416 16776512    0    0     0     0    74    16  0  0 100  0  0
 0  0      0 99188160 8516416 16776512    0    0     0     0    72    20  0  0 100  0  0
 0  0      0 99188224 8516416 16776512    0    0     0     0    76    22  0  0 100  0  0
 0  0      0 99188224 8516416 16776512    0    0     0     0   118    29  0  0 100  0  0

which averages to 95% idle.  I changed:

check_objc_parallelize = 6

to 

check_objc_parallelize = 70

to try and get it to go faster:

real    0m21.252s
user    3m21.035s
sys     1m9.937s

:-(  7 seconds (24.6%) faster, but consumes 146% more resources to see the 
benefit.
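
For anyone wanting to redo the arithmetic behind these percentages, here is a small sketch.  The function names are made up for illustration; the inputs are the real/user/sys figures quoted above converted to seconds.

```cpp
#include <cassert>
#include <cmath>

// Percentage by which wall-clock time dropped going from BEFORE_S to AFTER_S.
double sketch_percent_faster (double before_s, double after_s)
{
  return (before_s - after_s) / before_s * 100.0;
}

// Percentage of extra CPU time (user + sys) consumed by the second run
// relative to the first.
double sketch_percent_more_cpu (double cpu_before_s, double cpu_after_s)
{
  return (cpu_after_s / cpu_before_s - 1.0) * 100.0;
}
```

With 57.198s before and 28.224s after, the first function gives about 50.66.  Comparing user + sys of the check_objc_parallelize = 6 run (87.823 + 22.374) with the = 70 run (201.035 + 69.937), the second gives about 146.  The 28.224s to 21.252s step comes out near 24.7 by the same formula, close to the figure quoted above.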

with the filesystem update to 2 (instead of 10):

real    0m22.478s
user    4m38.564s
sys     1m25.293s

and filesystem update 5:

real    0m21.665s
user    3m51.615s
sys     1m16.005s

and filesystem update 20:

real    0m22.681s
user    3m2.746s
sys     1m5.576s

a -j1 filesystem update 20 for comparison:

real    1m48.127s
user    1m17.953s
sys     0m17.191s

a -j1 check_objc_parallelize 6 filesystem update 10 for comparison:

real    1m47.552s
user    1m17.410s
sys     0m16.909s

a -j70 check_objc_parallelize 1 filesystem update 10 for comparison:

real    0m21.292s
user    3m17.368s
sys     1m10.106s

a -j70 check_objc_parallelize 1 filesystem update 2 for comparison:

real    0m21.976s
user    4m37.600s
sys     1m26.598s

a -j70 check_objc_parallelize 1 filesystem update 200 for comparison:

real    1m12.319s
user    2m49.975s
sys 1

[GOOGLE] Fix LIPO COMDAT fixup and gcov-tool interactions

2014-09-12 Thread Teresa Johnson
This patch addresses issues when running gcov-tool after performing
COMDAT fixup during dyn-ipa. Functions that were previously all zero
counts are marked, and the counts are discarded when being read in
by gcov-tool before recalculating module groups and summary info.
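
As a rough illustration of the idea described above (not the code in this patch), the sketch below clears the counters of every function whose zero-fixup flag is set.  The type and function names are invented for the example, and the real gcda data structures are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the per-function counters read from a gcda file.
struct toy_fn_counts
{
  std::vector<long> counters;
};

// Discard the counts of functions that were marked as "all zero before
// COMDAT fixup"; flag i belongs to function i.
void toy_discard_fixed_up_counts (std::vector<toy_fn_counts> &fns,
                                  const std::vector<int> &zero_fixup_flags)
{
  for (std::size_t i = 0; i < fns.size () && i < zero_fixup_flags.size (); ++i)
    if (zero_fixup_flags[i])
      for (long &c : fns[i].counters)
        c = 0;
}
```

Functions whose flag is clear keep their counts, so module groups and summaries recomputed afterwards only see profile data that was genuinely collected.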

While here, cleaned up the gcov-tool output (remove an overly-verbose output,
make all output consistently go to stderr).

Passes regression tests and manual tests. Ok for google branches?

2014-09-12  Teresa Johnson  

* gcc/coverage.c (read_counts_file): Handle new section.
* gcc/gcov.c (read_count_file): Ditto.
* gcc/gcov-dump.c (dump_gcov_file): Ditto.
(tag_function): Ditto.
(tag_zero_fixup): New function.
* gcc/gcov-io.c (gcov_read_comdat_zero_fixup): Ditto.
* gcc/gcov-io.h (gcov_read_comdat_zero_fixup): Ditto.
* libgcc/dyn-ipa.c (struct checksum_alias): Change flag to pointer.
(new_checksum_alias): Ditto.
(cfg_checksum_insert): Ditto.
(checksum_set_insert): Ditto.
(gcov_build_callgraph): New parameter.
(gcov_collect_imported_modules): Add assert for duplicate gcda reads.
(gcov_fixup_counters_checksum): Change flag to pointer to flag, set it.
(__gcov_compute_module_groups): New parameter.
* libgcc/libgcov-driver.c (set_gcov_fn_fixed_up): New function.
(get_gcov_fn_fixed_up): Ditto.
(gcov_exit_merge_gcda): Handle new section.
(gcov_write_comdat_zero_fixup): Ditto.
(gcov_write_build_info): Ditto.
(gcov_write_comdat_zero_fixup): New function.
(gcov_write_func_counters): Fix indent.
(gcov_dump_module_info): Write new flag section.
* libgcc/libgcov.h (gcov_get_counter): Clear fixed-up counters.
(gcov_get_counter_target): Ditto.
* libgcc/libgcov-util.c (tag_function): Annotate fixed-up functions,
remove overly verbose output.
(tag_counters): Clear fixed-up counters.
(lipo_process_substitute_string_1): Send all verbose output to stderr.
(tag_zero_fixup): New function.
(read_gcda_file): Deallocate flag array.
(gcov_profile_scale): Send all verbose output to stderr.
(gcov_profile_normalize): Ditto.

Index: gcc/coverage.c
===
--- gcc/coverage.c  (revision 215230)
+++ gcc/coverage.c  (working copy)
@@ -820,6 +820,14 @@ read_counts_file (const char *da_file_name, unsign
 free (build_info_strings[i]);
   free (build_info_strings);
 }
+  else if (tag == GCOV_TAG_COMDAT_ZERO_FIXUP)
+{
+  /* Zero-profile fixup flags are not used by the compiler, read and
+ ignore.  */
+  gcov_unsigned_t num_fn;
+  int *zero_fixup_flags = gcov_read_comdat_zero_fixup (length, &num_fn);
+  free (zero_fixup_flags);
+}
   else if (GCOV_TAG_IS_COUNTER (tag) && fn_ident)
{
  counts_entry_t **slot, *entry, elt;
Index: gcc/gcov.c
===
--- gcc/gcov.c  (revision 215230)
+++ gcc/gcov.c  (working copy)
@@ -1441,6 +1441,12 @@ read_count_file (function_t *fns)
 free (build_info_strings[i]);
   free (build_info_strings);
 }
+  else if (tag == GCOV_TAG_COMDAT_ZERO_FIXUP)
+{
+  gcov_unsigned_t num_fn;
+  int *zero_fixup_flags = gcov_read_comdat_zero_fixup (length, &num_fn);
+  free (zero_fixup_flags);
+}
   else if (tag == GCOV_TAG_FUNCTION && !length)
; /* placeholder  */
   else if (tag == GCOV_TAG_FUNCTION && length == GCOV_TAG_FUNCTION_LENGTH)
Index: gcc/gcov-dump.c
===
--- gcc/gcov-dump.c (revision 215230)
+++ gcc/gcov-dump.c (working copy)
@@ -42,6 +42,7 @@ static void tag_summary (const char *, unsigned, u
 static void tag_module_info (const char *, unsigned, unsigned);
 static void dump_working_sets (const char *filename ATTRIBUTE_UNUSED,
const struct gcov_ctr_summary *summary);
+static void tag_zero_fixup (const char *, unsigned, unsigned);
 static void tag_build_info (const char *, unsigned, unsigned);
 extern int main (int, char **);

@@ -57,6 +58,9 @@ static int flag_dump_positions = 0;
 static int flag_dump_aux_modules_only = 0;
 static int flag_dump_working_sets = 0;

+static unsigned num_fn_info;
+static int *zero_fixup_flags = NULL;
+
 static const struct option options[] =
 {
   { "help", no_argument,   NULL, 'h' },
@@ -79,6 +83,7 @@ static const tag_format_t tag_table[] =
   {GCOV_TAG_OBJECT_SUMMARY, "OBJECT_SUMMARY", tag_summary},
   {GCOV_TAG_PROGRAM_SUMMARY, "PROGRAM_SUMMARY", tag_summary},
   {GCOV_TAG_MODULE_INFO, "MODULE INFO", tag_module_info},
+  {GCOV_TAG_COMDAT_ZERO_FIXUP, "ZERO FIXUP", tag_zero_fixup},
   {GCOV_TAG_BUILD_INFO, "BUILD INFO", tag_bui

Re: [Ping] Port of VTV for Cygwin and MinGW

2014-09-12 Thread Caroline Tice
First attempt to send this failed.

On Fri, Sep 12, 2014 at 3:41 PM, Caroline Tice  wrote:
>
> Hi Patrick,
>
> Mostly your patch looks OK to me, though there are a couple of serious issues 
> (mentioned below).  Most of my comments are for formatting stuff.   Once you 
> have fixed these issues, let me know and I'll look at it again.  But someone 
> else will still have to approve the parts of this patch that are outside the 
> libvtv directory (I believe).
>
> -- Caroline Tice
> cmt...@google.com
>
>
> In changes to gcc/config/i386/cygwin.h, mingw-w64.h and mingw32.h, you forgot 
> to handle the "fvtable-verify=preinit" options.  fvtable-verify=preinit should 
> cause vtv_start_preinit.o to be added to the STARTFILE_SPEC and 
> vtv_end_preinit.o to be added to the ENDFILE_SPEC (as in  
> gcc/config/gnu-user.h).  I expect you will also need to add it to your 
> LIB_SPEC definitions in those config files.
>
> in libgcc/config.host, the indentation looks wrong on the line 660 (where you 
> add the extra parts for vtable verification for the case 
> "i[34567]86-*-mingw*)". It also looks wrong on line 709 (again, adding extra 
> parts, for case "x86_64-*-mingw*)".  The rest of the changes in that file 
> look ok, but someone else will need to approve them.
>
> The changes in libgcc/Makefile.in and gcc/varasm.c look ok to me, but someone 
> else will have to approve them since I don't have authority to approve 
> changes there.
>
> in libstdc++-v3/libsupc++:
>
> Your change in Makefile.am looks ok to me; your changes in vtv_stubs.cc look 
> ok, except that you appear to have messed up the indentations in the function 
> headers (both for the declarations and the actual functions).  Also there is 
> a typo in your comment:  'build' should be 'built'.  But content-wise, the 
> change looks fine.  Again, someone else will have to actually approve these 
> changes since I do not have that authority.
>
> The change in libstdc++-v3/src/Makefile.am also looks ok to me; someone else
> will have to give final approval.
>
>
> in libvtv/Makefile.am, you need to fix the indentation at line 64 
> (vtv_stubs.cc):
>
> vtv_stubs_sources = \
> vtv_start.c \
> vtv_stubs.cc \
> vtv_end.c
>
> the rest of the changes in that file look good.
>
> Why did you make a copy of obstack.c in libvtv rather than using the one in 
> libiberty?  It would be better not to make a second copy of the source file 
> if that can be avoided...
>
> in libvtv/vtv_malloc.cc:
>
> lines 207-213:  Fix the indentation on the second line of call to 
> VirtualAlloc.
> #if defined (__CYGWIN__) || defined (__MINGW32__)
>   if ((allocated = VirtualAlloc(NULL, size,  MEM_RESERVE|MEM_COMMIT,
> PAGE_READWRITE)) == 0)
> #else
>   if ((allocated = mmap (NULL, size, PROT_READ | PROT_WRITE,
>  MAP_PRIVATE | MAP_ANONYMOUS,  -1, 0)) == 0)
> #endif
>
> Remove extra blank line at line 216.
>
> in libvtv/vtv_rts.cc:
>
> Your version of the function read_section_offset_and_length has several lines 
> that exceed the 80 character limit; please fix that.  Your function 
> iterate_modules  also has one or two lines that are too long, and it needs a 
> function comment.
>
> in libvtv/vtv_utils.cc:
>
> In the function __vtv_open_log, you seem to have the following code twice:
>
> #ifdef __MINGW32__
>   mkdir (logs_prefix);
> #else
>   mkdir (logs_prefix, S_IRWXU);
> #endif
>
> was there a reason for this, or is this an accident (in which case the second 
> occurrence should be removed)?
>
>
>
> On Wed, Sep 10, 2014 at 9:12 PM, Patrick Wollgast  
> wrote:
>>
>> Ping for https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02559.html
>>
>> Also added Caroline Tice, as libvtv maintainer, to cc and attached
>> virtual_func_test_min_UAF.cpp, which I forgot in the original mail.
>>
>> Patrick
>>
>> On 28.08.2014 13:03, Patrick Wollgast wrote:
>> > This patch contains a port of VTV -fvtable-verify=std for Cygwin and MinGW.
>> >
>> > Since weak symbols on Windows and Linux are implemented differently, and
>> > VTV should have the possibility to be switched on and off, the structure
>> > of the feature had to be modified.
>> > On Linux libstdc++ contains the weak stub functions of VTV. For Cygwin
>> > and MinGW they have been removed, due to the difference of weak symbols.
>> > On Linux and on Windows libstdc++ itself gets built with
>> > -fvtable-verify=std. Since libvtv gets built after libstdc++, and
>> > libstdc++ doesn't contain the stub functions any more, 'undefined
>> > reference' errors are thrown during linking of libstdc++. To prevent
>> > these errors during the linking process a libvtv-0.dll gets built from
>> > the stub functions before libstdc++-6.dll is linked.
>> > At the end of the build process two VTV dlls have been built. One is
>> > called libvtv-0.dll, containing the real functions, the other is called
>> > libvtv_stubs-0.dll, containing the stub functions. Depending on whether
>> > libvtv-0.dll is first

[PATCH] Work around miscompilation with 4.8.1

2014-09-12 Thread Andi Kleen
From: Andi Kleen 

When compiling on opensuse 13.1, with a 4.8.1 based host compiler
and --disable-bootstrap, the generated compiler always ICEs while
compiling __builtin_cpu_supports in the cilk runtime library.

The problem is fixed with later 4.8 releases, but at least still
happens with the opensuse compiler.

The cilk library already had a fallback path for this function
for other compilers. Just use the fallback path when
__SSE2_MATH__ is set. This makes it work on x86_64 systems
with the buggy 4.8.1 at least, if multilib is forced to
SSE.
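
The general shape of such a guarded fallback can be sketched as follows.  This is only an illustration: SKETCH_FORCE_FALLBACK and sketch_cpu_has_sse2 are made-up names, whereas the actual patch keys off __INTEL_COMPILER and __SSE2_MATH__ and redefines __builtin_cpu_supports itself.

```cpp
#include <cassert>

#if defined(SKETCH_FORCE_FALLBACK) || \
    !(defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__)))
// Fallback path: never touch the builtin and simply assume the feature
// is present, in the same spirit as the runtime's "return 1" fallback.
static inline int sketch_cpu_has_sse2 (void)
{
  return 1;
}
#else
// Normal path: ask the compiler builtin (available in GCC on x86).
static inline int sketch_cpu_has_sse2 (void)
{
  return __builtin_cpu_supports ("sse2") != 0;
}
#endif
```

Since the fallback never references the builtin, a host compiler that mishandles it is not exercised on that path; the cost is that the answer is an assumption rather than a runtime query.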

libcilkrts/:

2014-09-12  Andi Kleen  

PR bootstrap/63235
* runtime/config/x86/os-unix-sysdep.c (__builtin_cpu_supports):
Use fallback when __SSE2_MATH__ is set.
---
 libcilkrts/runtime/config/x86/os-unix-sysdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcilkrts/runtime/config/x86/os-unix-sysdep.c b/libcilkrts/runtime/config/x86/os-unix-sysdep.c
index b505ddf..344c31a 100644
--- a/libcilkrts/runtime/config/x86/os-unix-sysdep.c
+++ b/libcilkrts/runtime/config/x86/os-unix-sysdep.c
@@ -96,7 +96,7 @@ COMMON_SYSDEP int __cilkrts_xchg(volatile int *ptr, int x)
  * This declaration should generate an error when the Intel compiler adds
  * supprt for the intrinsic.
  */
-#ifdef __INTEL_COMPILER
+#if defined(__INTEL_COMPILER) || defined(__SSE2_MATH__)
 static inline int __builtin_cpu_supports(const char *feature)
 {
 return 1;
-- 
2.1.0



Re: Remove LIBGCC2_HAS_?F_MODE target macros

2014-09-12 Thread Joseph S. Myers
On Fri, 12 Sep 2014, paul_kon...@dell.com wrote:

> > * SFmode would always have been supported in libgcc (the condition was
> >  BITS_PER_UNIT == 8, true for all current targets), but pdp11
> >  defaults to 64-bit float, and in that case SFmode would fail
> >  scalar_mode_supported_p.  I don't know if libgcc actually built for
> >  pdp11 (and the port may well no longer be being used), but this
> >  patch adds a scalar_mode_supported_p hook to it to ensure SFmode is
> >  treated as supported.
> 
> I thought it does build.  I continue to work to keep that port alive.
> 
> The change looks fine.
> 
> The ideal solution, I think, would be to handle the choice of float 
> length that the pdp11 target has via the multilib machinery.  Currently 
> it does not do that.  If multilib were added for that at some point, 
> would that require a change of the code in that hook?

I think the ideal is for the back end to accept a mode in 
scalar_mode_supported_p if it can generate something sensible (either 
inline code or calls to libgcc functions) for arithmetic on that mode, 
rather than ICEs or otherwise invalid code, even if the libgcc functions 
don't actually exist.  (Thus, ix86_scalar_mode_supported_p always 
considers TFmode to be supported, whether or not the libgcc support is 
present.)

On that basis, my hook to treat SFmode as always supported for pdp11 (so 
it can be accessed with __attribute__((mode(SF))), whether or not it's 
also float) seems to be the right thing.
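
For readers unfamiliar with the mode attribute, a minimal sketch of requesting SFmode directly is shown here.  The typedef and function names are made up for illustration, and the attribute is a GCC extension.

```cpp
#include <cassert>
#include <climits>

// Ask for the single-float machine mode explicitly, regardless of how
// wide plain "float" happens to be on the target.
typedef float sketch_sf __attribute__ ((mode (SF)));

// Width of the SFmode type in bits.
int sketch_sf_bits ()
{
  return static_cast<int> (sizeof (sketch_sf) * CHAR_BIT);
}
```

On common targets with 8-bit bytes this should report 32 bits.  Whether a declaration like this is accepted at all on a given target is what scalar_mode_supported_p governs.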

(Various back ends would, if they adopted my ideal, then also need to add 
the libgcc_floating_mode_supported_p hook to indicate the conditional lack 
of libgcc support for certain modes.  E.g. for several back ends, TFmode 
is only supported in libgcc if it's long double, and most of the runtime 
support is expected to be in libc not libgcc, under symbol names from some 
ABI for that architecture.  In those cases, building in the libgcc support 
for e.g. __multc3 in the absence of libc support would be problematic, 
because it would reference undefined libc functions.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/2] add static typed insn_deleted_p

2014-09-12 Thread Jeff Law

On 09/11/14 16:49, tsaund...@mozilla.com wrote:

From: Trevor Saunders 

Hi,

This changes all callers of INSN_DELETED_P to use one of insn->deleted () 
insn->set_deleted () or insn->set_undeleted () depending on what they're doing 
(set_deleted / set_undeleted seem somewhat clearer to me than = 0 / 1).
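
A toy sketch of the difference between a flag macro and statically checked member functions follows.  The toy_* types are invented for the example; the real rtx_insn in rtl.h stores the flag differently.

```cpp
#include <cassert>

// Toy base type: a macro poking at a flag would accept any of these.
struct toy_rtx
{
  bool flag = false;
};

// Toy insn type: the accessors exist only here, so calling them on
// something that is not known to be an insn fails to compile.
struct toy_insn : toy_rtx
{
  bool deleted () const { return flag; }
  void set_deleted () { flag = true; }
  void set_undeleted () { flag = false; }
};
```

With this shape, something like toy_rtx x; x.set_deleted (); is rejected at compile time, while a macro would only have been able to check at run time, if at all.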

bootstrapped + regtested on x86_64-unknown-linux-gnu, and run through
config-list.mk with a couple other patches. ok?

Trev

gcc/

* cfgrtl.c, combine.c, config/arc/arc.c, config/mcore/mcore.c,
config/rs6000/rs6000.c, config/sh/sh.c, cprop.c, dwarf2out.c,
emit-rtl.c, final.c, function.c, gcse.c, jump.c, reg-stack.c,
reload1.c, reorg.c, resource.c, sel-sched-ir.c: Replace INSN_DELETED_P
macro with staticly checked member functions.
* rtl.h (rtx_insn::deleted): New method.
(rtx_insn::set_deleted): Likewise.
(rtx_insn::set_undeleted): Likewise.

OK with the ChangeLog fixups pointed out by David.

Jeff



Re: [PATCH 1/2] use rtx_insn * more

2014-09-12 Thread Jeff Law

On 09/11/14 16:48, tsaund...@mozilla.com wrote:

From: Trevor Saunders 

Hi,

pretty trivial, but not quite just changing the types of variables.
bootstrapped + regtested on x86_64-unknown-linux-gnu, and run through
config-list.mk with a couple other patches. ok?

Trev

gcc/ChangeLog:

2014-09-11  Trevor Saunders  

* config/mn10300/mn10300.c (mn10300_insert_setlb_lcc): Assign the
result of emit_jump_insn_before to a new variable.
* jump.c (mark_jump_label): Change the type of insn to rtx_insn *.
(mark_jump_label_1): Likewise.
(mark_jump_label_asm): Likewise.
* reload1.c (gen_reload): Change type of tem to rtx_insn *.
* rtl.h (mark_jump_label): Adjust.

OK.
Jeff



Re: [PATCH 2/2] add static typed insn_deleted_p

2014-09-12 Thread Jeff Law

On 09/11/14 19:17, Trevor Saunders wrote:

On Thu, Sep 11, 2014 at 08:06:02PM -0400, David Malcolm wrote:

On Thu, 2014-09-11 at 18:49 -0400, tsaund...@mozilla.com wrote:

From: Trevor Saunders 

Hi,

This changes all callers of INSN_DELETED_P to use one of insn->deleted () 
insn->set_deleted () or insn->set_undeleted () depending on what they're doing 
(set_deleted / set_undeleted seem somewhat clearer to me than = 0 / 1).

bootstrapped + regtested on x86_64-unknown-linux-gnu, and run through
config-list.mk with a couple other patches. ok?

Trev

gcc/

* cfgrtl.c, combine.c, config/arc/arc.c, config/mcore/mcore.c,
config/rs6000/rs6000.c, config/sh/sh.c, cprop.c, dwarf2out.c,
emit-rtl.c, final.c, function.c, gcse.c, jump.c, reg-stack.c,
reload1.c, reorg.c, resource.c, sel-sched-ir.c: Replace INSN_DELETED_P
macro with staticly checked member functions.
* rtl.h (rtx_insn::deleted): New method.
(rtx_insn::set_deleted): Likewise.
(rtx_insn::set_undeleted): Likewise.


I'm not an approver, but a couple of nitpicks:
(A) "staticly" -> "statically"
(B) the above candidate ChangeLog for rtl.h omits the deletion of the
INSN_DELETED_P macro (obviously trivial to fix).


oops, I see there's at least one person who actually reads changelogs.

I read them as well, though not for every change.   :-)

Jeff


Re: [PATCH 4/4] Instruction attributes take an rtx_insn *

2014-09-12 Thread Jeff Law

On 09/12/14 14:38, David Malcolm wrote:

This patch strengthens the params of all of the various generated
get_attr_* functions from rtx to rtx_insn *, along with various other
functions relating to instruction attributes and scheduling.

As well as the changes to genattr.c, genattrtab.c and genautomata.c, the
bulk of the patch makes the small adjustments needed to the various
arch-specific config subdirectories to support these changes; basically
anywhere that calls a "get_attr_" function with something that wasn't
already an rtx_insn *.

A subtlety occurs in genautomata.c: I wasn't able to strengthen all of the
generated functions due to the possibility of some of these receiving
const0_rtx.

In particular, note that the result of:
   targetm.sched.dfa_pre_cycle_insn
is *not* always an insn: c6x's implementation:

   #undef TARGET_SCHED_DFA_PRE_CYCLE_INSN
   #define TARGET_SCHED_DFA_PRE_CYCLE_INSN c6x_sched_dfa_pre_cycle_insn

returns a const0_rtx:

   /* Used together with the collapse_ndfa option, this ensures that we reach a
      deterministic automaton state before trying to advance a cycle.
      With collapse_ndfa, genautomata creates advance cycle arcs only for
      such deterministic states.  */

   static rtx
   c6x_sched_dfa_pre_cycle_insn (void)
   {
     return const0_rtx;
   }

In genautomata.c: output_internal_insn_code_evaluation (when
"collapse_flag" is set), we write out code that can handle const0_rtx
as well as an insn.

The function "output_internal_insn_code_evaluation" is used when writing
out the following generated functions:
   state_transition
   min_insn_conflict_delay
   insn_latency
   maximal_insn_latency
and the fact that "insn_latency" could receive a const0_rtx
thus also affects "internal_insn_latency".

Given that these functions can receive a const0_rtx as well as an insn
chain node, I *didn't* change their params, and instead add a checked cast
to rtx_insn * once we can prove that we're not dealing with a const0_rtx.
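
The "weak parameter, checked cast after the test" approach can be sketched with toy types.  Everything named toy_* is invented for the illustration, and the value returned for the constant is an assumption rather than what the generated code produces.

```cpp
#include <cassert>

enum toy_code { TOY_CONST_INT, TOY_INSN };

struct toy_rtx
{
  toy_code code;
  explicit toy_rtx (toy_code c) : code (c) {}
};

struct toy_insn : toy_rtx
{
  int latency;
  explicit toy_insn (int l) : toy_rtx (TOY_INSN), latency (l) {}
};

// Shared constant, playing the role of const0_rtx.
static toy_rtx toy_const0 (TOY_CONST_INT);

// Checked downcast: asserts that X really is an insn before converting.
static toy_insn *toy_as_insn (toy_rtx *x)
{
  assert (x->code == TOY_INSN);
  return static_cast<toy_insn *> (x);
}

// Keeps the weak parameter type because callers may pass toy_const0.
int toy_insn_latency (toy_rtx *x)
{
  if (x == &toy_const0)
    return 0;   // hypothetical answer for the non-insn case
  return toy_as_insn (x)->latency;
}
```

The cast happens only on the path where the constant has been ruled out, so the assertion inside it never fires for legitimate callers.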

gcc/ChangeLog:
* config/arc/arc-protos.h (arc_attr_type): Strengthen param from
rtx to rtx_insn *.
(arc_sets_cc_p): Likewise.
* config/arc/arc.c (arc_print_operand): Use methods of
"final_sequence" for clarity, and to enable strengthening of
locals "jump" and "delay" from rtx to rtx_insn *.
(arc_adjust_insn_length): Strengthen local "prev" from rtx to
rtx_insn *; use method of rtx_sequence for typesafety.
(arc_get_insn_variants): Use insn method of rtx_sequence for
typesafety.
(arc_pad_return): Likewise.
(arc_attr_type): Strengthen param from rtx to rtx_insn *.
(arc_sets_cc_p): Likewise.  Also, convert a GET_CODE check to a
dyn_cast to rtx_sequence *, using insn method for typesafety.
* config/arc/arc.h (ADJUST_INSN_LENGTH): Add checked casts to
rtx_sequence * and use insn method when invoking get_attr_length.
* config/bfin/bfin.c (type_for_anomaly): Strengthen param from rtx
to rtx_insn *.  Replace a GET_CODE check with a dyn_cast to
rtx_sequence *, introducing a local "seq", using its insn method
for typesafety and clarity.
(add_sched_insns_for_speculation): Strengthen local "next" from
rtx to rtx_insn *.
* config/c6x/c6x.c (get_insn_side): Likewise for param "insn".
(predicate_insn): Likewise.
* config/cris/cris-protos.h (cris_notice_update_cc): Likewise for
second param.
* config/cris/cris.c (cris_notice_update_cc): Likewise.
* config/epiphany/epiphany-protos.h
(extern void epiphany_insert_mode_switch_use): Likewise for param
"insn".
(get_attr_sched_use_fpu): Likewise for param.
* config/epiphany/epiphany.c (epiphany_insert_mode_switch_use):
Likewise for param "insn".
* config/epiphany/mode-switch-use.c (insert_uses): Likewise for
param "insn" of "target_insert_mode_switch_use" callback.
* config/frv/frv.c (frv_insn_unit): Likewise for param "insn".
(frv_issues_to_branch_unit_p): Likewise.
(frv_pack_insn_p): Likewise.
(frv_compare_insns): Strengthen locals "insn1" and "insn2" from
const rtx * (i.e. mutable rtx_def * const *) to
rtx_insn * const *.
* config/i386/i386-protos.h (standard_sse_constant_opcode):
Strengthen first param from rtx to rtx_insn *.
(output_fix_trunc): Likewise.
* config/i386/i386.c (standard_sse_constant_opcode): Likewise.
(output_fix_trunc): Likewise.
(core2i7_first_cycle_multipass_filter_ready_try): Likewise for
local "insn".
(min_insn_size): Likewise for param "insn".
(get_mem_group): Likewise.
(is_cmp): Likewise.
(get_insn_path): Likewise.
(get_insn_group): Likewise.
(count_num_restricted): Likewise.
(fits_dis

Re: [PATCH 3/4] The various TARGET_ASM_..._MAX_SKIP hooks take an insn

2014-09-12 Thread Jeff Law

On 09/12/14 14:38, David Malcolm wrote:

gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_loop_align_max_skip): Strengthen
param "label" from rtx to rtx_insn *.
* config/rx/rx.c (rx_max_skip_for_label): Likewise for param "lab"
and local "op".
* doc/tm.texi (TARGET_ASM_JUMP_ALIGN_MAX_SKIP): Autogenerated changes.
(TARGET_ASM_LABEL_ALIGN_AFTER_BARRIER_MAX_SKIP): Likewise.
(TARGET_ASM_LOOP_ALIGN_MAX_SKIP): Likewise.
(TARGET_ASM_LABEL_ALIGN_MAX_SKIP): Likewise.
* final.c (default_label_align_after_barrier_max_skip): Strengthen
param from rtx to rtx_insn *.
(default_loop_align_max_skip): Likewise.
(default_label_align_max_skip): Likewise.
(default_jump_align_max_skip): Likewise.
* target.def (label_align_after_barrier_max_skip): Likewise.
(loop_align_max_skip): Likewise.
(label_align_max_skip): Likewise.
(jump_align_max_skip): Likewise.
* targhooks.h (default_label_align_after_barrier_max_skip):
Likewise.
(default_loop_align_max_skip): Likewise.
(default_label_align_max_skip): Likewise.
(default_jump_align_max_skip): Likewise.

OK.
Jeff


Re: [PATCH 2/4] The TARGET_CAN_FOLLOW_JUMP hook takes insns

2014-09-12 Thread Jeff Law

On 09/12/14 14:38, David Malcolm wrote:

gcc/ChangeLog:
* config/arc/arc.c (arc_can_follow_jump): Strengthen both params
from const_rtx to const rtx_insn *.  Update union members from rtx
to rtx_insn *.
* doc/tm.texi (TARGET_CAN_FOLLOW_JUMP): Autogenerated change.
* hooks.c (hook_bool_const_rtx_const_rtx_true): Rename to...
(hook_bool_const_rtx_insn_const_rtx_insn_true): ...this, and
strengthen both params from const_rtx to const rtx_insn *.
* hooks.h (hook_bool_const_rtx_const_rtx_true): Likewise.
(hook_bool_const_rtx_insn_const_rtx_insn_true): Likewise.
* reorg.c (follow_jumps): Strengthen param "jump" from rtx to
rtx_insn *.
* target.def (can_follow_jump): Strengthen both params from
const_rtx to const rtx_insn *, and update default implementation
from hook_bool_const_rtx_const_rtx_true to
hook_bool_const_rtx_insn_const_rtx_insn_true.

OK.
Jeff



Re: [PATCH 1/4] deps_start_bb takes an insn

2014-09-12 Thread Jeff Law

On 09/12/14 14:38, David Malcolm wrote:

gcc/ChangeLog:
* sched-deps.c (deps_start_bb): Strengthen param "head" and local
"insn" from rtx to rtx_insn *.
* sched-int.h (deps_start_bb): Likewise for 2nd param.

OK.
jeff



Re: [PATCH] Tiny cleanup for protected_set_expr_location

2014-09-12 Thread Jeff Law

On 09/12/14 14:52, Marek Polacek wrote:

This is rather obvious.  CAN_HAVE_LOCATION_P checks that the node is
non-null, so no need to check for it in protected_set_expr_location too.
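
A toy sketch of why the extra test is redundant is shown here.  The toy_* names are made up, and the predicate only mimics the "non-null and an expression" shape of CAN_HAVE_LOCATION_P.

```cpp
#include <cassert>

struct toy_tree
{
  bool is_expr;
  int location;
};

// The null test lives inside the predicate itself.
static inline bool toy_can_have_location_p (const toy_tree *t)
{
  return t && t->is_expr;
}

// No separate "t != nullptr" check is needed before calling the predicate.
void toy_protected_set_location (toy_tree *t, int loc)
{
  if (toy_can_have_location_p (t))
    t->location = loc;
}
```

Passing a null pointer or a non-expression node is simply a no-op, which is the behaviour the "protected" setter is meant to provide.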

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-09-12  Marek Polacek  

* tree.c (protected_set_expr_location): Don't check whether T is
non-null here.

OK.
Jeff



Re: RFA: Add a destructor to target_ira_int

2014-09-12 Thread Jeff Law

On 09/12/14 01:25, Richard Sandiford wrote:

Jeff Law  writes:

On 09/08/14 09:26, Richard Sandiford wrote:

This patch adds a destructor to target_ira_int, so that the data structures
it points to are freed when the parent target_globals is freed.  It fixes
a memory leak with non-default subtargets.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* ira.h (ira_finish_once): Delete.
* ira-int.h (target_ira_int::~target_ira_int): Declare.
(target_ira_int::free_ira_costs): Likewise.
(target_ira_int::free_register_move_costs): Likewise.
(ira_finish_costs_once): Delete.
* ira.c (free_register_move_costs): Replace with...
(target_ira_int::free_register_move_costs): ...this new function.
(target_ira_int::~target_ira_int): Define.
(ira_init): Call free_register_move_costs as a member function rather
than a global function.
(ira_finish_once): Delete.
* ira-costs.c (free_ira_costs): Replace with...
(target_ira_int::free_ira_costs): ...this new function.
(ira_init_costs): Call free_ira_costs as a member function rather
than a global function.
(ira_finish_costs_once): Delete.
* target-globals.c (target_globals::~target_globals): Call the
target_ira_int destructor.
* toplev.c: Include lra.h.
(finalize): Call lra_finish_once rather than ira_finish_once.

Consider making target_ira_int a class.  OK for the trunk.

jeff


I'd prefer to keep it as a struct if that's OK.  At the moment these
target structures are just collections of variables that are accessed
directly, so it doesn't really feel like a proper OO class "yet".
Also (more minor) changing it from a struct to a class would mean
updating all references to the structure, to avoid the clang warning
about mismatched tags.  There would then be some weird-looking
inconsistencies in the target-globals code.
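
For readers following along, the shape being discussed (a plain struct whose members are still accessed directly, but which gains a destructor so that owned data is freed along with its parent) might look roughly like the sketch below. The names target_costs and target_globals_sketch are made up for illustration and are not GCC's; the live_tables counter exists only so the sketch can be checked.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a target structure: still a plain struct with
// directly accessed members, plus a destructor that frees what it owns.
struct target_costs
{
  int *move_costs = nullptr;   // owned table, allocated on demand
  static int live_tables;      // bookkeeping for this sketch only

  void init_move_costs (int n)
  {
    assert (n > 0);
    free_move_costs ();        // drop any earlier table first
    move_costs = static_cast<int *> (std::calloc (n, sizeof (int)));
    if (move_costs)
      ++live_tables;
  }

  void free_move_costs ()
  {
    if (move_costs)
      {
        std::free (move_costs);
        move_costs = nullptr;
        --live_tables;
      }
  }

  ~target_costs () { free_move_costs (); }
};

int target_costs::live_tables = 0;

// Hypothetical parent: deleting it runs the member struct's destructor,
// so the table cannot outlive the parent.
struct target_globals_sketch
{
  target_costs *costs = new target_costs;
  ~target_globals_sketch () { delete costs; }
};
```

Keeping the struct keyword also means existing forward declarations such as "struct target_costs;" continue to match the definition, which is the mismatched-tags point made above.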

It's OK with me.  I just wanted you to consider it; clearly you have :-)

jeff


[C++ PATCH] Improve location of CALL_EXPRs (PR c++/60862)

2014-09-12 Thread Marek Polacek
Today I've been playing again with locations in the C++ FE, but of
CALL_EXPRs only this time.  It seems that it's simplest to just set
the location after finish_call_expr does its work rather than to add
many new parameters here and there and pass the location all the way
down to build_cxx_call.  The issue is that build_cxx_call has
  location_t loc = EXPR_LOC_OR_LOC (fn, input_location);
  fn = build_call_a (fn, nargs, argarray);
  SET_EXPR_LOCATION (fn, loc);
but FN is often a FUNCTION_DECL, which cannot carry a location. 
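
To illustrate the effect with a toy model (this is not GCC code; the node type and the helpers loc_or, build_call and set_location_if_possible are invented for the sketch): when the callee is a declaration, the "location or fallback" lookup always yields the fallback, so the call ends up with whatever the current input location happens to be, and only a fix-up after the call has been built can give it a better one.

```cpp
#include <cassert>

typedef unsigned location_t;
const location_t UNKNOWN_LOCATION = 0;

enum node_kind { DECL_NODE, EXPR_NODE };

// Toy tree node; only EXPR_NODE is allowed to carry a location here.
struct node
{
  node_kind kind;
  location_t loc;
};

// Toy analogue of a "can have location" predicate; note that it also
// rejects null, so callers need no separate null check.
static bool can_have_location (const node *t)
{
  return t && t->kind == EXPR_NODE;
}

// Toy analogue of "location or fallback": use the node's own location if
// it has one, otherwise the supplied default.
static location_t loc_or (const node *t, location_t fallback)
{
  return (can_have_location (t) && t->loc != UNKNOWN_LOCATION)
         ? t->loc : fallback;
}

// Build a call node the way the quoted lines do: the location comes from
// FN, so a declaration callee yields the fallback.
static node build_call (const node *fn, location_t input_location)
{
  node call = { EXPR_NODE, loc_or (fn, input_location) };
  return call;
}

// The approach taken in the patch: overwrite the location afterwards.
static void set_location_if_possible (node *t, location_t loc)
{
  if (can_have_location (t))
    t->loc = loc;
}

// Usage: the declaration callee gives the fallback; the fix-up then wins.
static void example ()
{
  node decl = { DECL_NODE, UNKNOWN_LOCATION };
  node call = build_call (&decl, 99);
  assert (call.loc == 99);
  set_location_if_possible (&call, 13);
  assert (call.loc == 13);
}
```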

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-09-12  Marek Polacek  

PR c++/60862
gcc/cp/
* parser.c (cp_parser_postfix_expression) : Set
location of a call expression.
gcc/testsuite/
* g++.dg/diagnostic/pr60862.C: New test.
libstdc++-v3/
* testsuite/20_util/bind/ref_neg.cc (test01): Adjust dg-error line
numbers.

diff --git gcc/gcc/cp/parser.c gcc/gcc/cp/parser.c
index c696fd2..1bb72bc 100644
--- gcc/gcc/cp/parser.c
+++ gcc/gcc/cp/parser.c
@@ -6227,6 +6227,8 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
koenig_p,
complain);
 
+   protected_set_expr_location (postfix_expression, loc);
+
/* The POSTFIX_EXPRESSION is certainly no longer an id.  */
idk = CP_ID_KIND_NONE;
 
diff --git gcc/gcc/testsuite/g++.dg/diagnostic/pr60862.C 
gcc/gcc/testsuite/g++.dg/diagnostic/pr60862.C
index e69de29..73b7654 100644
--- gcc/gcc/testsuite/g++.dg/diagnostic/pr60862.C
+++ gcc/gcc/testsuite/g++.dg/diagnostic/pr60862.C
@@ -0,0 +1,10 @@
+// PR c++/60862
+// { dg-do compile }
+
+extern void **bar (int, void *, int);
+
+void
+foo (int x, int y)
+{
+  int **s = bar (x, &x, y); // { dg-error "13:invalid conversion" }
+}
diff --git gcc/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc 
gcc/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc
index 5a46617..a85ccd8 100644
--- gcc/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc
+++ gcc/libstdc++-v3/testsuite/20_util/bind/ref_neg.cc
@@ -31,9 +31,9 @@ void test01()
   const int dummy = 0;
   std::bind(&inc, _1)(0);   // { dg-error  "no match" }
   // { dg-error "rvalue|const" "" { target *-*-* } 1315 }
-  // { dg-error "rvalue|const" "" { target *-*-* } 1329 }
-  // { dg-error "rvalue|const" "" { target *-*-* } 1343 }
-  // { dg-error "rvalue|const" "" { target *-*-* } 1357 }
+  // { dg-error "rvalue|const" "" { target *-*-* } 1328 }
+  // { dg-error "rvalue|const" "" { target *-*-* } 1342 }
+  // { dg-error "rvalue|const" "" { target *-*-* } 1356 }
   std::bind(&inc, std::ref(dummy))();  // { dg-error  "no match" }
 }
 
Marek


[PATCH] Tiny cleanup for protected_set_expr_location

2014-09-12 Thread Marek Polacek
This is rather obvious.  CAN_HAVE_LOCATION_P checks that the node is
non-null, so no need to check for it in protected_set_expr_location too.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2014-09-12  Marek Polacek  

* tree.c (protected_set_expr_location): Don't check whether T is
non-null here.

diff --git gcc/tree.c gcc/tree.c
index 6ad0575..f999a3b 100644
--- gcc/tree.c
+++ gcc/tree.c
@@ -4585,7 +4585,7 @@ build_block (tree vars, tree subblocks, tree 
supercontext, tree chain)
 void
 protected_set_expr_location (tree t, location_t loc)
 {
-  if (t && CAN_HAVE_LOCATION_P (t))
+  if (CAN_HAVE_LOCATION_P (t))
 SET_EXPR_LOCATION (t, loc);
 }
 

Marek


[PATCH 4/4] Instruction attributes take an rtx_insn *

2014-09-12 Thread David Malcolm
This patch strengthens the params of all of the various generated
get_attr_* functions from rtx to rtx_insn *, along with various other
functions relating to instruction attributes and scheduling.

As well as the changes to genattr.c, genattrtab.c and genautomata.c, the
bulk of the patch makes the small adjustments needed to the various
arch-specific config subdirectories to support these changes; basically
anywhere that calls a "get_attr_" function with something that wasn't
already an rtx_insn *.

A subtlety occurs in genautomata.c: I wasn't able to strengthen all of the
generated functions due to the possibility of some of these receiving
const0_rtx.

In particular, note that the result of:
  targetm.sched.dfa_pre_cycle_insn
is *not* always an insn: c6x's implementation:

  #undef TARGET_SCHED_DFA_PRE_CYCLE_INSN
  #define TARGET_SCHED_DFA_PRE_CYCLE_INSN c6x_sched_dfa_pre_cycle_insn

returns a const0_rtx:

  /* Used together with the collapse_ndfa option, this ensures that we reach a
     deterministic automaton state before trying to advance a cycle.
     With collapse_ndfa, genautomata creates advance cycle arcs only for
     such deterministic states.  */

  static rtx
  c6x_sched_dfa_pre_cycle_insn (void)
  {
    return const0_rtx;
  }

In genautomata.c: output_internal_insn_code_evaluation (when
"collapse_flag" is set), we write out code that can handle const0_rtx
as well as an insn.

The function "output_internal_insn_code_evaluation" is used when writing
out the following generated functions:
  state_transition
  min_insn_conflict_delay
  insn_latency
  maximal_insn_latency
and the fact that "insn_latency" could receive a const0_rtx thus also
affects "internal_insn_latency".

Given that these functions can receive a const0_rtx as well as an insn
chain node, I *didn't* change their params, and instead added a checked cast
to rtx_insn * once we can prove that we're not dealing with a const0_rtx.
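
As a rough illustration of that pattern (a toy model, not the generated code: the types below only mimic the shape of the rtx hierarchy, and const0_rtx_sketch, as_insn and insn_latency_sketch are invented names), the parameter stays a plain rtx and the downcast happens only after the sentinel has been ruled out:

```cpp
#include <cassert>

// Toy base type standing in for a generic rtx object.
struct rtx_def
{
  bool is_insn;
  explicit rtx_def (bool insn) : is_insn (insn) {}
};

// Toy subclass standing in for an instruction.
struct rtx_insn : rtx_def
{
  int latency;
  explicit rtx_insn (int l) : rtx_def (true), latency (l) {}
};

typedef rtx_def *rtx;

// Stand-in for the shared const0_rtx object, which is not an insn.
static rtx_def const0_obj (false);
static const rtx const0_rtx_sketch = &const0_obj;

// Checked downcast in the spirit of a checked cast: assert, then cast.
static rtx_insn *as_insn (rtx x)
{
  assert (x && x->is_insn);
  return static_cast<rtx_insn *> (x);
}

// The parameter cannot be strengthened because callers may pass the
// sentinel; the cast is applied only once that case is excluded.
static int insn_latency_sketch (rtx x)
{
  if (x == const0_rtx_sketch)
    return 0;   // arbitrary value chosen for the sketch
  return as_insn (x)->latency;
}
```

Passing a real instruction goes through the checked cast, while passing the sentinel never reaches it, so the assertion cannot fire for const0_rtx.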

gcc/ChangeLog:
* config/arc/arc-protos.h (arc_attr_type): Strengthen param from
rtx to rtx_insn *.
(arc_sets_cc_p): Likewise.
* config/arc/arc.c (arc_print_operand): Use methods of
"final_sequence" for clarity, and to enable strengthening of
locals "jump" and "delay" from rtx to rtx_insn *.
(arc_adjust_insn_length): Strengthen local "prev" from rtx to
rtx_insn *; use method of rtx_sequence for typesafety.
(arc_get_insn_variants): Use insn method of rtx_sequence for
typesafety.
(arc_pad_return): Likewise.
(arc_attr_type): Strengthen param from rtx to rtx_insn *.
(arc_sets_cc_p): Likewise.  Also, convert a GET_CODE check to a
dyn_cast to rtx_sequence *, using insn method for typesafety.
* config/arc/arc.h (ADJUST_INSN_LENGTH): Add checked casts to
rtx_sequence * and use insn method when invoking get_attr_length.
* config/bfin/bfin.c (type_for_anomaly): Strengthen param from rtx
to rtx_insn *.  Replace a GET_CODE check with a dyn_cast to
rtx_sequence *, introducing a local "seq", using its insn method
for typesafety and clarity.
(add_sched_insns_for_speculation): Strengthen local "next" from
rtx to rtx_insn *.
* config/c6x/c6x.c (get_insn_side): Likewise for param "insn".
(predicate_insn): Likewise.
* config/cris/cris-protos.h (cris_notice_update_cc): Likewise for
second param.
* config/cris/cris.c (cris_notice_update_cc): Likewise.
* config/epiphany/epiphany-protos.h
(extern void epiphany_insert_mode_switch_use): Likewise for param
"insn".
(get_attr_sched_use_fpu): Likewise for param.
* config/epiphany/epiphany.c (epiphany_insert_mode_switch_use):
Likewise for param "insn".
* config/epiphany/mode-switch-use.c (insert_uses): Likewise for
param "insn" of "target_insert_mode_switch_use" callback.
* config/frv/frv.c (frv_insn_unit): Likewise for param "insn".
(frv_issues_to_branch_unit_p): Likewise.
(frv_pack_insn_p): Likewise.
(frv_compare_insns): Strengthen locals "insn1" and "insn2" from
const rtx * (i.e. mutable rtx_def * const *) to
rtx_insn * const *.
* config/i386/i386-protos.h (standard_sse_constant_opcode):
Strengthen first param from rtx to rtx_insn *.
(output_fix_trunc): Likewise.
* config/i386/i386.c (standard_sse_constant_opcode): Likewise.
(output_fix_trunc): Likewise.
(core2i7_first_cycle_multipass_filter_ready_try): Likewise for
local "insn".
(min_insn_size): Likewise for param "insn".
(get_mem_group): Likewise.
(is_cmp): Likewise.
(get_insn_path): Likewise.
(get_insn_group): Likewise.
(count_num_restricted): Likewise.
(fits_dispatch_window): Likewise.
(add_insn_window): Likewis

[PATCH 2/4] The TARGET_CAN_FOLLOW_JUMP hook takes insns

2014-09-12 Thread David Malcolm
gcc/ChangeLog:
* config/arc/arc.c (arc_can_follow_jump): Strengthen both params
from const_rtx to const rtx_insn *.  Update union members from rtx
to rtx_insn *.
* doc/tm.texi (TARGET_CAN_FOLLOW_JUMP): Autogenerated change.
* hooks.c (hook_bool_const_rtx_const_rtx_true): Rename to...
(hook_bool_const_rtx_insn_const_rtx_insn_true): ...this, and
strengthen both params from const_rtx to const rtx_insn *.
* hooks.h (hook_bool_const_rtx_const_rtx_true): Likewise.
(hook_bool_const_rtx_insn_const_rtx_insn_true): Likewise.
* reorg.c (follow_jumps): Strengthen param "jump" from rtx to
rtx_insn *.
* target.def (can_follow_jump): Strengthen both params from
const_rtx to const rtx_insn *, and update default implementation
from hook_bool_const_rtx_const_rtx_true to
hook_bool_const_rtx_insn_const_rtx_insn_true.
---
 gcc/config/arc/arc.c | 7 ---
 gcc/doc/tm.texi  | 2 +-
 gcc/hooks.c  | 6 +++---
 gcc/hooks.h  | 3 ++-
 gcc/reorg.c  | 2 +-
 gcc/target.def   | 4 ++--
 6 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 2f08e7c..9dd19de 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -420,7 +420,8 @@ arc_vector_mode_supported_p (enum machine_mode mode)
 /* TARGET_PRESERVE_RELOAD_P is still awaiting patch re-evaluation / review.  */
 static bool arc_preserve_reload_p (rtx in) ATTRIBUTE_UNUSED;
 static rtx arc_delegitimize_address (rtx);
-static bool arc_can_follow_jump (const_rtx follower, const_rtx followee);
+static bool arc_can_follow_jump (const rtx_insn *follower,
+const rtx_insn *followee);
 
 static rtx frame_insn (rtx);
 static void arc_function_arg_advance (cumulative_args_t, enum machine_mode,
@@ -9214,10 +9215,10 @@ arc_decl_pretend_args (tree decl)
   to redirect two breqs.  */
 
 static bool
-arc_can_follow_jump (const_rtx follower, const_rtx followee)
+arc_can_follow_jump (const rtx_insn *follower, const rtx_insn *followee)
 {
   /* ??? get_attr_type is declared to take an rtx.  */
-  union { const_rtx c; rtx r; } u;
+  union { const rtx_insn *c; rtx_insn *r; } u;
 
   u.c = follower;
   if (CROSSING_JUMP_P (followee))
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 685c9b2..1a19dcd 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10827,7 +10827,7 @@ filling of delay slots can result in branches being 
redirected, and this
 may in turn cause a branch offset to overflow.
 @end defmac
 
-@deftypefn {Target Hook} bool TARGET_CAN_FOLLOW_JUMP (const_rtx 
@var{follower}, const_rtx @var{followee})
+@deftypefn {Target Hook} bool TARGET_CAN_FOLLOW_JUMP (const rtx_insn 
*@var{follower}, const rtx_insn *@var{followee})
 FOLLOWER and FOLLOWEE are JUMP_INSN instructions;  return true if FOLLOWER may 
be modified to follow FOLLOWEE;  false, if it can't.  For example, on some 
targets, certain kinds of branches can't be made to  follow through a hot/cold 
partitioning.
 @end deftypefn
 
diff --git a/gcc/hooks.c b/gcc/hooks.c
index 3f11354..6000b98 100644
--- a/gcc/hooks.c
+++ b/gcc/hooks.c
@@ -116,10 +116,10 @@ hook_bool_mode_rtx_true (enum machine_mode mode 
ATTRIBUTE_UNUSED,
   return true;
 }
 
-/* Generic hook that takes (rtx, rtx) and returns true.  */
+/* Generic hook that takes (const rtx_insn *, const rtx_insn *) and returns 
true.  */
 bool
-hook_bool_const_rtx_const_rtx_true (const_rtx follower ATTRIBUTE_UNUSED,
-   const_rtx followee ATTRIBUTE_UNUSED)
+hook_bool_const_rtx_insn_const_rtx_insn_true (const rtx_insn *follower 
ATTRIBUTE_UNUSED,
+ const rtx_insn *followee 
ATTRIBUTE_UNUSED)
 {
   return true;
 }
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 27ad09d..11811c2 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -36,7 +36,8 @@ extern bool hook_bool_mode_const_rtx_false (enum 
machine_mode, const_rtx);
 extern bool hook_bool_mode_const_rtx_true (enum machine_mode, const_rtx);
 extern bool hook_bool_mode_rtx_false (enum machine_mode, rtx);
 extern bool hook_bool_mode_rtx_true (enum machine_mode, rtx);
-extern bool hook_bool_const_rtx_const_rtx_true (const_rtx, const_rtx);
+extern bool hook_bool_const_rtx_insn_const_rtx_insn_true (const rtx_insn *,
+ const rtx_insn *);
 extern bool hook_bool_mode_uhwi_false (enum machine_mode,
   unsigned HOST_WIDE_INT);
 extern bool hook_bool_tree_false (tree);
diff --git a/gcc/reorg.c b/gcc/reorg.c
index 28401dd..ee97927 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -2307,7 +2307,7 @@ fill_simple_delay_slots (int non_jumps_p)
set *CROSSING to true, otherwise set it to false.  */
 
 static rtx
-follow_jumps (rtx label, rtx jump, bool *crossing)
+follow_jumps (rtx label, rtx_insn *jump, bool *crossing)
 {
   rtx_insn *insn;
   rtx_insn *next;
d

[PATCH 3/4] The various TARGET_ASM_..._MAX_SKIP hooks take an insn

2014-09-12 Thread David Malcolm
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_loop_align_max_skip): Strengthen
param "label" from rtx to rtx_insn *.
* config/rx/rx.c (rx_max_skip_for_label): Likewise for param "lab"
and local "op".
* doc/tm.texi (TARGET_ASM_JUMP_ALIGN_MAX_SKIP): Autogenerated changes.
(TARGET_ASM_LABEL_ALIGN_AFTER_BARRIER_MAX_SKIP): Likewise.
(TARGET_ASM_LOOP_ALIGN_MAX_SKIP): Likewise.
(TARGET_ASM_LABEL_ALIGN_MAX_SKIP): Likewise.
* final.c (default_label_align_after_barrier_max_skip): Strengthen
param from rtx to rtx_insn *.
(default_loop_align_max_skip): Likewise.
(default_label_align_max_skip): Likewise.
(default_jump_align_max_skip): Likewise.
* target.def (label_align_after_barrier_max_skip): Likewise.
(loop_align_max_skip): Likewise.
(label_align_max_skip): Likewise.
(jump_align_max_skip): Likewise.
* targhooks.h (default_label_align_after_barrier_max_skip):
Likewise.
(default_loop_align_max_skip): Likewise.
(default_label_align_max_skip): Likewise.
(default_jump_align_max_skip): Likewise.
---
 gcc/config/rs6000/rs6000.c | 2 +-
 gcc/config/rx/rx.c | 6 +++---
 gcc/doc/tm.texi| 8 
 gcc/final.c| 8 
 gcc/target.def | 8 
 gcc/targhooks.h| 8 
 6 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2141bc0..eca7aec 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4134,7 +4134,7 @@ rs6000_loop_align (rtx label)
 
 /* Implement TARGET_LOOP_ALIGN_MAX_SKIP. */
 static int
-rs6000_loop_align_max_skip (rtx label)
+rs6000_loop_align_max_skip (rtx_insn *label)
 {
   return (1 << rs6000_loop_align (label)) - 1;
 }
diff --git a/gcc/config/rx/rx.c b/gcc/config/rx/rx.c
index 549a443..e177fac 100644
--- a/gcc/config/rx/rx.c
+++ b/gcc/config/rx/rx.c
@@ -3207,15 +3207,15 @@ rx_align_for_label (rtx lab, int uses_threshold)
 }
 
 static int
-rx_max_skip_for_label (rtx lab)
+rx_max_skip_for_label (rtx_insn *lab)
 {
   int opsize;
-  rtx op;
+  rtx_insn *op;
 
   if (optimize_size)
 return 0;
 
-  if (lab == NULL_RTX)
+  if (lab == NULL)
 return 0;
 
   op = lab;
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 1a19dcd..396909f 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -8871,7 +8871,7 @@ to set the variable @var{align_jumps} in the target's
 selection in @var{align_jumps} in a @code{JUMP_ALIGN} implementation.
 @end defmac
 
-@deftypefn {Target Hook} int TARGET_ASM_JUMP_ALIGN_MAX_SKIP (rtx @var{label})
+@deftypefn {Target Hook} int TARGET_ASM_JUMP_ALIGN_MAX_SKIP (rtx_insn 
*@var{label})
 The maximum number of bytes to skip before @var{label} when applying
 @code{JUMP_ALIGN}.  This works only if
 @code{ASM_OUTPUT_MAX_SKIP_ALIGN} is defined.
@@ -8886,7 +8886,7 @@ to be done at such a time.  Most machine descriptions do 
not currently
 define the macro.
 @end defmac
 
-@deftypefn {Target Hook} int TARGET_ASM_LABEL_ALIGN_AFTER_BARRIER_MAX_SKIP 
(rtx @var{label})
+@deftypefn {Target Hook} int TARGET_ASM_LABEL_ALIGN_AFTER_BARRIER_MAX_SKIP 
(rtx_insn *@var{label})
 The maximum number of bytes to skip before @var{label} when applying
 @code{LABEL_ALIGN_AFTER_BARRIER}.  This works only if
 @code{ASM_OUTPUT_MAX_SKIP_ALIGN} is defined.
@@ -8906,7 +8906,7 @@ to set the variable @code{align_loops} in the target's
 selection in @code{align_loops} in a @code{LOOP_ALIGN} implementation.
 @end defmac
 
-@deftypefn {Target Hook} int TARGET_ASM_LOOP_ALIGN_MAX_SKIP (rtx @var{label})
+@deftypefn {Target Hook} int TARGET_ASM_LOOP_ALIGN_MAX_SKIP (rtx_insn 
*@var{label})
 The maximum number of bytes to skip when applying @code{LOOP_ALIGN} to
 @var{label}.  This works only if @code{ASM_OUTPUT_MAX_SKIP_ALIGN} is
 defined.
@@ -8923,7 +8923,7 @@ to set the variable @code{align_labels} in the target's
 selection in @code{align_labels} in a @code{LABEL_ALIGN} implementation.
 @end defmac
 
-@deftypefn {Target Hook} int TARGET_ASM_LABEL_ALIGN_MAX_SKIP (rtx @var{label})
+@deftypefn {Target Hook} int TARGET_ASM_LABEL_ALIGN_MAX_SKIP (rtx_insn 
*@var{label})
 The maximum number of bytes to skip when applying @code{LABEL_ALIGN}
 to @var{label}.  This works only if @code{ASM_OUTPUT_MAX_SKIP_ALIGN}
 is defined.
diff --git a/gcc/final.c b/gcc/final.c
index 1b50e74..d17b61b 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -499,25 +499,25 @@ get_attr_min_length (rtx insn)
 #endif
 
 int
-default_label_align_after_barrier_max_skip (rtx insn ATTRIBUTE_UNUSED)
+default_label_align_after_barrier_max_skip (rtx_insn *insn ATTRIBUTE_UNUSED)
 {
   return 0;
 }
 
 int
-default_loop_align_max_skip (rtx insn ATTRIBUTE_UNUSED)
+default_loop_align_max_skip (rtx_insn *insn ATTRIBUTE_UNUSED)
 {
   return align_loops_max_skip;
 }
 
 int
-default_label_align_max_skip (rtx insn ATTRIBUTE_UNUSED)
+default_label_align

[PATCH 0/4] Use rtx_insn * for instruction attributes

2014-09-12 Thread David Malcolm
This patch series strengthens more things from rtx to rtx_insn *,
eliminating checked casts.

Patches 1-3 are fairly trivial, but patch 4 is substantial,
strengthening the params of all of the various generated
get_attr_* functions from rtx to rtx_insn * (when they're not
constant and thus void), along with various other attribute-style
functions relating to instruction scheduling.

I've bootstrapped each patch cumulatively on x86_64-unknown-linux-gnu
(Fedora 20), and I've successfully built the combination of all four
with a config-list.mk build, building stage 1 for all working
configurations (currently 187 successfully-built configurations).

OK for trunk?


David Malcolm (4):
  deps_start_bb takes an insn
  The TARGET_CAN_FOLLOW_JUMP hook takes insns
  The various TARGET_ASM_..._MAX_SKIP hooks take an insn
  Instruction attributes take an rtx_insn *

 gcc/config/arc/arc-protos.h   |  4 +--
 gcc/config/arc/arc.c  | 35 ++
 gcc/config/arc/arc.h  | 20 +++
 gcc/config/bfin/bfin.c| 10 
 gcc/config/c6x/c6x.c  |  4 +--
 gcc/config/cris/cris-protos.h |  2 +-
 gcc/config/cris/cris.c|  2 +-
 gcc/config/epiphany/epiphany-protos.h |  4 +--
 gcc/config/epiphany/epiphany.c|  2 +-
 gcc/config/epiphany/mode-switch-use.c |  2 +-
 gcc/config/frv/frv.c  | 16 ++--
 gcc/config/i386/i386-protos.h |  4 +--
 gcc/config/i386/i386.c| 28 ++---
 gcc/config/m32c/m32c-protos.h |  2 +-
 gcc/config/m32c/m32c.c|  7 +++---
 gcc/config/m32r/predicates.md |  4 +--
 gcc/config/m68k/m68k-protos.h |  4 +--
 gcc/config/m68k/m68k.c| 10 
 gcc/config/mep/mep-protos.h   |  2 +-
 gcc/config/mep/mep.c  |  2 +-
 gcc/config/mips/mips-protos.h |  4 +--
 gcc/config/mips/mips.c| 12 -
 gcc/config/nds32/nds32-fp-as-gp.c |  2 +-
 gcc/config/nds32/nds32.c  |  2 +-
 gcc/config/pa/pa-protos.h |  2 +-
 gcc/config/pa/pa.c| 12 -
 gcc/config/rl78/rl78.c|  2 +-
 gcc/config/rs6000/rs6000.c| 30 +++
 gcc/config/rx/rx.c|  6 ++---
 gcc/config/s390/s390.c|  4 +--
 gcc/config/sh/sh-protos.h |  4 +--
 gcc/config/sh/sh.c| 12 -
 gcc/config/sparc/sparc-protos.h   |  6 ++---
 gcc/config/sparc/sparc.c  |  6 ++---
 gcc/config/stormy16/stormy16-protos.h |  4 +--
 gcc/config/stormy16/stormy16.c|  6 +++--
 gcc/config/v850/v850-protos.h |  2 +-
 gcc/config/v850/v850.c|  2 +-
 gcc/doc/tm.texi   | 10 
 gcc/final.c   | 24 +-
 gcc/genattr.c | 46 ++-
 gcc/genattrtab.c  | 19 ++-
 gcc/genautomata.c | 43 
 gcc/hooks.c   |  8 +++---
 gcc/hooks.h   |  5 ++--
 gcc/output.h  |  4 +--
 gcc/recog.c   |  2 +-
 gcc/recog.h   |  2 +-
 gcc/reorg.c   | 12 -
 gcc/resource.c|  4 +--
 gcc/sched-deps.c  |  4 +--
 gcc/sched-int.h   |  2 +-
 gcc/sel-sched.c   |  2 +-
 gcc/target.def| 12 -
 gcc/targhooks.h   |  8 +++---
 55 files changed, 260 insertions(+), 229 deletions(-)

-- 
1.8.5.3



[PATCH 1/4] deps_start_bb takes an insn

2014-09-12 Thread David Malcolm
gcc/ChangeLog:
* sched-deps.c (deps_start_bb): Strengthen param "head" and local
"insn" from rtx to rtx_insn *.
* sched-int.h (deps_start_bb): Likewise for 2nd param.
---
 gcc/sched-deps.c | 4 ++--
 gcc/sched-int.h  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index cceff6d..1f3a221 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -3773,7 +3773,7 @@ deps_analyze_insn (struct deps_desc *deps, rtx_insn *insn)
 
 /* Initialize DEPS for the new block beginning with HEAD.  */
 void
-deps_start_bb (struct deps_desc *deps, rtx head)
+deps_start_bb (struct deps_desc *deps, rtx_insn *head)
 {
   gcc_assert (!deps->readonly);
 
@@ -3782,7 +3782,7 @@ deps_start_bb (struct deps_desc *deps, rtx head)
  hard registers correct.  */
   if (! reload_completed && !LABEL_P (head))
 {
-  rtx insn = prev_nonnote_nondebug_insn (head);
+  rtx_insn *insn = prev_nonnote_nondebug_insn (head);
 
   if (insn && CALL_P (insn))
deps->in_post_call_group_p = post_call_initial;
diff --git a/gcc/sched-int.h b/gcc/sched-int.h
index dda394e..033ca59 100644
--- a/gcc/sched-int.h
+++ b/gcc/sched-int.h
@@ -1325,7 +1325,7 @@ extern void haifa_note_reg_use (int);
 
 extern void maybe_extend_reg_info_p (void);
 
-extern void deps_start_bb (struct deps_desc *, rtx);
+extern void deps_start_bb (struct deps_desc *, rtx_insn *);
 extern enum reg_note ds_to_dt (ds_t);
 
 extern bool deps_pools_are_empty_p (void);
-- 
1.8.5.3



RE: [PATCH] gcc parallel make check

2014-09-12 Thread VandeVondele Joost
> So, I’d love to see the numbers for 5 and 20 to double check that 10 is the 
> right number to pick.  This sort of refinement is trivial post checkin.

So, here are some timings with the patch; I think this is great.

Doing the testing you suggest, changing the variable doesn't influence things
much (at least for Fortran, and on this system).

make -j32 -k

check-fortran
  real    3m27.875s -> gcc_runtest_parallelize_counter_minor == 02
          (several testsuite errors: binding_label_tests_10_main.f03,
          binding_label_tests_11_main.f03, class_45b.f03, class_4b.f03,
          class_4c.f03, coarray_29_2.f90,
          test_common_binding_labels_3_main.f03)
  real    3m26.234s -> gcc_runtest_parallelize_counter_minor == 05
          (one additional testsuite error: whole_file_31.f90)
  real    3m36.405s -> gcc_runtest_parallelize_counter_minor == 10
  real    3m38.736s -> gcc_runtest_parallelize_counter_minor == 20
check-c
  real    8m26.935s
check-c++
  real    7m4.165s
check
  real   17m45.185s




Re: [msp430] fix RLAM opcodes

2014-09-12 Thread DJ Delorie

> This fixes cases where negative indices are used for array offsets.
> Committed.
> 
>   * config/msp430/msp430.md (extendhipsi2): Use 20-bit form of RLAM/RRAM.
>   (extend_and_shift1_hipsi2): Likewise.
>   (extend_and_shift2_hipsi2): Likewise.

Committed to 4.9 branch too.


[msp430] fix RLAM opcodes

2014-09-12 Thread DJ Delorie

This fixes cases where negative indices are used for array offsets.
Committed.

* config/msp430/msp430.md (extendhipsi2): Use 20-bit form of RLAM/RRAM.
(extend_and_shift1_hipsi2): Likewise.
(extend_and_shift2_hipsi2): Likewise.
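
For anyone wondering why the 20-bit form matters here, the shift pair can be modelled as below. This is only a sketch of the arithmetic, assuming the .A forms shift the full 20-bit register with bit 19 as the sign bit; the helper names shl20, sar20 and extend_and_shift are invented. Shifting left by 4 moves bit 15 of the HImode value into bit 19, and the arithmetic shift right then copies it back down, which is what makes a negative index come out negative.

```cpp
#include <cassert>
#include <cstdint>

const uint32_t MASK20 = 0xFFFFFu;

// Shift left within a 20-bit register; bits moved past bit 19 are lost.
static uint32_t shl20 (uint32_t r, unsigned n)
{
  return (r << n) & MASK20;
}

// Arithmetic shift right within a 20-bit register: bit 19 is the sign bit
// and is replicated into the vacated high positions.
static uint32_t sar20 (uint32_t r, unsigned n)
{
  r &= MASK20;
  uint32_t sign_fill = (r & 0x80000u) ? ((MASK20 << (20 - n)) & MASK20) : 0;
  return (r >> n) | sign_fill;
}

// Sign-extend a 16-bit value to 20 bits and multiply it by 2^scale:
// left by 4, then right by (4 - scale).
static uint32_t extend_and_shift (uint16_t v, unsigned scale)
{
  assert (scale <= 4);
  return sar20 (shl20 (v, 4), 4 - scale);
}
```

With scale 0, 1 and 2 this mirrors the #4/#4, #4/#3 and #4/#2 shift pairs in the three patterns; for example, 0xFFFF (-1 as a 16-bit value) becomes 0xFFFFF, and 0xFFFE (-2) with scale 1 becomes 0xFFFFC (-4 in 20 bits).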

Index: gcc/config/msp430/msp430.md
===
--- gcc/config/msp430/msp430.md (revision 215228)
+++ gcc/config/msp430/msp430.md (working copy)
@@ -565,13 +565,13 @@
 )
 
 (define_insn "extendhipsi2"
   [(set (match_operand:PSI 0 "nonimmediate_operand" "=r")
(subreg:PSI (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" 
"0")) 0))]
   "TARGET_LARGE"
-  "RLAM #4, %0 { RRAM #4, %0"
+  "RLAM.A #4, %0 { RRAM.A #4, %0"
 )
 
 ;; Look for cases where integer/pointer conversions are suboptimal due
 ;; to missing patterns, despite us not having opcodes for these
 ;; patterns.  Doing these manually allows for alternate optimization
 ;; paths.
@@ -593,21 +593,21 @@
 
 (define_insn "extend_and_shift1_hipsi2"
   [(set (subreg:SI (match_operand:PSI 0 "nonimmediate_operand" "=r") 0)
(ashift:SI (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" 
"0"))
   (const_int 1)))]
   "TARGET_LARGE"
-  "RLAM #4, %0 { RRAM #3, %0"
+  "RLAM.A #4, %0 { RRAM.A #3, %0"
 )
 
 (define_insn "extend_and_shift2_hipsi2"
   [(set (subreg:SI (match_operand:PSI 0 "nonimmediate_operand" "=r") 0)
(ashift:SI (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" 
"0"))
   (const_int 2)))]
   "TARGET_LARGE"
-  "RLAM #4, %0 { RRAM #2, %0"
+  "RLAM.A #4, %0 { RRAM.A #2, %0"
 )
 
 ; Nasty - we are sign-extending a 20-bit PSI value in one register into
 ; two adjacent 16-bit registers to make an SI value.  There is no MSP430X
 ; instruction that will do this, so we push the 20-bit value onto the stack
 ; and then pop it off as two 16-bit values.


Re: [debug-early] reuse variable DIEs and fix their context

2014-09-12 Thread Jason Merrill

On 09/12/2014 01:48 PM, Aldy Hernandez wrote:

Unless I'm misunderstanding something, validate_phases() verifies that
the numbers add up by looking at the actual string name of the phase,
regardless of whether you timevar_push/pop'ed it:


Yes, but why wouldn't the numbers add up?  The comment for 
timevar_push_1 says "No further elapsed time is attributed to the 
previous topmost timing variable on the stack; subsequent elapsed time 
is attributed to TIMEVAR, until it is popped or another element is 
pushed on top."
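
A toy model of that rule (not the timevar implementation; timing_stack and its members are invented, and time is a logical tick counter so the result is deterministic) shows why the per-variable totals always add up to the overall elapsed time: each tick is charged to exactly one variable, the one on top of the stack.

```cpp
#include <cassert>
#include <vector>

// Toy timing stack: elapsed time is charged only to the topmost entry.
struct timing_stack
{
  std::vector<int> stack;      // ids of pushed timing variables
  std::vector<long> elapsed;   // accumulated ticks per id
  long now = 0;                // current logical time
  long last = 0;               // time of the last push or pop

  explicit timing_stack (int n) : elapsed (n, 0) {}

  void advance (long ticks) { now += ticks; }

  // Charge the time since the last event to whatever is on top.
  void charge_top ()
  {
    if (!stack.empty ())
      elapsed[stack.back ()] += now - last;
    last = now;
  }

  void push (int id)
  {
    charge_top ();             // the previous top stops accumulating here
    stack.push_back (id);
  }

  void pop ()
  {
    assert (!stack.empty ());
    charge_top ();
    stack.pop_back ();         // the entry below resumes accumulating
  }
};
```

Pushing variable 0, running 5 ticks, pushing variable 1, running 3 ticks, popping, running 2 more ticks and popping again charges 7 ticks to variable 0 and 3 to variable 1, which together equal the 10 ticks elapsed.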


Jason



Re: [RFA 2/2]: --enable-explicit-exception-frame-registration compatibility option

2014-09-12 Thread Hans-Peter Nilsson
Ping! 

> From: Hans-Peter Nilsson 
> Date: Thu, 4 Sep 2014 23:42:28 +0200

> This adds an option to allow programs and libraries built
> *without* inhibit_libc to stay compatible with system libraries
> (really: libgcc_s.so.1) built *with* inhibit_libc, at the cost
> of the registration.  As mentioned, that's a one-way
> compatibility barrier.
> 
> While it's nice to avoid the overhead of a function call at
> DSO/program initialization time, using eh-registry isn't that
> much overhead until an exception is thrown: most of the
> execution-time overhead will come from the additional
> symbol-table-entries due to the registry calls (typically 2 in
> all libraries and programs, as weak references) and calls to
> thread-locking as the (main) part of its work-load.  Besides the
> symbol-table-entries the footprint size is typically the code for
>   if (__register_frame_info)
> __register_frame_info (__EH_FRAME_BEGIN__, &object);
> and the corresponding deregistration and the static object
> struct (6 or 7 pointers).  (As mentioned, the library-part of
> the eh-registry support is present either way.)  So, a low-cost
> compatibility path is to always call __register_frame_info,
> despite favorable conditions for phdr usage.  Here's a patch to
> optionally do that, controlled by a configure-time option
> tentatively called --enable-explicit-exception-frame-registration
> (subject to bikeshedding if only for the length).  Note
> that there is a cost when an exception *is* thrown, the dreaded
> sorting of FDE:s.  There seems to be some obvious room for
> improvement though, as the same information is available
> *without* sorting through the PT_GNU_EH_FRAME header entry for
> the same file.
> 
> Tested with no regressions after fixing g++.old-deja/g++.eh/badalloc1.C
> (see ) for
> native x86_64-linux before compared to with patch/with patch and
> --enable-explicit-exception-frame-registration/.
> 
> Ok to commit?
> 
> libgcc:
>   * crtstuff.c [EH_FRAME_SECTION_NAME && !USE_PT_GNU_EH_FRAME]:
>   Sanity-check USE_EH_FRAME_REGISTRY_ALWAYS against
>   EH_FRAME_SECTION_NAME, emit error if unsane.
>   (USE_EH_FRAME_REGISTRY): Let USE_EH_FRAME_REGISTRY_ALWAYS
>   override USE_PT_GNU_EH_FRAME.
>   * Makefile.in (FORCE_EXPLICIT_EH_REGISTRY): New
>   variable for substituted force_explicit_eh_registry.
>   (CRTSTUFF_CFLAGS): Add FORCE_EXPLICIT_EH_REGISTRY.
>   * configure.ac (explicit-exception-frame-registration):
>   New AC_ARG_ENABLE.
>   * configure: Regenerate.
> 
> Index: libgcc/Makefile.in
> ===
> --- libgcc/Makefile.in(revision 214759)
> +++ libgcc/Makefile.in(working copy)
> @@ -50,6 +50,8 @@ target_noncanonical = @target_noncanonic
>  # The rules for compiling them should be in the t-* file for the machine.
>  EXTRA_PARTS = @extra_parts@
>  
> +FORCE_EXPLICIT_EH_REGISTRY = @force_explicit_eh_registry@
> +
>  extra-parts = libgcc-extra-parts
>  
>  # Multilib support variables.
> @@ -283,7 +285,7 @@ INTERNAL_CFLAGS = $(CFLAGS) $(LIBGCC2_CF
>  CRTSTUFF_CFLAGS = -O2 $(GCC_CFLAGS) $(INCLUDES) $(MULTILIB_CFLAGS) -g0 \
>-finhibit-size-directive -fno-inline -fno-exceptions \
>-fno-zero-initialized-in-bss -fno-toplevel-reorder -fno-tree-vectorize \
> -  -fno-stack-protector \
> +  -fno-stack-protector $(FORCE_EXPLICIT_EH_REGISTRY) \
>$(INHIBIT_LIBC_CFLAGS)
>  
>  # Extra flags to use when compiling crt{begin,end}.o.
> Index: libgcc/configure.ac
> ===
> --- libgcc/configure.ac   (revision 214759)
> +++ libgcc/configure.ac   (working copy)
> @@ -262,6 +262,22 @@ no)
>;;
>  esac
>  
> +AC_ARG_ENABLE([explicit-exception-frame-registration],
> +  [AC_HELP_STRING([--enable-explicit-exception-frame-registration],
> + [register exception tables explicitly at module start, for use
> +  e.g. for compatibility with installations without PT_GNU_EH_FRAME 
> support])],
> +[
> +force_explicit_eh_registry=
> +if test "$enable_explicit_exception_frame_registration" = yes; then
> +  if test "$enable_sjlj_exceptions" = yes; then
> +AC_MSG_ERROR([Can't enable both of --enable-sjlj-exceptions
> +  and --enable-explicit-exception-frame-registration])
> +  fi
> +  force_explicit_eh_registry=-DUSE_EH_FRAME_REGISTRY_ALWAYS
> +fi
> +])
> +AC_SUBST([force_explicit_eh_registry])
> +
>  AC_LIB_PROG_LD_GNU
>  
>  AC_MSG_CHECKING([for thread model used by GCC])
> Index: libgcc/crtstuff.c
> ===
> --- libgcc/crtstuff.c (revision 214709)
> +++ libgcc/crtstuff.c (working copy)
> @@ -131,7 +131,12 @@ call_ ## FUNC (void) 
> \
>  # define USE_PT_GNU_EH_FRAME
>  #endif
>  
> -#if defined(EH_FRAME_SECTION_N

Ping: [RFA 1/2]: Don't ignore target_header_dir when deciding inhibit_libc

2014-09-12 Thread Hans-Peter Nilsson
Ping! 

> From: Hans-Peter Nilsson 
> Date: Thu, 4 Sep 2014 23:40:40 +0200

> The directory at $target_header_dir is already inspected in
> gcc/configure, for e.g. glibc version and stack protector
> support, but not for setting inhibit_libc.  This is just
> inconsistent and the obvious resolution to me is to inhibit
> inhibit_libc when a target *does* "have its own set of headers",
> to quote the comment above the inhibit_libc setting.  There is
> nothing in the build log for "make all-gcc" that shows a
> difference with/without --with-headers, if headers are actually
> present anyway!
> 
> It may seem that libgcc/configure.ac would be the appropriate
> place to patch and test, but it is gcc/configure.ac which tests
> various things about target headers and makes the inhibit_libc
> decision, exporting it through the generated obj/gcc/libgcc.mvars
> that is included in libgcc/Makefile.
> 
> Tested before/after by "make all-gcc" on native x86_64-linux (*a) and
> seeing it still set (for the peace of most users) in
> gcc/Makefile, and cross to mipsel-linux "make all-gcc" with/without (*b,c) a
> pre-installed set of headers just implied by --prefix and --target to
> observe the intended difference and the same with (*d)
> --with-sysroot (but no headers at the sysroot) and (*e)
> --with-build-sysroot and both (*f) (note that
> --with-build-sysroot=... without --with-sysroot also got
> inhibit_libc) to observe no change for the --with-sysroot one
> (still no inhibit_libc).  The same with --with-headers
> (*g). Also, I checked that nothing other than the inhibit_libc code
> uses target_header_dir or sets the used variables in the block
> of code involved in the move.
> 
> Ok to commit?
> 
> gcc:
>   * configure.ac (target_header_dir): Move block defining
>   this to before the block setting inhibit_libc.
>   (inhibit_libc): When considering $with_headers, just
>   check if it's explicitly "no".  If not, also check if
>   $target_header_dir/stdio.h is present.  If not, set
>   inhibit_libc=true.
>   * configure: Regenerate.
> 
> Index: gcc/configure.ac
> ===
> --- gcc/configure.ac  (revision 214736)
> +++ gcc/configure.ac  (working copy)
> @@ -1924,6 +1924,22 @@ elif test "x$TARGET_SYSTEM_ROOT" != x; t
>  SYSTEM_HEADER_DIR=$build_system_header_dir 
>  fi
>  
> +if test x$host != x$target || test "x$TARGET_SYSTEM_ROOT" != x; then
> +  if test "x$with_headers" != x; then
> +target_header_dir=$with_headers
> +  elif test "x$with_sysroot" = x; then
> +
> target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"
> +  elif test "x$with_build_sysroot" != "x"; then
> +target_header_dir="${with_build_sysroot}${native_system_header_dir}"
> +  elif test "x$with_sysroot" = xyes; then
> +
> target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-root${native_system_header_dir}"
> +  else
> +target_header_dir="${with_sysroot}${native_system_header_dir}"
> +  fi
> +else
> +  target_header_dir=${native_system_header_dir}
> +fi
> +
>  # If this is a cross-compiler that does not
>  # have its own set of headers then define
>  # inhibit_libc
> @@ -1935,7 +1951,7 @@ fi
>  : ${inhibit_libc=false}
>  if { { test x$host != x$target && test "x$with_sysroot" = x ; } ||
> test x$with_newlib = xyes ; } &&
> - { test "x$with_headers" = x || test "x$with_headers" = xno ; } ; then
> + { test "x$with_headers" = xno || test ! -f 
> "$target_header_dir/stdio.h"; } ; then
> inhibit_libc=true
>  fi
>  AC_SUBST(inhibit_libc)
> @@ -4441,22 +4457,6 @@ if test x$with_sysroot = x && test x$hos
> && test "$prefix" != "NONE"; then
>AC_DEFINE_UNQUOTED(PREFIX_INCLUDE_DIR, "$prefix/include",
>  [Define to PREFIX/include if cpp should also search that directory.])
> -fi
> -
> -if test x$host != x$target || test "x$TARGET_SYSTEM_ROOT" != x; then
> -  if test "x$with_headers" != x; then
> -target_header_dir=$with_headers
> -  elif test "x$with_sysroot" = x; then
> -
> target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"
> -  elif test "x$with_build_sysroot" != "x"; then
> -target_header_dir="${with_build_sysroot}${native_system_header_dir}"
> -  elif test "x$with_sysroot" = xyes; then
> -
> target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-root${native_system_header_dir}"
> -  else
> -target_header_dir="${with_sysroot}${native_system_header_dir}"
> -  fi
> -else
> -  target_header_dir=${native_system_header_dir}
>  fi
>  
>  # Determine the version of glibc, if any, used on the target.
> 
> brgds, H-P
> 
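
For readers skimming the diff, the patched inhibit_libc condition can be restated as a small predicate. This is only a sketch in C of the shell test in gcc/configure.ac; the struct and its field names are hypothetical stand-ins for the configure variables, not anything that exists in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of the configure variables the test reads.  */
struct libc_config
{
  bool cross;                /* x$host != x$target */
  bool have_sysroot;         /* "x$with_sysroot" != x */
  bool with_newlib;          /* x$with_newlib = xyes */
  const char *with_headers;  /* value of --with-headers, NULL if unset */
  bool stdio_h_present;      /* test -f "$target_header_dir/stdio.h" */
};

/* Mirrors the patched condition:
   { { cross && no sysroot } || newlib }
   && { with_headers = no || stdio.h missing }.  */
static bool
inhibit_libc_p (const struct libc_config *c)
{
  assert (c != NULL);
  bool no_target_libc = (c->cross && !c->have_sysroot) || c->with_newlib;
  bool headers_refused = c->with_headers != NULL
                         && strcmp (c->with_headers, "no") == 0;
  return no_target_libc && (headers_refused || !c->stdio_h_present);
}
```

Under this reading, a cross build with headers already installed under the prefix no longer gets inhibit_libc, while --with-headers=no or a missing stdio.h still does.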


Re: [debug-early] reuse variable DIEs and fix their context

2014-09-12 Thread Aldy Hernandez

On 09/12/14 10:33, Jason Merrill wrote:

On 09/12/2014 01:10 PM, Aldy Hernandez wrote:

TV_PHASE_DEFERRED, on the other hand, is a bit problematic because it
was originally wrapping the code inside LANG_HOOKS_WRITE_GLOBALS, which
will now reside inside the parser (and is thus included in
TV_PHASE_PARSING now).  Originally it was mutually exclusive with
TV_PHASE_PARSING, but now resides within the parser, so I decided to get
rid of it since it's all technically in the parser.

There is code in timevar*.c that makes sure that TV_PHASE_* elapsed
times add up to the total time.  So we either get rid of
TV_PHASE_DEFERRED and include its time in TV_PHASE_PARSING (avoiding
double counting), or we include a separate, non PHASE timer for it, with
timevar_push(TV_blah) where "blah" is NOT "PHASE".


Why can't it keep the same name and just timevar_push/pop instead of
timevar_start/stop?


Unless I'm misunderstanding something, validate_phases() verifies that 
the numbers add up by looking at the actual string name of the phase, 
regardless of whether you timevar_push/pop'ed it:


static char phase_prefix[] = "phase ";
...
if (strncmp (tv->name, phase_prefix, sizeof phase_prefix - 1)

I could timevar_push/pop it but we'd have to change the name:

DEFTIMEVAR (TV_PHASE_DEFERRED, "phase lang. deferred")
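
To make the point concrete, here is a minimal standalone sketch of that kind of prefix test. It assumes the truncated comparison above ends in "== 0"; the helper name is_phase_timevar is made up for illustration and is not a function in timevar.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static const char phase_prefix[] = "phase ";

/* Return true if NAME starts with "phase ", which is how a timer
   would be recognized as a phase timer no matter how it was started
   (assumed "== 0" comparison; hypothetical helper).  */
static bool
is_phase_timevar (const char *name)
{
  assert (name != NULL);
  return strncmp (name, phase_prefix, sizeof phase_prefix - 1) == 0;
}
```

With such a check, "phase lang. deferred" counts as a phase timer, whereas a renamed "lang. deferred" would not.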

Did I miss something?
Aldy


Re: [PATCH] gcc parallel make check

2014-09-12 Thread Mike Stump
On Sep 12, 2014, at 9:32 AM, Jakub Jelinek  wrote:
> Here is my latest version of the patch.
> 
> With this patch I get identical test_summary output on make -k check
> (completely serial testing) and make -j48 -k check from toplevel directory.
> 
> Major changes since last version:
> 1) I've changed the granularity, now it does O_EXCL|O_CREAT attempt
>   only every 10th runtest_file_p invocation

So, I’d love to see the numbers for 5 and 20 to double check that 10 is the 
right number to pick.  This sort of refinement is trivial post checkin.

> 3) various other *.exp files didn't use runtest_file_p, especially the
>   gcc.misc-tests/ ones, tweaked those like struct-layout-1.exp or
>   plugin.exp so that only the first runtest instance to encounter those
>   runs all of the *.exp file serially

> Regtested on x86_64-linux, ok for trunk?

Ok.  Please be around after you apply it to try and sort out any major fallout.

If someone can check their target post checkin (or help out pre-checkin) and 
report back, that would be nice.  Timings from before and after the checkin, 
together with the core count and -j setting, would be nice too.

I wonder if the libstdc++ problems can be sorted out merely by finding a way to 
sort them so the expensive ones come early (regexp -> 0regexp for example).  
Or, instead of sorting them by name, sort them by some other key (md5 per 
line).  The idea then would be that the chance of all regexp tests being in one 
group is 0.
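
A rough sketch of the "sort by some other key" idea, written in C for illustration only (the testsuite driver itself is Tcl/expect). FNV-1a is swapped in for md5 to keep the example short, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a hash of a test name; a stand-in for the md5 key.  */
static uint32_t
name_key (const char *s)
{
  uint32_t h = 2166136261u;
  assert (s != NULL);
  for (; *s; s++)
    {
      h ^= (unsigned char) *s;
      h *= 16777619u;
    }
  return h;
}

/* qsort comparator: order by hash key, then by name so that the
   result is deterministic even if two names collide.  */
static int
compare_by_key (const void *a, const void *b)
{
  const char *sa = *(const char *const *) a;
  const char *sb = *(const char *const *) b;
  uint32_t ka = name_key (sa), kb = name_key (sb);
  if (ka != kb)
    return ka < kb ? -1 : 1;
  return strcmp (sa, sb);
}

/* Reorder NAMES so that tests sharing a directory prefix are
   scattered instead of being adjacent.  */
static void
scatter_tests (const char **names, size_t n)
{
  qsort (names, n, sizeof *names, compare_by_key);
}
```

Every runtest instance would compute the same order, so the batches stay consistent across instances while the expensive tests are spread out.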

Re: [debug-early] reuse variable DIEs and fix their context

2014-09-12 Thread Jason Merrill

On 09/12/2014 01:10 PM, Aldy Hernandez wrote:

TV_PHASE_DEFERRED, on the other hand, is a bit problematic because it
was originally wrapping the code inside LANG_HOOKS_WRITE_GLOBALS, which
will now reside inside the parser (and is thus included in
TV_PHASE_PARSING now).  Originally it was mutually exclusive with
TV_PHASE_PARSING, but now resides within the parser, so I decided to get
rid of it since it's all technically in the parser.

There is code in timevar*.c that makes sure that TV_PHASE_* elapsed
times add up to the total time.  So we either get rid of
TV_PHASE_DEFERRED and include its time in TV_PHASE_PARSING (avoiding
double counting), or we include a separate, non PHASE timer for it, with
timevar_push(TV_blah) where "blah" is NOT "PHASE".


Why can't it keep the same name and just timevar_push/pop instead of 
timevar_start/stop?


Jason



Re: Remove LIBGCC2_HAS_?F_MODE target macros

2014-09-12 Thread Paul_Koning

On Sep 11, 2014, at 9:22 PM, Joseph S. Myers  wrote:

> This patch removes the LIBGCC2_HAS_{SF,DF,XF,TF}_MODE target macros,
> replacing them by predefines with -fbuilding-libgcc, together with a
> target hook that can influence those predefines when needed.
> 
> The new default is that a floating-point mode is supported in libgcc
> if (a) it passes the scalar_mode_supported_p hook (otherwise it's not
> plausible for it to be supported in libgcc) and (b) it's one of those
> four modes (since those are the modes for which libgcc hardcodes the
> possibility of support).  The target hook can override the default
> choice (in either direction) for modes that pass
> scalar_mode_supported_p (although overriding in the direction of
> returning true when the default would return false only makes sense if
> all relevant functions are specially defined in libgcc for that
> particular target).
> 
> The previous default settings depended on various settings such as
> LIBGCC2_LONG_DOUBLE_TYPE_SIZE, as well as targets defining the above
> target macros if the default wasn't correct.
> 
> The default scalar_mode_supported_p only declares a floating-point
> mode to be supported if it matches one of float / double / long
> double.  This means that in most cases where a mode is only supported
> conditionally in libgcc (TFmode only supported if it's the mode of
> long double, most commonly), the default gets things right.  Overrides
> were needed in the following cases:
> 
> * SFmode would always have been supported in libgcc (the condition was
>  BITS_PER_UNIT == 8, true for all current targets), but pdp11
>  defaults to 64-bit float, and in that case SFmode would fail
>  scalar_mode_supported_p.  I don't know if libgcc actually built for
>  pdp11 (and the port may well no longer be being used), but this
>  patch adds a scalar_mode_supported_p hook to it to ensure SFmode is
>  treated as supported.

I thought it does build.  I continue to work to keep that port alive.

The change looks fine.

The ideal solution, I think, would be to handle the choice of float length that 
the pdp11 target has via the multilib machinery.  Currently it does not do 
that.  If multilib were added for that at some point, would that require a 
change of the code in that hook?

paul
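
The default rule quoted above can be sketched as a small decision function. This is an illustrative model only: the enum, the predicate type and all function names are hypothetical, and the real hooks operate on GCC machine modes rather than this toy enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the floating-point machine modes.  */
enum float_mode { MODE_SF, MODE_DF, MODE_XF, MODE_TF, MODE_OTHER };

typedef bool (*mode_predicate) (enum float_mode);

/* Part (b) of the default: only the four modes that libgcc
   hardcodes can be supported.  */
static bool
default_libgcc_mode_p (enum float_mode m)
{
  return m == MODE_SF || m == MODE_DF || m == MODE_XF || m == MODE_TF;
}

/* A mode is supported in libgcc only if it passes the scalar-mode
   check; after that a target override, if any, decides, otherwise
   the default applies.  */
static bool
libgcc_has_mode (enum float_mode m, mode_predicate scalar_supported,
                 mode_predicate target_override)
{
  assert (scalar_supported != NULL);
  if (!scalar_supported (m))
    return false;
  if (target_override != NULL)
    return target_override (m);
  return default_libgcc_mode_p (m);
}

/* Example target: float is SFmode, double and long double are DFmode.  */
static bool
example_scalar_supported (enum float_mode m)
{
  return m == MODE_SF || m == MODE_DF;
}

/* Example override that only keeps SFmode.  */
static bool
example_override_sf_only (enum float_mode m)
{
  return m == MODE_SF;
}
```

In this model the pdp11 situation corresponds to making the scalar-mode predicate accept SFmode even when float defaults to 64 bits, which is what the added hook is described as doing.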


Re: [debug-early] reuse variable DIEs and fix their context

2014-09-12 Thread Aldy Hernandez

On 09/12/14 08:15, Jason Merrill wrote:

On 09/11/2014 08:51 PM, Aldy Hernandez wrote:

-  timevar_start (TV_PHASE_DEFERRED);



-  timevar_stop (TV_PHASE_DEFERRED);
-  timevar_start (TV_PHASE_OPT_GEN);


Why?


TV_PHASE_OPT_GEN is now in compile_file(), where we call 
finalize_compilation_unit directly.


TV_PHASE_DEFERRED, on the other hand, is a bit problematic because it 
was originally wrapping the code inside LANG_HOOKS_WRITE_GLOBALS, which 
will now reside inside the parser (and is thus included in 
TV_PHASE_PARSING now).  Originally it was mutually exclusive with 
TV_PHASE_PARSING, but now resides within the parser, so I decided to get 
rid of it since it's all technically in the parser.


There is code in timevar*.c that makes sure that TV_PHASE_* elapsed 
times add up to the total time.  So we either get rid of 
TV_PHASE_DEFERRED and include its time in TV_PHASE_PARSING (avoiding 
double counting), or we include a separate, non PHASE timer for it, with 
timevar_push(TV_blah) where "blah" is NOT "PHASE".


Up to you, but I'm highly in favor of getting rid of things ;-).




   /* Generate hidden aliases for Java.  */
-  if (candidates)
+  if (java_hidden_aliases)
 {
-  build_java_method_aliases (candidates);
-  delete candidates;
+  build_java_method_aliases (java_hidden_aliases);
+  delete java_hidden_aliases;
 }


Didn't it work to move this before finalize?  I think the VTV stuff is
all that really needs to come after it, and that can move out of the
front end if this hook is a problem (which I don't really think it is).


I was too chicken to try.  I will do so as a follow up.

I am committing the patch to the branch, and will address both issues 
you speak of in followups.  Let me know what you prefer for the timevar 
issue.


Thanks.
Aldy



Re: [PATCH] gcc parallel make check

2014-09-12 Thread Jakub Jelinek
On Fri, Sep 12, 2014 at 04:36:05PM +, VandeVondele  Joost wrote:
> 
> 
> >> Regtested on x86_64-linux, ok for trunk?
> >
> >Oh, forgot to say, PR56408 isn't fixed by this patch, but given the
> >higher granularity (10 tests instead of 1) we don't happen to trigger it
> >right now.
> 
> which means that any commit to that dir could trigger it, right ?

Sure, if you are unlucky.

I mean, the bug needs to be fixed, but IMHO it doesn't need to be fixed
immediately or as a precondition of this patch.  The bug has been there for
a year or two, though with the patch it might be more likely to be triggered.

Jakub


Re: Stream ODR types

2014-09-12 Thread Jan Hubicka
> > For ODR warnings and TBAA I think I want other types, too.  But yep, we
> > need to gracefully handle component types that do not have names, and we
> > could drop the names of types and handle them as component types as seems
> > fit.
> > 
> > OK, so if you agree, I will go ahead with this patch and we can resolve
> > these details incrementally.
> 
> Yes, but please disable !record type handling for now.
Bugzilla already has a case where we report a useful warning about a union. I
suppose both unions and arrays would also make sense.  I will test a patch
limited to records for now and let's see how much difference it makes (the
real-world warnings I saw were all class types, IMO).

The confusing uint8 warning came from my local hack: if the warning happened
on a component type, I went into the type it was constructed from. The
anonymous arrays indeed have different sizes.  Mainline just reports the type
difference without giving a reason, and while analyzing strange reports on
libreoffice I added this hoping to get extra info. I suppose I should extend
the type mismatch warning to be able to report an array size difference.

Note that uint8 mangling is the same as char's (i.e. typedefs do not matter). So
streaming those should not be terribly expensive, but we can probably just
establish equivalence by main variant, as these ought to be reliably merged?

Honza
> 
> Richard.


Re: [PATCH i386 AVX512] [40/n] Extend vcvtps2ph insn patterns.

2014-09-12 Thread Uros Bizjak
On Fri, Sep 12, 2014 at 3:57 PM, Kirill Yukhin  wrote:

> Patch in the bottom extends vcvtps2ph insn
> patterns
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_insn "vcvtph2ps"): Add masking.
> (define_insn "*vcvtph2ps_load"): Ditto.
> (define_insn "vcvtph2ps256"): Ditto.
> (define_expand "vcvtps2ph_mask"): New.
> (define_insn "*vcvtps2ph"): Add masking.
> (define_insn "*vcvtps2ph_store"): Ditto.
> (define_insn "vcvtps2ph256"): Ditto.

OK.

Thanks,
Uros.


Re: [PATCH i386 AVX512] [39/n] Extend ashrv insn patterns.

2014-09-12 Thread Uros Bizjak
On Fri, Sep 12, 2014 at 3:45 PM, Kirill Yukhin  wrote:

> Patch in the bottom (derived with git diff -w)
> extends ashrv insns patterns.
> I chose to add `_1' to the `VI24_AVX512BW_1' mode iterator
> because it is irregular.
>
> Bootstrapped.
> AVX-512* tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md (define_mode_iterator VI248_AVX512BW_AVX512VL):
> New.
> (define_mode_iterator VI24_AVX512BW_1): Ditto.
> (define_insn "ashr3"): Ditto.
> (define_insn "ashrv2di3"): Ditto.
> (define_insn "ashr3"): Update
> mode iterator.
> (define_expand "ashrv2di3"): Update to enable TARGET_AVX512VL.

Enable also for TARGET_AVX512VL.

OK.

Thanks,
Uros.


RE: [PATCH] gcc parallel make check

2014-09-12 Thread VandeVondele Joost


>> Regtested on x86_64-linux, ok for trunk?
>
>Oh, forgot to say, PR56408 isn't fixed by this patch, but given the
>higher granularity (10 tests instead of 1) we don't happen to trigger it
>right now.

which means that any commit to that dir could trigger it, right ?

Re: [PATCH] gcc parallel make check

2014-09-12 Thread Jakub Jelinek
On Fri, Sep 12, 2014 at 06:32:41PM +0200, Jakub Jelinek wrote:
> Regtested on x86_64-linux, ok for trunk?

Oh, forgot to say, PR56408 isn't fixed by this patch, but given the
higher granularity (10 tests instead of 1) we don't happen to trigger it
right now.

Jakub


Re: [PATCH] gcc parallel make check

2014-09-12 Thread Jakub Jelinek
On Fri, Sep 12, 2014 at 09:47:00AM +, VandeVondele  Joost wrote:
> Obviously, if Jakub's patch can be made to work around the testsuite
> special cases, I believe it should be superior.  If not, the attached
> patch is working as far as I can tell, and provides a significant
> improvement over current trunk.

Here is my latest version of the patch.

With this patch I get identical test_summary output on make -k check
(completely serial testing) and make -j48 -k check from toplevel directory.

Major changes since last version:
1) I've changed the granularity, now it does O_EXCL|O_CREAT attempt
   only every 10th runtest_file_p invocation rather than every iteration,
   so that it spends a little bit less time in expect processes (especially
   if used on slower networked filesystems etc.)
2) in guality.exp I've discovered that it counts as PASS/FAIL also the
   initial check whether any guality tests should be performed; that stuff
   isn't runtest_file_p guarded (of course, needs to be performed by every
   runtest instance), but made the totals dependent on how many runtest
   instances invoked guality.exp; IMHO we don't want the pass/fail counts
   to be volatile, so I've fixed it by not counting that test into the
   results
3) various other *.exp files didn't use runtest_file_p, especially the
   gcc.misc-tests/ ones, tweaked those like struct-layout-1.exp or
   plugin.exp so that only the first runtest instance to encounter those
   runs all of the *.exp file serially
4) fixed go-test.exp so that runtest_file_p used for a test that
   has been already runtest_file_p checked is not testing for parallel
   execution - otherwise the numbers in go-parallel/ directory could be
   different between different instances
5) libstdc++-v3/testsuite has been changed too to use the same stuff;
   note that the new xmethods.exp apparently isn't invoked for
   parallel testing, but it hasn't been before either (and the tree
   I've been testing on the 16way box still didn't have it).  Either
   it should be tested serially like abi.exp or pretty-printers.exp,
   or needs to be double-checked for parallel testing (guess it
   could have similar issues as guality.exp)
6) changed -a to ] && [
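
For anyone unfamiliar with the trick in item 1, here is a hedged C sketch of claiming work through exclusive file creation. The real logic lives in the Tcl code in lib/gcc-defs.exp; the struct, the function names and the marker-file naming below are hypothetical, and only GCC_RUNTEST_PARALLELIZE_DIR and the batch size of 10 come from the description above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BATCH_SIZE 10u

/* Directory shared by all runtest instances, or NULL when the
   variable is not set (serial testing).  */
static const char *
parallel_dir (void)
{
  return getenv ("GCC_RUNTEST_PARALLELIZE_DIR");
}

/* Try to claim batch number BATCH in DIR.  O_CREAT | O_EXCL makes
   the creation atomic, so exactly one process can succeed.
   Return 1 if this process won, 0 otherwise.  */
static int
claim_batch (const char *dir, unsigned batch)
{
  char path[4096];
  int n, fd;

  assert (dir != NULL);
  n = snprintf (path, sizeof path, "%s/%u", dir, batch);
  if (n < 0 || (size_t) n >= sizeof path)
    return 0;
  fd = open (path, O_WRONLY | O_CREAT | O_EXCL, 0644);
  if (fd < 0)
    return 0;
  close (fd);
  return 1;
}

/* Per-instance state: how many tests were seen so far and whether
   the current batch belongs to this instance.  */
struct batcher
{
  const char *dir;
  unsigned calls;
  int mine;
};

/* Decide whether this instance should run the next test.  Only
   every BATCH_SIZE-th call touches the file system.  */
static int
should_run (struct batcher *b)
{
  assert (b != NULL);
  if (b->calls % BATCH_SIZE == 0)
    b->mine = claim_batch (b->dir, b->calls / BATCH_SIZE);
  b->calls++;
  return b->mine;
}
```

Because every instance walks the tests in the same order, each batch of ten is run by whichever instance creates the marker file first, and the file system is touched once per batch rather than once per test.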

Regtested on x86_64-linux, ok for trunk?

2014-09-12  Jakub Jelinek  

gcc/
* Makefile.in (dg_target_exps): Remove.
(check_gcc_parallelize): Change to just an upper bound number.
(check-%-subtargets): Always print the non-parallelized goals.
(check_p_vars, check_p_comma, check_p_subwork): Remove.
(check_p_count, check_p_numbers0, check_p_numbers1, check_p_numbers2,
check_p_numbers3, check_p_numbers4, check_p_numbers5,
check_p_numbers6): New variables.
(check_p_numbers): Set to sequence from 1 to .
(check_p_subdirs): Set to sequence from 1 to minimum of
$(check_p_count) and either GCC_TEST_PARALLEL_SLOTS env var if set,
or 128.
(check-%, check-parallel-%): Rewritten so that for parallelized
testing each job runs all the *.exp files, with
GCC_RUNTEST_PARALLELIZE_DIR set in environment.
gcc/go/
* Make-lang.in (check_go_parallelize): Change to just an upper bound
number.
gcc/fortran/
* Make-lang.in (check_gfortran_parallelize): Change to just an upper
bound number.
gcc/cp/
* Make-lang.in (check_g++_parallelize): Change to just an upper bound
number.
gcc/objc/
* Make-lang.in (check_objc_parallelize): Change to just an upper
bound number.
gcc/testsuite/
* lib/gcc-defs.exp (gcc_parallel_test_run_p,
gcc_parallel_test_enable): New procedures.  If
GCC_RUNTEST_PARALLELIZE_DIR is set in environment, override
runtest_file_p to invoke also gcc_parallel_test_run_p.
* g++.dg/guality/guality.exp (check_guality): Save/restore
test_counts array around the body of the procedure.
* gcc.dg/guality/guality.exp (check_guality): Likewise.
* g++.dg/compat/struct-layout-1.exp: Run all the tests serially
by the first parallel runtest encountering it.
* g++.dg/plugin/plugin.exp: Likewise.
* gcc.dg/compat/struct-layout-1.exp: Likewise.
* gcc.dg/plugin/plugin.exp: Likewise.
* objc.dg/gnu-encoding/gnu-encoding.exp: Likewise.
* gcc.misc-tests/matrix1.exp: Likewise.
* gcc.misc-tests/dhry.exp: Likewise.
* gcc.misc-tests/acker1.exp: Likewise.
* gcc.misc-tests/linkage.exp: Likewise.
* gcc.misc-tests/mg.exp: Likewise.
* gcc.misc-tests/mg-2.exp: Likewise.
* gcc.misc-tests/sort2.exp: Likewise.
* gcc.misc-tests/sieve.exp: Likewise.
* gcc.misc-tests/options.exp: Likewise.
* gcc.misc-tests/help.exp: Likewise.
* go.test/go-test.exp (go-gc-tests): Use
gcc_parallel_test_enable {0, 1} around all handling of
each test.
libstdc++-v3/
* testsuite/Makefile.am (check_p_numbers0, check_p_n

Re: [debug-early] reuse variable DIEs and fix their context

2014-09-12 Thread Jason Merrill

On 09/11/2014 08:51 PM, Aldy Hernandez wrote:

-  timevar_start (TV_PHASE_DEFERRED);



-  timevar_stop (TV_PHASE_DEFERRED);
-  timevar_start (TV_PHASE_OPT_GEN);


Why?


   /* Generate hidden aliases for Java.  */
-  if (candidates)
+  if (java_hidden_aliases)
 {
-  build_java_method_aliases (candidates);
-  delete candidates;
+  build_java_method_aliases (java_hidden_aliases);
+  delete java_hidden_aliases;
 }


Didn't it work to move this before finalize?  I think the VTV stuff is 
all that really needs to come after it, and that can move out of the 
front end if this hook is a problem (which I don't really think it is).


Jason



Re: C++ PATCH for c++/63201 (member variable template specialization)

2014-09-12 Thread Ville Voutilainen
On 12 September 2014 17:41, Jason Merrill  wrote:
> On 09/12/2014 02:06 AM, Ville Voutilainen wrote:
>>
>> I'd expect you want to remove the dg-bogus directives from the test,
>> and the "// odd diagnostic" comment too?
> They aren't necessary, but they give information about former bugs that the
> test found, so I'm inclined to leave them in.


Oh, ok - I mistakenly thought that the dg-bogus might cause problems with the
testsuite reporting.


Re: C++ PATCH for c++/63201 (member variable template specialization)

2014-09-12 Thread Jason Merrill

On 09/12/2014 02:06 AM, Ville Voutilainen wrote:

I'd expect you want to remove the dg-bogus directives from the test,
and the "// odd diagnostic" comment too?


They aren't necessary, but they give information about former bugs that 
the test found, so I'm inclined to leave them in.


Jason



[PATCH 1/2] Always set DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT correctly

2014-09-12 Thread Andi Kleen
From: Andi Kleen 

When profiling is disabled force DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT
for each function to one. This information is then preserved
through LTO.

With this patch, for LTO builds -pg needs to be set on both the
LTO final link and the original source build to allow -pg
(or -pg -fentry) to be active for that source file. This makes it
possible to build large projects mostly with -pg, except for a few files,
and still use LTO.

Originally suggested by Richard Biener
Passes bootstrap and testing on x86_64-linux.

gcc/:

2014-09-11  Andi Kleen  

* function.c (allocate_struct_function): Force
DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT to one when
profiling is disabled.
---
 gcc/function.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/function.c b/gcc/function.c
index c8daf95..f07fdcf 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -4555,6 +4555,9 @@ allocate_struct_function (tree fndecl, bool abstract_p)
  but is this worth the hassle?  */
   cfun->can_throw_non_call_exceptions = flag_non_call_exceptions;
   cfun->can_delete_dead_exceptions = flag_delete_dead_exceptions;
+
+  if (!profile_flag && !flag_instrument_function_entry_exit)
+   DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (fndecl) = 1;
 }
 }
 
-- 
2.1.0



[PATCH 2/2] Add some more test cases for fentry and pg

2014-09-12 Thread Andi Kleen
From: Andi Kleen 

Test fentry and no_instrument_function overriding.

No test cases for the LTO test for now, as the LTO
harness doesn't seem to support different flags for the final
link.

gcc/testsuite/:

2014-09-11  Andi Kleen  

* gcc.dg/pg-override.c: New test.
* gcc.dg/pg.c: New test.
* gcc.target/i386/fentry-override.c: New test.
* gcc.target/i386/fentry.c: New test.
---
 gcc/testsuite/gcc.dg/pg-override.c  | 18 ++
 gcc/testsuite/gcc.dg/pg.c   | 18 ++
 gcc/testsuite/gcc.target/i386/fentry-override.c | 18 ++
 gcc/testsuite/gcc.target/i386/fentry.c  | 18 ++
 4 files changed, 72 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pg-override.c
 create mode 100644 gcc/testsuite/gcc.dg/pg.c
 create mode 100644 gcc/testsuite/gcc.target/i386/fentry-override.c
 create mode 100644 gcc/testsuite/gcc.target/i386/fentry.c

diff --git a/gcc/testsuite/gcc.dg/pg-override.c 
b/gcc/testsuite/gcc.dg/pg-override.c
new file mode 100644
index 000..7cd6680
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pg-override.c
@@ -0,0 +1,18 @@
+/* Test -fprofile override */
+/* { dg-do compile } */
+/* { dg-options "-fprofile" } */
+/* { dg-final { scan-assembler-not "mcount" } } */
+/* Origin: Andi Kleen */
+extern void foobar(const char *);
+
+__attribute__((no_instrument_function)) void func(void)
+{
+  foobar ("Hello world\n");
+}
+
+__attribute__((no_instrument_function)) void func2(void)
+{
+  int i;
+  for (i = 0; i < 10; i++)
+foobar ("Hello world");
+}
diff --git a/gcc/testsuite/gcc.dg/pg.c b/gcc/testsuite/gcc.dg/pg.c
new file mode 100644
index 000..7cd6680
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pg.c
@@ -0,0 +1,18 @@
+/* Test -fprofile override */
+/* { dg-do compile } */
+/* { dg-options "-fprofile" } */
+/* { dg-final { scan-assembler-not "mcount" } } */
+/* Origin: Andi Kleen */
+extern void foobar(const char *);
+
+__attribute__((no_instrument_function)) void func(void)
+{
+  foobar ("Hello world\n");
+}
+
+__attribute__((no_instrument_function)) void func2(void)
+{
+  int i;
+  for (i = 0; i < 10; i++)
+foobar ("Hello world");
+}
diff --git a/gcc/testsuite/gcc.target/i386/fentry-override.c 
b/gcc/testsuite/gcc.target/i386/fentry-override.c
new file mode 100644
index 000..3771f19
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fentry-override.c
@@ -0,0 +1,18 @@
+/* Test -mfentry override */
+/* { dg-do compile } */
+/* { dg-options "-mfentry" } */
+/* { dg-final { scan-assembler-not "__fentry__" } } */
+/* Origin: Andi Kleen */
+extern void foobar(const char *);
+
+void __attribute__((no_instrument_function)) func(void)
+{
+  foobar ("Hello world\n");
+}
+
+void __attribute__((no_instrument_function)) func2(void)
+{
+  int i;
+  for (i = 0; i < 10; i++)
+foobar ("Hello world");
+}
diff --git a/gcc/testsuite/gcc.target/i386/fentry.c 
b/gcc/testsuite/gcc.target/i386/fentry.c
new file mode 100644
index 000..bd3db13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/fentry.c
@@ -0,0 +1,18 @@
+/* Test -mfentry */
+/* { dg-do compile } */
+/* { dg-options "-fprofile -mfentry" } */
+/* { dg-final { scan-assembler "__fentry__" } } */
+/* Origin: Andi Kleen */
+extern void foobar(const char *);
+
+void func(void)
+{
+  foobar ("Hello world\n");
+}
+
+void func2(void)
+{
+  int i;
+  for (i = 0; i < 10; i++)
+foobar ("Hello world");
+}
-- 
2.1.0



[PATCH, committed] params 2 and 3 of reg_set_between_p

2014-09-12 Thread David Malcolm
The attached patch strengthens the "from_insn" and "to_insn" params of
reg_set_between_p from rtx to rtx_insn * (along with some vars at a
couple of callsites), and thus falls under the pre-approval granted by
Jeff here:
  https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01310.html

Bootstrapped on x86_64-unknown-linux-gnu (Fedora 20), and has been
rebuilt as part of a config-list.mk build for all working configurations
(albeit with other patches for the latter case).

Committed to trunk as r215222.
Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c	(revision 215221)
+++ gcc/rtlanal.c	(revision 215222)
@@ -947,10 +947,9 @@
FROM_INSN and TO_INSN (exclusive of those two).  */
 
 int
-reg_set_between_p (const_rtx reg, const_rtx uncast_from_insn, const_rtx to_insn)
+reg_set_between_p (const_rtx reg, const rtx_insn *from_insn,
+		   const rtx_insn *to_insn)
 {
-  const rtx_insn *from_insn =
-    safe_as_a <const rtx_insn *> (uncast_from_insn);
   const rtx_insn *insn;
 
   if (from_insn == to_insn)
Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 215221)
+++ gcc/ChangeLog	(revision 215222)
@@ -1,3 +1,13 @@
+2014-09-12  David Malcolm  
+
+	* config/alpha/alpha.c (alpha_ra_ever_killed): Replace NULL_RTX
+	with NULL when dealing with an insn.
+	* config/sh/sh.c (sh_reorg): Strengthen local "last_float_move"
+	from rtx to rtx_insn *.
+	* rtl.h (reg_set_between_p): Strengthen params 2 and 3 from
+	const_rtx to const rtx_insn *.
+	* rtlanal.c (reg_set_between_p): Likewise, removing a checked cast.
+
 2014-09-12  Trevor Saunders  
 
 	* hash-table.h (gt_pch_nx): Don't call gt_pch_note_object within an
Index: gcc/rtl.h
===
--- gcc/rtl.h	(revision 215221)
+++ gcc/rtl.h	(revision 215222)
@@ -2759,7 +2759,7 @@
 extern int count_occurrences (const_rtx, const_rtx, int);
 extern int reg_referenced_p (const_rtx, const_rtx);
 extern int reg_used_between_p (const_rtx, const rtx_insn *, const rtx_insn *);
-extern int reg_set_between_p (const_rtx, const_rtx, const_rtx);
+extern int reg_set_between_p (const_rtx, const rtx_insn *, const rtx_insn *);
 extern int commutative_operand_precedence (rtx);
 extern bool swap_commutative_operands_p (rtx, rtx);
 extern int modified_between_p (const_rtx, const rtx_insn *, const rtx_insn *);
Index: gcc/config/alpha/alpha.c
===
--- gcc/config/alpha/alpha.c	(revision 215221)
+++ gcc/config/alpha/alpha.c	(revision 215222)
@@ -5001,7 +5001,7 @@
   top = get_insns ();
   pop_topmost_sequence ();
 
-  return reg_set_between_p (gen_rtx_REG (Pmode, REG_RA), top, NULL_RTX);
+  return reg_set_between_p (gen_rtx_REG (Pmode, REG_RA), top, NULL);
 }
 
 
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 215221)
+++ gcc/config/sh/sh.c	(revision 215222)
@@ -6281,7 +6281,8 @@
 	  /* Scan ahead looking for a barrier to stick the constant table
 	 behind.  */
 	  rtx_insn *barrier = find_barrier (num_mova, mova, insn);
-	  rtx last_float_move = NULL_RTX, last_float = 0, *last_float_addr = NULL;
+	  rtx_insn *last_float_move = NULL;
+	  rtx last_float = 0, *last_float_addr = NULL;
 	  int need_aligned_label = 0;
 
 	  if (num_mova && ! mova_p (mova))


Re: [patch] Make std::deque meet C++11 allocator requirements

2014-09-12 Thread Jonathan Wakely

This follow-up patch tells the debug mode deque that its std::deque
base class is a C++11 allocator aware container.

Tested x86_64-linux, committed to trunk.
commit 22ed11e31eb61d45f16b93b64b7b46e02e754b79
Author: Jonathan Wakely 
Date:   Fri Sep 12 12:55:46 2014 +0100

	* include/debug/deque (__gnu_debug::deque): Make base class C++11
	allocator aware.

diff --git a/libstdc++-v3/include/debug/deque b/libstdc++-v3/include/debug/deque
index 824cb28..c17a3e1 100644
--- a/libstdc++-v3/include/debug/deque
+++ b/libstdc++-v3/include/debug/deque
@@ -43,12 +43,12 @@ namespace __debug
 class deque
 : public __gnu_debug::_Safe_container<
 	deque<_Tp, _Allocator>, _Allocator,
-	__gnu_debug::_Safe_sequence, false>,
+	__gnu_debug::_Safe_sequence>,
   public _GLIBCXX_STD_C::deque<_Tp, _Allocator>
 {
   typedef  _GLIBCXX_STD_C::deque<_Tp, _Allocator>		_Base;
   typedef __gnu_debug::_Safe_container<
-	deque, _Allocator, __gnu_debug::_Safe_sequence, false>	_Safe;
+	deque, _Allocator, __gnu_debug::_Safe_sequence>	_Safe;
 
   typedef typename _Base::const_iterator	_Base_const_iterator;
   typedef typename _Base::iterator		_Base_iterator;


[PATCH i386 AVX512] [40/n] Extend vcvtps2ph insn patterns.

2014-09-12 Thread Kirill Yukhin
Hello,
The patch at the bottom extends the vcvtps2ph insn
patterns.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_insn "vcvtph2ps"): Add masking.
(define_insn "*vcvtph2ps_load"): Ditto.
(define_insn "vcvtph2ps256"): Ditto.
(define_expand "vcvtps2ph_mask"): New.
(define_insn "*vcvtps2ph"): Add masking.
(define_insn "*vcvtps2ph_store"): Ditto.
(define_insn "vcvtps2ph256"): Ditto.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index b5ded79..bd321fc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -16423,35 +16423,35 @@
(set_attr "prefix" "maybe_evex")
(set_attr "mode" "")])
 
-(define_insn "vcvtph2ps"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_insn "vcvtph2ps"
+  [(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_select:V4SF
- (unspec:V8SF [(match_operand:V8HI 1 "register_operand" "x")]
+ (unspec:V8SF [(match_operand:V8HI 1 "register_operand" "v")]
   UNSPEC_VCVTPH2PS)
  (parallel [(const_int 0) (const_int 1)
 (const_int 2) (const_int 3)])))]
-  "TARGET_F16C"
-  "vcvtph2ps\t{%1, %0|%0, %1}"
+  "TARGET_F16C || TARGET_AVX512VL"
+  "vcvtph2ps\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "mode" "V4SF")])
 
-(define_insn "*vcvtph2ps_load"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
+(define_insn "*vcvtph2ps_load"
+  [(set (match_operand:V4SF 0 "register_operand" "=v")
(unspec:V4SF [(match_operand:V4HI 1 "memory_operand" "m")]
 UNSPEC_VCVTPH2PS))]
-  "TARGET_F16C"
-  "vcvtph2ps\t{%1, %0|%0, %1}"
+  "TARGET_F16C || TARGET_AVX512VL"
+  "vcvtph2ps\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
(set_attr "prefix" "vex")
(set_attr "mode" "V8SF")])
 
-(define_insn "vcvtph2ps256"
-  [(set (match_operand:V8SF 0 "register_operand" "=x")
-   (unspec:V8SF [(match_operand:V8HI 1 "nonimmediate_operand" "xm")]
+(define_insn "vcvtph2ps256"
+  [(set (match_operand:V8SF 0 "register_operand" "=v")
+   (unspec:V8SF [(match_operand:V8HI 1 "nonimmediate_operand" "vm")]
 UNSPEC_VCVTPH2PS))]
-  "TARGET_F16C"
-  "vcvtph2ps\t{%1, %0|%0, %1}"
+  "TARGET_F16C || TARGET_AVX512VL"
+  "vcvtph2ps\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
(set_attr "prefix" "vex")
(set_attr "btver2_decode" "double")
@@ -16468,6 +16468,19 @@
(set_attr "prefix" "evex")
(set_attr "mode" "V16SF")])
 
+(define_expand "vcvtps2ph_mask"
+  [(set (match_operand:V8HI 0 "register_operand")
+   (vec_merge:V8HI
+ (vec_concat:V8HI
+   (unspec:V4HI [(match_operand:V4SF 1 "register_operand")
+ (match_operand:SI 2 "const_0_to_255_operand")]
+ UNSPEC_VCVTPS2PH)
+   (match_dup 5))
+  (match_operand:V8HI 3 "vector_move_operand")
+  (match_operand:QI 4 "register_operand")))]
+  "TARGET_AVX512VL"
+  "operands[5] = CONST0_RTX (V4HImode);")
+
 (define_expand "vcvtps2ph"
   [(set (match_operand:V8HI 0 "register_operand")
(vec_concat:V8HI
@@ -16478,39 +16491,39 @@
   "TARGET_F16C"
   "operands[3] = CONST0_RTX (V4HImode);")
 
-(define_insn "*vcvtps2ph"
-  [(set (match_operand:V8HI 0 "register_operand" "=x")
+(define_insn "*vcvtps2ph"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
(vec_concat:V8HI
- (unspec:V4HI [(match_operand:V4SF 1 "register_operand" "x")
+ (unspec:V4HI [(match_operand:V4SF 1 "register_operand" "v")
(match_operand:SI 2 "const_0_to_255_operand" "N")]
   UNSPEC_VCVTPS2PH)
  (match_operand:V4HI 3 "const0_operand")))]
-  "TARGET_F16C"
-  "vcvtps2ph\t{%2, %1, %0|%0, %1, %2}"
+  "TARGET_F16C && "
+  "vcvtps2ph\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "mode" "V4SF")])
 
-(define_insn "*vcvtps2ph_store"
+(define_insn "*vcvtps2ph_store"
   [(set (match_operand:V4HI 0 "memory_operand" "=m")
(unspec:V4HI [(match_operand:V4SF 1 "register_operand" "x")
  (match_operand:SI 2 "const_0_to_255_operand" "N")]
 UNSPEC_VCVTPS2PH))]
-  "TARGET_F16C"
-  "vcvtps2ph\t{%2, %1, %0|%0, %1, %2}"
+  "TARGET_F16C || TARGET_AVX512VL"
+  "vcvtps2ph\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
+   (set_attr "prefix" "maybe_evex")
(set_attr "mode" "V4SF")])
 
-(define_insn "vcvtps2ph256"
+(define_insn "vcvtps2ph256"
   [(set (match_operand:V8HI 0 "nonimmediate_operand" "=xm")
(unspec:V8HI [(match_operand:V8SF 1 "register_operand" "x")
  (match_operand:SI 2 "const_0_to_255_operand" "N")]
 UNSPEC_VCVTPS2PH))]
-  "T

Re: [PING] [PATCH] longlong.h: Add prototype for udiv_w_sdiv

2014-09-12 Thread Ian Lance Taylor
On Fri, Sep 12, 2014 at 1:29 AM, Stefan Liebler  wrote:
>
> the patch from Andreas Krebbel
> (https://gcc.gnu.org/ml/gcc-patches/2014-02/msg00194.html) adds a prototype
> for __udiv_w_sdiv to longlong.h if needed.
>
> This fixes a build failure of glibc on s390 31 bit.
> (see "Re: [PATCH] Turn implict-function-declaration warnings into errors",
> https://www.sourceware.org/ml/libc-alpha/2014-09/msg00264.html)
>
> Please review Andreas Krebbel's patch and give okay for commit.

Andreas's patch is OK.

Thanks.

Ian


[PATCH i386 AVX512] [39/n] Extend ashrv insn patterns.

2014-09-12 Thread Kirill Yukhin
Hello,
The patch at the bottom (derived with git diff -w)
extends the ashrv insn patterns.
I chose to add `_1' to the `VI24_AVX512BW_1' mode iterator
because it is irregular.

Bootstrapped.
AVX-512* tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md (define_mode_iterator VI248_AVX512BW_AVX512VL):
New.
(define_mode_iterator VI24_AVX512BW_1): Ditto.
(define_insn "ashr3"): Ditto.
(define_insn "ashrv2di3"): Ditto.
(define_insn "ashr3"): Update
mode iterator.
(define_expand "ashrv2di3"): Update to enable TARGET_AVX512VL.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 73bdd22..b5ded79 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -382,6 +382,15 @@
(V8SI "TARGET_AVX2") V4SI
(V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX2") V2DI])
 
+(define_mode_iterator VI248_AVX512BW_AVX512VL
+  [(V32HI "TARGET_AVX512BW") 
+   (V4DI "TARGET_AVX512VL") V16SI V8DI])
+
+;; Suppose TARGET_AVX512VL as baseline
+(define_mode_iterator VI24_AVX512BW_1
+ [(V16HI "TARGET_AVX512BW") (V8HI "TARGET_AVX512BW")
+  V8SI V4SI])
+   
 (define_mode_iterator VI48_AVX512F
   [(V16SI "TARGET_AVX512F") V8SI V4SI
(V8DI "TARGET_AVX512F") V4DI V2DI])
@@ -9282,12 +9291,40 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "")])
 
+(define_insn "ashr3"
+  [(set (match_operand:VI24_AVX512BW_1 0 "register_operand" "=v,v")
+   (ashiftrt:VI24_AVX512BW_1
+ (match_operand:VI24_AVX512BW_1 1 "nonimmediate_operand" "v,vm")
+ (match_operand:SI 2 "nonmemory_operand" "v,N")))]
+  "TARGET_AVX512VL"
+  "vpsra\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sseishft")
+   (set (attr "length_immediate")
+ (if_then_else (match_operand 2 "const_int_operand")
+   (const_string "1")
+   (const_string "0")))
+   (set_attr "mode" "")])
+
+(define_insn "ashrv2di3"
+  [(set (match_operand:V2DI 0 "register_operand" "=v,v")
+   (ashiftrt:V2DI
+ (match_operand:V2DI 1 "nonimmediate_operand" "v,vm")
+ (match_operand:DI 2 "nonmemory_operand" "v,N")))]
+  "TARGET_AVX512VL"
+  "vpsraq\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sseishft")
+   (set (attr "length_immediate")
+ (if_then_else (match_operand 2 "const_int_operand")
+   (const_string "1")
+   (const_string "0")))
+   (set_attr "mode" "TI")])
+
 (define_insn "ashr3"
-  [(set (match_operand:VI48_512 0 "register_operand" "=v,v")
-   (ashiftrt:VI48_512
- (match_operand:VI48_512 1 "nonimmediate_operand" "v,vm")
+  [(set (match_operand:VI248_AVX512BW_AVX512VL 0 "register_operand" "=v,v")
+   (ashiftrt:VI248_AVX512BW_AVX512VL
+ (match_operand:VI248_AVX512BW_AVX512VL 1 "nonimmediate_operand" "v,vm")
  (match_operand:SI 2 "nonmemory_operand" "v,N")))]
-  "TARGET_AVX512F && "
+  "TARGET_AVX512F"
   "vpsra\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseishft")
(set (attr "length_immediate")
@@ -14912,7 +14949,9 @@
(ashiftrt:V2DI
  (match_operand:V2DI 1 "register_operand")
  (match_operand:DI 2 "nonmemory_operand")))]
-  "TARGET_XOP"
+  "TARGET_XOP || TARGET_AVX512VL"
+{
+  if (!TARGET_AVX512VL)
 {
   rtx reg = gen_reg_rtx (V2DImode);
   rtx par;
@@ -14935,6 +14974,7 @@
 
   emit_insn (gen_xop_shav2di3 (operands[0], operands[1], reg));
   DONE;
+}
 })
 
 ;; XOP FRCZ support


[patch] libstdc++/59603 Prevent self-swapping in random_shuffle

2014-09-12 Thread Jonathan Wakely
Swapping an object with itself is pointless, and asserts in debug mode
(but we should probably remove that check from debug mode, since it
can happen in reasonable code).

Tested x86_64-linux, committed to trunk.

I think this makes sense for the branches too, so will backport it
soon.
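
For readers following along, here is a minimal sketch of the idea, not the
libstdc++ code: a Fisher-Yates loop that skips the swap when the randomly
chosen position is the current one. The name shuffle_no_self_swap and the use
of std::uniform_int_distribution (instead of std::rand() % N) are assumptions
made for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <random>

// Fisher-Yates shuffle that never swaps an element with itself.
// Hypothetical helper for illustration; not the libstdc++ implementation.
template<typename RandomIt, typename Urbg>
void shuffle_no_self_swap(RandomIt first, RandomIt last, Urbg& gen)
{
  if (first == last)
    return;
  using diff_t = typename std::iterator_traits<RandomIt>::difference_type;
  for (RandomIt i = first + 1; i != last; ++i)
    {
      // Pick j uniformly from [first, i]; the distribution avoids the
      // modulo bias of rand() % N.
      std::uniform_int_distribution<diff_t> pick(0, i - first);
      RandomIt j = first + pick(gen);
      if (i != j)  // skip the pointless self-swap
        std::iter_swap(i, j);
    }
}
```

Calling it with a std::vector and a std::mt19937 permutes the elements in
place; an element type whose swap asserts on self-swap in debug mode would no
longer trip that check.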

commit 0a25b348c511172bd53c5fbca35942cc6c362e8b
Author: redi 
Date:   Fri Sep 12 13:30:35 2014 +

	PR libstdc++/59603
	* include/bits/stl_algo.h (random_shuffle): Prevent self-swapping.
	* testsuite/25_algorithms/random_shuffle/59603.cc: New.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215219 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 4c6ca8a..f2dfc20 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -4430,7 +4430,13 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
 
   if (__first != __last)
 	for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
-	  std::iter_swap(__i, __first + (std::rand() % ((__i - __first) + 1)));
+	  {
+	// XXX rand() % N is not uniformly distributed
+	_RandomAccessIterator __j = __first
+	+ std::rand() % ((__i - __first) + 1);
+	if (__i != __j)
+	  std::iter_swap(__i, __j);
+	  }
 }
 
   /**
@@ -4464,7 +4470,11 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
   if (__first == __last)
 	return;
   for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
-	std::iter_swap(__i, __first + __rand((__i - __first) + 1));
+	{
+	  _RandomAccessIterator __j = __first + __rand((__i - __first) + 1);
+	  if (__i != __j)
+	std::iter_swap(__i, __j);
+	}
 }
 
 
diff --git a/libstdc++-v3/testsuite/25_algorithms/random_shuffle/59603.cc b/libstdc++-v3/testsuite/25_algorithms/random_shuffle/59603.cc
new file mode 100644
index 000..7b179ac
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/random_shuffle/59603.cc
@@ -0,0 +1,34 @@
+// Copyright (C) 2014 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-std=gnu++11" }
+// { dg-require-debug-mode "" }
+
+// libstdc++/59603
+
+#include <algorithm>
+#include <vector>
+
+struct C {
+std::vector<int> v;
+C (int a) : v{a} {};
+};
+
+int main () {
+std::vector<C> cs { {1}, {2}, {3}, {4} };
+std::random_shuffle(cs.begin(), cs.end());
+}


Re: ptx preliminary address space fixes [1/4]

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 1:56 PM, Bernd Schmidt  wrote:
> On 09/12/2014 01:48 PM, Richard Biener wrote:
>>>
>>> Still testing whether I actually strictly need it for ARRAY_TYPE nowadays
>>> (these patches are really old...). However, the TYPE_FIELDS of a
>>> RECORD_TYPE
>>> seem to be mostly ignored once the frontends are done, but it's very easy
>>> for other parts of the compiler to take the TREE_TYPE of an ARRAY_TYPE.
>>> Fixing that up is simple and seems like a good thing to do for
>>> consistency
>>> (I notice that maybe I should add VECTOR_TYPE).
>>
>>
>> Well, for an access a->b the COMPONENT_REF specifies the type
>> of the reference which uses the type of the FIELD_DECL...  IVOPTs
>> for example may produce
>>
>>   ptr *p = &a->b;
>>   *p;
>>
>> from that with ptr * built from TREE_TYPE of that expression.
>
>
> Yes, but that expression is the COMPONENT_REF. While that may initially use
> the type from the FIELD_DECL, afterwards it is independent from it (and
> types on COMPONENT_REFs and ARRAY_REFs are changed by the lower-as pass on
> ptx).  We're not really looking at the FIELD_DECLs for anything important
> AFAIK.

You figured out SRA yourself.  Btw, I still detest the use of a lowering
pass for PTX (changing types in-place even more so).  Maybe you
want to do the lowering on RTL where you can simply adjust the
affected MEMs MEM_ATTRs.  And wouldn't it be nice if you can
do similar things on GIMPLE? ;)

Richard.

>
> Bernd
>


[PATCH, PR63229] fix assert in hash_table pch routines

2014-09-12 Thread tsaunders
From: tbsaunde 

Hi,

It should be obvious that changing things within an assert is a bad idea, but
somehow I forgot :(

Tested on x86_64-unknown-linux-gnu, and committed as obvious.
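
To illustrate the pitfall in general terms, here is a small sketch that uses
the standard assert macro under NDEBUG as a stand-in for a checking assert
that is compiled out. The function names (note_object, register_buggy,
register_fixed) are hypothetical and are not the GCC routines.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a function whose side effect must always happen.
bool note_object(std::vector<int>& seen, int id)
{
  seen.push_back(id);
  return true;
}

void register_buggy(std::vector<int>& seen, int id)
{
  // With NDEBUG defined, assert expands to nothing, so the call
  // disappears and `seen` is never updated.
  assert(note_object(seen, id));
}

void register_fixed(std::vector<int>& seen, int id)
{
  // The call always runs; only the check is conditional.
  bool success = note_object(seen, id);
  assert(success);
  (void) success;  // avoid an unused-variable warning under NDEBUG
}
```

Building with and without -DNDEBUG shows the difference: register_buggy only
records the id when assertions are enabled, while register_fixed records it in
both configurations.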

Trev

gcc/ChangeLog:

2014-09-12  Trevor Saunders  

* hash-table.h (gt_pch_nx): Don't call gt_pch_note_object within an
assert.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@215216 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog| 5 +
 gcc/hash-table.h | 5 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c048672..5b27aa8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2014-09-12  Trevor Saunders  
+
+   * hash-table.h (gt_pch_nx): don't call gt_pch_note_object within an
+   assert.
+
 2014-09-12  Joseph Myers  
 
* target.def (libgcc_floating_mode_supported_p): New hook.
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index c2a68fd..028b7de 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -1598,8 +1598,9 @@ template
 static void
 gt_pch_nx (hash_table *h)
 {
-  gcc_checking_assert (gt_pch_note_object (h->m_entries, h,
-  hashtab_entry_note_pointers));
+  bool success ATTRIBUTE_UNUSED
+= gt_pch_note_object (h->m_entries, h, hashtab_entry_note_pointers);
+  gcc_checking_assert (success);
   for (size_t i = 0; i < h->m_size; i++)
 {
   if (hash_table::is_empty (h->m_entries[i])
-- 
2.1.0



Re: ptx preliminary address space fixes [3/4]

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 2:14 PM, Bernd Schmidt  wrote:
> On 09/12/2014 01:45 PM, Richard Biener wrote:
>>
>> Fixing up the vector type in advance is ok with me but I'd like us to
>> move away from address-space-on-types.
>
>
> After thinking about it for a while, this idea makes no sense. Address
> spaces must be represented in the type system somehow - consider a pointer
> to an object in address space 0 vs. a pointer to an object in address space
> 1.  These are different types, they may even have different sizes.
>
> So by adding address spaces to references (_DECLs and _REFs) the only thing
> we'd accomplish is duplicating existing information, with enhanced chances
> of getting inconsistencies.

Well.  There are two parts of address-space support.  First is pointer
types which may have different size/mode.  Second is memory
references to different address-spaces which may require different
insns in the end (RTL).  On RTL we get away with "lowering"
pointers properly (the size/precision should be encoded correctly
on the tree/GIMPLE level as well).  And on RTL we have
the address-space of a MEM in its MEM_ATTRs.  On GIMPLE
we weirdly use some TYPE_QUALS on some type contained
in a memory reference tree.  I'd like to fix the latter by
placing address-space info on the reference itself (like on RTL),
not on the types.

Conveniently that would be on the base object we access
which is either a DECL or a MEM_REF/TARGET_MEM_REF.

Yes, the _frontends_ would be required to properly build
memory references to objects in different address-spaces.

Richard.

>
> Bernd
>


Re: ptx preliminary address space fixes [3/4]

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 1:51 PM, Bernd Schmidt  wrote:
> On 09/12/2014 01:45 PM, Richard Biener wrote:
>>>
>>> Let me know what you prefer.
>>
>>
>> Hmm, neither I suppose.  COMPLEX_TYPEs are also built
>> with main-variant component type and I suspect the same for
>> ARRAY_TYPEs.  I see the address-space on types as
>> artifact that comes from Frontend support (aka parsing).
>
>
>> Fixing up the vector type in advance is ok with me but I'd like us to
>> move away from address-space-on-types.
>
>
> Is that an approval for the first variant in the sense that it's the best we
> can do at the moment? Or are you requiring a rewrite of all the address
> space support in the compiler?

The former.  Of course if you want to spend the time rewriting the
GIMPLE parts of address-space support even better (shouldn't be
too hard given nobody really cares about it too much).

I just think that modeling an API that looks as if we have "fixed"
the GIMPLE parts makes it easier for somebody to do that.

Thanks,
Richard.

>
> Bernd
>


Re: ptx preliminary address space fixes [3/4]

2014-09-12 Thread Bernd Schmidt

On 09/12/2014 01:45 PM, Richard Biener wrote:

Fixing up the vector type in advance is ok with me but I'd like us to
move away from address-space-on-types.


After thinking about it for a while, this idea makes no sense. Address 
spaces must be represented in the type system somehow - consider a 
pointer to an object in address space 0 vs. a pointer to an object in 
address space 1.  These are different types, they may even have 
different sizes.


So by adding address spaces to references (_DECLs and _REFs) the only 
thing we'd accomplish is duplicating existing information, with enhanced 
chances of getting inconsistencies.



Bernd



Re: ptx preliminary address space fixes [1/4]

2014-09-12 Thread Bernd Schmidt

On 09/12/2014 01:48 PM, Richard Biener wrote:

Still testing whether I actually strictly need it for ARRAY_TYPE nowadays
(these patches are really old...). However, the TYPE_FIELDS of a RECORD_TYPE
seem to be mostly ignored once the frontends are done, but it's very easy
for other parts of the compiler to take the TREE_TYPE of an ARRAY_TYPE.
Fixing that up is simple and seems like a good thing to do for consistency
(I notice that maybe I should add VECTOR_TYPE).


Well, for an access a->b the COMPONENT_REF specifies the type
of the reference which uses the type of the FIELD_DECL...  IVOPTs
for example may produce

  ptr *p = &a->b;
  *p;

from that with ptr * built from TREE_TYPE of that expression.


Yes, but that expression is the COMPONENT_REF. While that may initially 
use the type from the FIELD_DECL, afterwards it is independent from it 
(and types on COMPONENT_REFs and ARRAY_REFs are changed by the lower-as 
pass on ptx).  We're not really looking at the FIELD_DECLs for anything 
important AFAIK.



Bernd



Re: ptx preliminary address space fixes [3/4]

2014-09-12 Thread Bernd Schmidt

On 09/12/2014 01:45 PM, Richard Biener wrote:

Let me know what you prefer.


Hmm, neither I suppose.  COMPLEX_TYPEs are also built
with main-variant component type and I suspect the same for
ARRAY_TYPEs.  I see the address-space on types as
artifact that comes from Frontend support (aka parsing).



Fixing up the vector type in advance is ok with me but I'd like us to
move away from address-space-on-types.


Is that an approval for the first variant in the sense that it's the 
best we can do at the moment? Or are you requiring a rewrite of all the 
address space support in the compiler?



Bernd



Re: ptx preliminary address space fixes [2/4]

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 1:17 PM, Bernd Schmidt  wrote:
> On 09/11/2014 01:30 PM, Richard Biener wrote:
>>
>> On Thu, Sep 11, 2014 at 12:12 PM, Bernd Schmidt 
>> wrote:
>>>
>>> This is a bug in SRA which replaces a memory reference without taking
>>> care
>>> to use the correct address space.
>>>
>>> Bootstrapped and tested together with the other patches on x86_64-linux.
>>> Ok?
>>
>>
>> Ok (with adjustments necessary for renaming apply_as_to_type).
>
>
> How about this variant with a suitable reference_apply_addr_space?
>
> Index: gcc/tree-sra.c
> ===
> --- gcc/tree-sra.c.orig
> +++ gcc/tree-sra.c
> @@ -1562,6 +1562,8 @@ build_ref_for_offset (location_t loc, tr
>  exp_type = build_aligned_type (exp_type, align);
>
>mem_ref = fold_build2_loc (loc, MEM_REF, exp_type, base, off);
> +  reference_apply_addr_space (mem_ref,
> + TYPE_ADDR_SPACE (TREE_TYPE (prev_base)));
>if (TREE_THIS_VOLATILE (prev_base))
>  TREE_THIS_VOLATILE (mem_ref) = 1;
>if (TREE_SIDE_EFFECTS (prev_base))

Ok with using reference_addr_space (prev_base).
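
As a rough illustration of the accessor style discussed here, the following
toy model keeps the address space on the reference and hides its location
behind two helpers. Only the helper names reference_addr_space and
reference_apply_addr_space come from this thread; the types (AddrSpace, Type,
MemRef), the function rebuild_reference, and all bodies are hypothetical and
are not GCC's tree or RTL data structures.

```cpp
// Toy model only; not GCC code.
enum class AddrSpace { generic = 0, shared = 1 };

struct Type { int size_in_bytes; };

struct MemRef
{
  const Type *type = nullptr;            // type of the accessed value
  AddrSpace space = AddrSpace::generic;  // stored on the reference itself
};

// Accessors hide where the address space lives, so callers do not care
// whether it is kept on the type or on the reference.
AddrSpace reference_addr_space(const MemRef &ref) { return ref.space; }
void reference_apply_addr_space(MemRef &ref, AddrSpace as) { ref.space = as; }

// Rebuilding a reference (as a pass might when it replaces an access) must
// carry the address space over from the reference it replaces.
MemRef rebuild_reference(const MemRef &prev, const Type *new_type)
{
  MemRef fresh;
  fresh.type = new_type;
  reference_apply_addr_space(fresh, reference_addr_space(prev));
  return fresh;
}
```

If the storage location later moves (for example from a type qualifier to the
reference), only the two accessors would need to change in a design like this.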

Thanks,
Richard.

>
> Bernd


Re: ptx preliminary address space fixes [1/4]

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 1:15 PM, Bernd Schmidt  wrote:
> On 09/11/2014 01:29 PM, Richard Biener wrote:
>>
>> +  if (TREE_CODE (type) == ARRAY_TYPE)
>> +TREE_TYPE (type) = apply_as_to_type (TREE_TYPE (type), as);
>>
>> why is this necessary for ARRAY_TYPE but not for sth like
>> a RECORD_TYPE or a POINTER_TYPE?
>
>
> Still testing whether I actually strictly need it for ARRAY_TYPE nowadays
> (these patches are really old...). However, the TYPE_FIELDS of a RECORD_TYPE
> seem to be mostly ignored once the frontends are done, but it's very easy
> for other parts of the compiler to take the TREE_TYPE of an ARRAY_TYPE.
> Fixing that up is simple and seems like a good thing to do for consistency
> (I notice that maybe I should add VECTOR_TYPE).

Well, for an access a->b the COMPONENT_REF specifies the type
of the reference which uses the type of the FIELD_DECL...  IVOPTs
for example may produce

 ptr *p = &a->b;
 *p;

from that with ptr * built from TREE_TYPE of that expression.

Btw, a similar type as VECTOR_TYPE is COMPLEX_TYPE.

> For a POINTER_TYPE, it is correct not to modify the pointed-to type. We want
> to express that a variable of that pointer type lives in an address space,
> not that the pointed-to type is different.
>
>> The name apply_as_to_type looks odd to me - other address-space
>> related functions use addr_space - can you change it to that please?
>
>
> Will change, and update the other patches accordingly.

Thanks,
Richard.

>
> Bernd
>


Re: ptx preliminary address space fixes [3/4]

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 1:09 PM, Bernd Schmidt  wrote:
> On 09/11/2014 01:39 PM, Richard Biener wrote:
>
>> Seeing this it would be nice to abstract away the exact place we store
>> the address-space in a memory reference.  So - can you add a helper
>> reference_addr_space () please?  Thus do
>>
>>addr_space_t as = reference_addr_space (scalar_dest);
>
>
> Ok, no problem.

Thanks.

>> but then I wonder why not simply build the correct vector types
>> in the first place in vect_analyze_data_refs?
>
>
> We do, kind of. But there are two problems, the first one addressed here in
> the patch:
> - data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd),
> dataref_ptr,
> + data_ref = build2 (MEM_REF, vectype, dataref_ptr,
>
> We use the wrong vector type (vectype has the right address space, but
> vec_oprnd is an SSA_NAME).

Ah, ok.  That part is fine then.

> The second problem is our representation of vector types. When given
>unsigned int
> make_vector_type makes
>vector(2) unsigned int
> rather than
>vector(2)  unsigned int
>
> Since we use elem_type in a few ways in tree_vectorizable_stmt, that looks
> like it would cause problems, but it probably can be addressed in a simpler
> way than what I originally had:
>
> Index: gomp-4_0-branch/gcc/tree-vect-stmts.c
> ===
> --- gomp-4_0-branch.orig/gcc/tree-vect-stmts.c
> +++ gomp-4_0-branch/gcc/tree-vect-stmts.c
> @@ -5037,7 +5037,8 @@ vectorizable_store (gimple stmt, gimple_
>return false;
>  }
>
> -  elem_type = TREE_TYPE (vectype);
> +  elem_type = apply_addr_space_to_type (TREE_TYPE (vectype),
> +   TYPE_ADDR_SPACE (vectype));
>vec_mode = TYPE_MODE (vectype);
>
>/* FORNOW. In some cases can vectorize even if data-type not supported
> @@ -5692,7 +5693,8 @@ vectorizable_load (gimple stmt, gimple_s
>if (!STMT_VINFO_DATA_REF (stmt_info))
>  return false;
>
> -  elem_type = TREE_TYPE (vectype);
> +  elem_type = apply_addr_space_to_type (TREE_TYPE (vectype),
> +   TYPE_ADDR_SPACE (vectype));
>mode = TYPE_MODE (vectype);
>
>/* FORNOW. In some cases can vectorize even if data-type not supported
>
> The other alternative is to fix up make_vector_type. That could look like
> this:
>
> Index: gomp-4_0-branch/gcc/tree.c
> ===
> --- gomp-4_0-branch.orig/gcc/tree.c
> +++ gomp-4_0-branch/gcc/tree.c
> @@ -9470,9 +9470,13 @@ make_vector_type (tree innertype, int nu
>   inner type. Use it to build the variant we return.  */
>if ((TYPE_ATTRIBUTES (innertype) || TYPE_QUALS (innertype))
>&& TREE_TYPE (t) != innertype)
> -return build_type_attribute_qual_variant (t,
> - TYPE_ATTRIBUTES (innertype),
> - TYPE_QUALS (innertype));
> +{
> +  t = build_type_attribute_qual_variant (t,
> +TYPE_ATTRIBUTES (innertype),
> +TYPE_QUALS (innertype));
> +  TREE_TYPE (t) = apply_addr_space_to_type (TREE_TYPE (t),
> +   TYPE_ADDR_SPACE (innertype));
> +}
>
>return t;
>  }
>
> Let me know what you prefer.

Hmm, neither I suppose.  COMPLEX_TYPEs are also built
with main-variant component type and I suspect the same for
ARRAY_TYPEs.  I see the address-space on types as
artifact that comes from Frontend support (aka parsing).

>> Or apply the addr-space to the memory reference with a new helper
>> reference_apply_addr_space
>>
>> - data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd),
>> dataref_ptr,
>>   dataref_offset
>>   ? dataref_offset
>>   : build_int_cst
>> (reference_alias_ptr_type
>> ..
>>   reference_apply_addr_space (data_ref, as);
>>
>> at least that's how it's abstracted on the RTL side.  I think I'd prefer
>> if things would be working that way on the tree level, too.
>
>
> I'm adapting my other patches to use such a function, but in this particular
> case I think it would be best to fix up the types in advance since we build
> accesses in several places and especially vectorizable_load also seems to
> create operations on the pointer type. Are you ok with that?

Fixing up the vector type in advance is ok with me but I'd like us to
move away from address-space-on-types.

Thanks,
Richard.

>
> Bernd
>


Re: [PATCH][match-and-simplify] Finish simplify_bitwise_binary patterns

2014-09-12 Thread Richard Biener
On Fri, 12 Sep 2014, Marc Glisse wrote:

> On Fri, 12 Sep 2014, Richard Biener wrote:
> 
> > +/* x ^ ~0 -> ~x  */
> > (simplify
> >   (bit_and @0 integer_all_onesp)
> >   @0)
> 
> The comment doesn't seem to match.

Thanks - fixed below which also implements simplify_mult and
simplify_not_neg_expr.

Committed.

Richard.

2014-09-12  Richard Biener  

* match-constant-folding.pd (x & ~0 -> x): Move ...
* match-bitwise.pd: ... here.  Implement simplify_not_neg_expr.
* match-plusminus.pd: Likewise.  Implement simplify_mult.

Index: gcc/match-bitwise.pd
===
--- gcc/match-bitwise.pd(revision 215212)
+++ gcc/match-bitwise.pd(working copy)
@@ -135,10 +135,23 @@ along with GCC; see the file COPYING3.
&& TYPE_PRECISION (TREE_TYPE (@1)) == 1)
(le @0 @1)))
 
+/* From tree-ssa-forwprop.c:simplify_not_neg_expr.  */
+
+/* ~~x -> x */
+(simplify
+  (bit_not (bit_not @0))
+  @0)
+/* The corresponding (negate (negate @0)) -> @0 is in match-plusminus.pd.  */
+
 
 /* End of known transform origin.  Note that some bitwise transforms
are handled in match-constant-folding.pd.  */
 
+/* x & ~0 -> x  */
+(simplify
+ (bit_and @0 integer_all_onesp)
+  @0)
+
 /* ~x & ~y -> ~(x | y) */
 (simplify
   (bit_and (bit_not @0) (bit_not @1))
@@ -149,11 +162,6 @@ along with GCC; see the file COPYING3.
   (bit_ior (bit_not @0) (bit_not @1))
   (bit_not (bit_and @0 @1)))
 
-/* ~~x -> x */
-(simplify
-  (bit_not (bit_not @0))
-  @0)
-
 /* Simple association cases that cancel one operand.  */
 
 /* ((a OP b) & ~a) -> (b & ~a) OP 0  */
Index: gcc/match-plusminus.pd
===
--- gcc/match-plusminus.pd  (revision 215212)
+++ gcc/match-plusminus.pd  (working copy)
@@ -30,6 +30,10 @@ along with GCC; see the file COPYING3.
  (simplify
   (minus @0 (negate @1))
   (plus @0 @1))
+ /* From tree-ssa-forwprop.c:simplify_not_neg_expr.  */
+ (simplify
+  (negate (negate @1))
+  @1)
 
  /* We can't reassociate floating-point or fixed-point plus or minus
 because of saturation to +-Inf.  */
@@ -155,3 +159,13 @@ along with GCC; see the file COPYING3.
   (pointer_plus @0 (negate (bit_and (convert @0) INTEGER_CST@1)))
   (with { tree algn = wide_int_to_tree (TREE_TYPE (@0), wi::bit_not (@1)); }
(bit_and @0 { algn; })))
+
+
+/* From tree-ssa-forwprop.c:simplify_mult.  */
+
+/* (X /[ex] A) * A -> X.  */
+(simplify
+  (mult (convert? (exact_div @0 @1)) @1)
+  /* Look through a sign-changing conversion.  */
+  (if (TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))
+   (convert @0)))
Index: gcc/match-constant-folding.pd
===
--- gcc/match-constant-folding.pd   (revision 215212)
+++ gcc/match-constant-folding.pd   (working copy)
@@ -60,11 +60,6 @@ along with GCC; see the file COPYING3.
   (bit_ior @0 integer_all_onesp@1)
   @1)
 
-/* x ^ ~0 -> ~x  */
-(simplify
-  (bit_and @0 integer_all_onesp)
-  @0)
-
 /* x & 0 -> 0  */
 (simplify
   (bit_and @0 integer_zerop@1)


Re: ptx preliminary address space fixes [2/4]

2014-09-12 Thread Bernd Schmidt

On 09/11/2014 01:30 PM, Richard Biener wrote:

On Thu, Sep 11, 2014 at 12:12 PM, Bernd Schmidt  wrote:

This is a bug in SRA which replaces a memory reference without taking care
to use the correct address space.

Bootstrapped and tested together with the other patches on x86_64-linux.
Ok?


Ok (with adjustments necessary for renaming apply_as_to_type).


How about this variant with a suitable reference_apply_addr_space?

Index: gcc/tree-sra.c
===
--- gcc/tree-sra.c.orig
+++ gcc/tree-sra.c
@@ -1562,6 +1562,8 @@ build_ref_for_offset (location_t loc, tr
 exp_type = build_aligned_type (exp_type, align);

   mem_ref = fold_build2_loc (loc, MEM_REF, exp_type, base, off);
+  reference_apply_addr_space (mem_ref,
+ TYPE_ADDR_SPACE (TREE_TYPE (prev_base)));
   if (TREE_THIS_VOLATILE (prev_base))
 TREE_THIS_VOLATILE (mem_ref) = 1;
   if (TREE_SIDE_EFFECTS (prev_base))


Bernd


Re: ptx preliminary address space fixes [1/4]

2014-09-12 Thread Bernd Schmidt

On 09/11/2014 01:29 PM, Richard Biener wrote:

+  if (TREE_CODE (type) == ARRAY_TYPE)
+TREE_TYPE (type) = apply_as_to_type (TREE_TYPE (type), as);

why is this necessary for ARRAY_TYPE but not for sth like
a RECORD_TYPE or a POINTER_TYPE?


Still testing whether I actually strictly need it for ARRAY_TYPE 
nowadays (these patches are really old...). However, the TYPE_FIELDS of 
a RECORD_TYPE seem to be mostly ignored once the frontends are done, but 
it's very easy for other parts of the compiler to take the TREE_TYPE of 
an ARRAY_TYPE. Fixing that up is simple and seems like a good thing to 
do for consistency (I notice that maybe I should add VECTOR_TYPE).


For a POINTER_TYPE, it is correct not to modify the pointed-to type. We 
want to express that a variable of that pointer type lives in an address 
space, not that the pointed-to type is different.
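
The distinction can be illustrated with an ordinary C++ qualifier. In the
sketch below `const` stands in for an address-space qualifier, which is an
assumption made purely for illustration since ISO C++ has no address-space
qualifiers; the alias and function names are hypothetical.

```cpp
#include <type_traits>

// `const` plays the role of the address-space qualifier in this sketch.
using qualified_ptr = int *const;      // the pointer variable is qualified
using ptr_to_qualified = const int *;  // the pointed-to type is qualified
using qualified_array = const int[4];  // qualifying an array

// Qualifying the pointer leaves the pointed-to type untouched.
static_assert(std::is_const<qualified_ptr>::value,
              "the pointer object carries the qualifier");
static_assert(std::is_same<std::remove_pointer<qualified_ptr>::type,
                           int>::value,
              "the pointed-to type is still plain int");

// Qualifying the pointee is a different type altogether.
static_assert(!std::is_const<ptr_to_qualified>::value,
              "the pointer object itself is unqualified");
static_assert(!std::is_same<qualified_ptr, ptr_to_qualified>::value,
              "the two pointer types differ");

// For an array, the qualifier does reach the element type.
static_assert(std::is_const<std::remove_extent<qualified_array>::type>::value,
              "array elements carry the qualifier");

constexpr bool pointee_unchanged_by_pointer_qualifier()
{
  return std::is_same<std::remove_pointer<qualified_ptr>::type, int>::value;
}
```

This mirrors the treatment described above: the element type of an array is
adjusted, while the target of a pointer is left alone.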



The name apply_as_to_type looks odd to me - other address-space
related functions use addr_space - can you change it to that please?


Will change, and update the other patches accordingly.


Bernd



Re: ptx preliminary address space fixes [3/4]

2014-09-12 Thread Bernd Schmidt

On 09/11/2014 01:39 PM, Richard Biener wrote:


Seeing this it would be nice to abstract away the exact place we store
the address-space in a memory reference.  So - can you add a helper
reference_addr_space () please?  Thus do

   addr_space_t as = reference_addr_space (scalar_dest);


Ok, no problem.


but then I wonder why not simply build the correct vector types
in the first place in vect_analyze_data_refs?


We do, kind of. But there are two problems, the first one addressed here 
in the patch:
- data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd), 
dataref_ptr,

+ data_ref = build2 (MEM_REF, vectype, dataref_ptr,

We use the wrong vector type (vectype has the right address space, but 
vec_oprnd is an SSA_NAME).


The second problem is our representation of vector types. When given
   unsigned int
make_vector_type makes
   vector(2) unsigned int
rather than
   vector(2)  unsigned int

Since we use elem_type in a few ways in tree_vectorizable_stmt, that 
looks like it would cause problems, but it probably can be addressed in 
a simpler way than what I originally had:


Index: gomp-4_0-branch/gcc/tree-vect-stmts.c
===
--- gomp-4_0-branch.orig/gcc/tree-vect-stmts.c
+++ gomp-4_0-branch/gcc/tree-vect-stmts.c
@@ -5037,7 +5037,8 @@ vectorizable_store (gimple stmt, gimple_
   return false;
 }

-  elem_type = TREE_TYPE (vectype);
+  elem_type = apply_addr_space_to_type (TREE_TYPE (vectype),
+   TYPE_ADDR_SPACE (vectype));
   vec_mode = TYPE_MODE (vectype);

   /* FORNOW. In some cases can vectorize even if data-type not supported
@@ -5692,7 +5693,8 @@ vectorizable_load (gimple stmt, gimple_s
   if (!STMT_VINFO_DATA_REF (stmt_info))
 return false;

-  elem_type = TREE_TYPE (vectype);
+  elem_type = apply_addr_space_to_type (TREE_TYPE (vectype),
+   TYPE_ADDR_SPACE (vectype));
   mode = TYPE_MODE (vectype);

   /* FORNOW. In some cases can vectorize even if data-type not supported

The other alternative is to fix up make_vector_type. That could look 
like this:


Index: gomp-4_0-branch/gcc/tree.c
===
--- gomp-4_0-branch.orig/gcc/tree.c
+++ gomp-4_0-branch/gcc/tree.c
@@ -9470,9 +9470,13 @@ make_vector_type (tree innertype, int nu
  inner type. Use it to build the variant we return.  */
   if ((TYPE_ATTRIBUTES (innertype) || TYPE_QUALS (innertype))
   && TREE_TYPE (t) != innertype)
-return build_type_attribute_qual_variant (t,
- TYPE_ATTRIBUTES (innertype),
- TYPE_QUALS (innertype));
+{
+  t = build_type_attribute_qual_variant (t,
+TYPE_ATTRIBUTES (innertype),
+TYPE_QUALS (innertype));
+  TREE_TYPE (t) = apply_addr_space_to_type (TREE_TYPE (t),
+   TYPE_ADDR_SPACE (innertype));
+}

   return t;
 }

Let me know what you prefer.


Or apply the addr-space to the memory reference with a new helper
reference_apply_addr_space

- data_ref = build2 (MEM_REF, TREE_TYPE (vec_oprnd), dataref_ptr,
  dataref_offset
  ? dataref_offset
  : build_int_cst (reference_alias_ptr_type
..
  reference_apply_addr_space (data_ref, as);

at least that's how it's abstracted on the RTL side.  I think I'd prefer
if things would be working that way on the tree level, too.


I'm adapting my other patches to use such a function, but in this 
particular case I think it would be best to fix up the types in advance 
since we build accesses in several places and especially 
vectorizable_load also seems to create operations on the pointer type. 
Are you ok with that?



Bernd



Re: [PATCH][match-and-simplify] Finish simplify_bitwise_binary patterns

2014-09-12 Thread Marc Glisse

On Fri, 12 Sep 2014, Richard Biener wrote:


+/* x ^ ~0 -> ~x  */
(simplify
  (bit_and @0 integer_all_onesp)
  @0)


The comment doesn't seem to match.

--
Marc Glisse


[PATCH][match-and-simplify] Fix lto.exp=20110201-1_0.c ICE

2014-09-12 Thread Richard Biener

Committed.

Richard.

2014-09-12  Richard Biener  

* genmatch.c (expr::gen_transform): Properly autocompute
type of REALPART_EXPR and IMAGPART_EXPR.

Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 215210)
+++ gcc/genmatch.c  (working copy)
@@ -898,7 +898,7 @@ expr::gen_transform (FILE *f, const char
 {
   bool conversion_p = is_conversion (operation->op);
   const char *type = expr_type;
-  char optype[20];
+  char optype[64];
   if (type)
 /* If there was a type specification in the pattern use it.  */
 ;
@@ -906,6 +906,15 @@ expr::gen_transform (FILE *f, const char
 /* For conversions we need to build the expression using the
outer type passed in.  */
 type = in_type;
+  else if (*operation->op == REALPART_EXPR
+  || *operation->op == IMAGPART_EXPR)
+{
+  /* __real and __imag use the component type of its operand.  */
+  sprintf (optype, "TREE_TYPE (TREE_TYPE (ops%d[0]))", depth);
+  type = optype;
+  /* Avoid passing in_type / type to operand creation.  */
+  conversion_p = true;
+}
   else
 {
   /* Other operations are of the same type as their first operand.  */
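
As a rough illustration of why the buffer grows (this is an inference from the patch, not something the commit message states): the string written for REALPART_EXPR and IMAGPART_EXPR is longer than 20 bytes. A small standalone check, where optype_length is a hypothetical helper name:

```c
#include <stdio.h>

/* Number of characters, excluding the terminating NUL, that the format
   string from the patch produces for a given DEPTH.  */
int
optype_length (int depth)
{
  return snprintf (NULL, 0, "TREE_TYPE (TREE_TYPE (ops%d[0]))", depth);
}
```

For depth 0 the string needs 31 characters plus the NUL, so it would overflow char optype[20] but fits comfortably in char optype[64].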


[PATCH][match-and-simplify] Finish simplify_bitwise_binary patterns

2014-09-12 Thread Richard Biener

And some more.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2014-09-12  Richard Biener  

* match-bitwise.pd: Complete tree-ssa-forwprop.c patterns
from simplify_bitwise_binary.  Implement some from fold_binary.
* match-constant-folding.pd: Add some comments.
* match-plusminus.pd (~A + 1 -> -A): Fix COMPLEX_TYPE handling.

Index: gcc/match-bitwise.pd
===
--- gcc/match-bitwise.pd(revision 215009)
+++ gcc/match-bitwise.pd(working copy)
@@ -17,86 +17,211 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3.  If not see
 .  */
 
-/* TODO bitwise patterns:
-1] x & x -> x
-2] x & 0 -> 0
-3] x & -1 -> x
-4] x & ~x -> 0
-5] ~x & ~y -> ~(x | y)
-6] ~x | ~y -> ~(x & y)
-7] x & (~x | y) -> y & x
-8] (x | CST1) & CST2  ->  (x & CST2) | (CST1 & CST2)
-9] x ^ x -> 0
-10] x ^ ~0 -> ~x
-11] (x | y) & x -> x
-12] (x & y) | x -> x
-13] (~x | y) & x -> x & y
-14] (~x & y) | x -> x | y
-15] ((a & b) & ~a) & ~b -> 0
-16] ~~x -> x
-*/
-
-/* x & x -> x */
-(simplify
-  (bit_and integral_op_p@0 @0)
-  @0)
-
-/* x & ~x -> 0 */
-(simplify
-  (bit_and:c integral_op_p@0 (bit_not @0))
-  { build_int_cst (type, 0); })
-
-/* ~x & ~y -> ~(x | y) */
-(simplify
-  (bit_and (bit_not integral_op_p@0) (bit_not @1))
-  (bit_not (bit_ior @0 @1)))
-
-/* ~x | ~y -> ~(x & y) */
-(simplify
-  (bit_ior (bit_not integral_op_p@0) (bit_not @1))
-  (bit_not (bit_and @0 @1)))
-
-/* x & (~x | y) -> y & x */
-(simplify
-  (bit_and:c integral_op_p@0 (bit_ior:c (bit_not @0) @1))
-  (bit_and @1 @0))
+/* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
+   when profitable.  */
+/* For bitwise binary operations apply operand conversions to the
+   binary operation result instead of to the operands.  This allows
+   to combine successive conversions and bitwise binary operations.  */
+/* We combine the above two cases by using a conditional convert.  */
+(for bitop (bit_and bit_ior bit_xor)
+ (simplify
+  (bitop (convert @0) (convert? @1))
+  (if (((TREE_CODE (@1) == INTEGER_CST
+&& INTEGRAL_TYPE_P (TREE_TYPE (@0))
+&& int_fits_type_p (@1, TREE_TYPE (@0)))
+   || types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
+   && (/* That's a good idea if the conversion widens the operand, thus
+ after hoisting the conversion the operation will be narrower.  */
+  TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
+  /* It's also a good idea if the conversion is to a non-integer
+ mode.  */
+  || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
+  /* Or if the precision of TO is not the same as the precision
+ of its mode.  */
+  || TYPE_PRECISION (type) != GET_MODE_PRECISION (TYPE_MODE (type))))
+   (convert (bitop @0 (convert @1))))))
+
+/* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
+(for bitop (bit_and bit_ior bit_xor)
+ (simplify
+  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
+  (bit_and (bitop @0 @2) @1)))
 
 /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
 (simplify
-  (bit_and (bit_ior integral_op_p@0 INTEGER_CST@1) INTEGER_CST@2)
+  (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
   (bit_ior (bit_and @0 @2) (bit_and @1 @2)))
 
-/* x ^ ~0 -> ~x */
+/* Combine successive equal operations with constants.  */
+(for bitop (bit_and bit_ior bit_xor)
+ (simplify
+  (bitop (bitop @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
+  (bitop @0 (bitop @1 @2))))
+
+/* Canonicalize X ^ ~0 to ~X.  */
 (simplify
   (bit_xor @0 integer_all_onesp@1)
   (bit_not @0))
 
-/* (x | y) & x -> x */
-(simplify
-  (bit_and:c (bit_ior integral_op_p@0 @1) @0)
+/* Try simple folding for X op !X, and X op X.
+   From tree-ssa-forwprop.c:simplify_bitwise_binary_1.  */
+/* x & x -> x,  x | x -> x  */
+(for bitop (bit_and bit_ior)
+ (simplify
+  (bitop @0 @0)
+  @0))
+/* X & ~X -> 0.  */
+(simplify
+  (bit_and:c @0 (bit_not @0))
+  { build_zero_cst (type); })
+/* X | ~X -> ~0,  X ^ ~X -> ~0.  */
+(for bitop (bit_ior bit_xor)
+ (simplify
+  (bitop:c @0 (bit_not @0))
+  { build_all_ones_cst (type); }))
+/* ???  The following ones are suspicious and want generalization.
+   Also X != 1 vs. X ^ 1 vs ~X wants canonicalization for truth
+   values.  */
+#if 0  /* FIXME.  truth_valued_ssa_name isn't exported either.  */
+(for bitop (bit_and bit_ior bit_xor)
+ /* X & (X == 0) -> 0.  */
+ /* X | (X == 0) -> 1.  */
+ (simplify
+  (bitop:c @0 (eq @0 integer_zerop))
+  (if (truth_valued_ssa_name (@0))
+   { bitop == BIT_AND ? build_zero_cst (type) : build_one_cst (type); }))
+ /* X & (X != 1) -> 0,  X & (X ^ 1) -> 0 for truth-valued X.  */
+ /* X | (X != 1) -> 1,  X | (X ^ 1) -> 1 for truth-valued X.  */
+ (for op (ne bit_xor)
+  (simplify
+   (bitop:c @0 (op @0 integer_onep))
+   (if (truth_valued_ssa_name (@0))
+{ bitop == BIT_AND ? build_zero_cst (type) : build_one_cst (type
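

The algebra behind several of the patterns above can be checked by brute force. The following is a minimal sketch, not part of the patch: it verifies the identities on all 8-bit values in plain C, and count_identity_failures is a hypothetical helper name.

```c
/* Apply one of the three bitwise operators: 0 is &, 1 is |, 2 is ^.  */
static unsigned
apply (int op, unsigned x, unsigned y)
{
  switch (op)
    {
    case 0: return x & y;
    case 1: return x | y;
    default: return x ^ y;
    }
}

/* Count how many times any of the identities fails over all 8-bit
   operands.  A result of 0 means every identity held.  */
unsigned long
count_identity_failures (void)
{
  const unsigned mask = 0xFFu;
  unsigned long failures = 0;

  for (unsigned a = 0; a <= mask; a++)
    {
      unsigned not_a = ~a & mask;
      failures += (a ^ mask) != not_a;          /* X ^ ~0 -> ~X */
      failures += (a & a) != a;                 /* x & x -> x */
      failures += (a | a) != a;                 /* x | x -> x */
      failures += (a & not_a) != 0;             /* X & ~X -> 0 */
      failures += (a | not_a) != mask;          /* X | ~X -> ~0 */
      failures += (a ^ not_a) != mask;          /* X ^ ~X -> ~0 */

      for (unsigned b = 0; b <= mask; b++)
        for (unsigned c = 0; c <= mask; c++)
          {
            /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
            failures += ((a | b) & c) != ((a & c) | (b & c));
            for (int op = 0; op < 3; op++)
              {
                /* (A & B) OP (C & B) -> (A OP C) & B */
                failures += apply (op, a & b, c & b)
                            != (apply (op, a, c) & b);
                /* (x OP CST1) OP CST2 -> x OP (CST1 OP CST2) */
                failures += apply (op, apply (op, a, b), c)
                            != apply (op, a, apply (op, b, c));
              }
          }
    }
  return failures;
}
```

This only checks the arithmetic on unsigned 8-bit values; it says nothing about the type conditions or conversions the real patterns have to respect.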

Re: integer_each_onep + vector bug

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 12:04 PM, Marc Glisse  wrote:
> Hello,
>
> here is a new predicate which tests if a number is 1, or (for vector and
> complex) a collection of 1. The only difference with integer_onep should be
> for complex where it wants 1+i and not 1. The main use would be in the match
> branch, so I didn't waste too much time adding many uses, I just added a
> couple (so the function won't be garbage collected in some refactoring)
> though I wasn't able to create a testcase (for complex int, '&' or '^' are
> rejected and '~' means conjugate).
>
> While looking for potential uses for integer_each_onep, I couldn't help
> noticing a few wrong optimizations for vectors, that I am fixing at the same
> time.

Yeah, the transforms you disable are questionable for GIMPLE anyway
as (X & 1) == 0 needs to have boolean type there and we like the
other forms more.  This looks like a transform targeted at RTL expansion
to me (or one that is profitable if the expression feeds a conditional
which we can merge it into - like the (~X & Y) -> X < Y transforms
tree-ssa-forwprop.c does).

Ok.

Thanks,
Richard.

>
> Bootstrap+testsuite on x86_64-linux-gnu.
>
>
> 2014-09-12  Marc Glisse  
>
> gcc/
> * tree.c (integer_each_onep): New function.
> * tree.h (integer_each_onep): Declare it.
> * fold-const.c (fold_binary_loc): Use it for ~A + 1 to -A and
> -A - 1 to ~A.  Disable (X & 1) ^ 1, (X ^ 1) & 1 and ~X & 1 to
> (X & 1) == 0 for vector and complex.
> gcc/testsuite/
> * gcc.dg/vec-andxor1.c: New file.
>
>
> --
> Marc Glisse
> Index: fold-const.c
> ===
> --- fold-const.c(revision 215179)
> +++ fold-const.c(working copy)
> @@ -10085,21 +10085,21 @@ fold_binary_loc (location_t loc,
>   && (flag_sanitize & SANITIZE_SI_OVERFLOW) == 0)
> return fold_build2_loc (loc, MINUS_EXPR, type,
> fold_convert_loc (loc, type, arg1),
> fold_convert_loc (loc, type,
>   TREE_OPERAND (arg0, 0)));
>
>if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
> {
>   /* Convert ~A + 1 to -A.  */
>   if (TREE_CODE (arg0) == BIT_NOT_EXPR
> - && integer_onep (arg1))
> + && integer_each_onep (arg1))
> return fold_build1_loc (loc, NEGATE_EXPR, type,
> fold_convert_loc (loc, type,
>   TREE_OPERAND (arg0, 0)));
>
>   /* ~X + X is -1.  */
>   if (TREE_CODE (arg0) == BIT_NOT_EXPR
>   && !TYPE_OVERFLOW_TRAPS (type))
> {
>   tree tem = TREE_OPERAND (arg0, 0);
>
> @@ -10612,23 +10612,22 @@ fold_binary_loc (location_t loc,
>/* (-A) - B -> (-B) - A  where B is easily negated and we can swap.
> */
>if (TREE_CODE (arg0) == NEGATE_EXPR
>   && negate_expr_p (arg1)
>   && reorder_operands_p (arg0, arg1))
> return fold_build2_loc (loc, MINUS_EXPR, type,
> fold_convert_loc (loc, type,
>   negate_expr (arg1)),
> fold_convert_loc (loc, type,
>   TREE_OPERAND (arg0, 0)));
>/* Convert -A - 1 to ~A.  */
> -  if (TREE_CODE (type) != COMPLEX_TYPE
> - && TREE_CODE (arg0) == NEGATE_EXPR
> - && integer_onep (arg1)
> +  if (TREE_CODE (arg0) == NEGATE_EXPR
> + && integer_each_onep (arg1)
>   && !TYPE_OVERFLOW_TRAPS (type))
> return fold_build1_loc (loc, BIT_NOT_EXPR, type,
> fold_convert_loc (loc, type,
>   TREE_OPERAND (arg0, 0)));
>
>/* Convert -1 - A to ~A.  */
>if (TREE_CODE (type) != COMPLEX_TYPE
>   && integer_all_onesp (arg0))
> return fold_build1_loc (loc, BIT_NOT_EXPR, type, op1);
>
> @@ -11377,20 +11376,21 @@ fold_binary_loc (location_t loc,
>/* Convert ~X ^ C to X ^ ~C.  */
>if (TREE_CODE (arg0) == BIT_NOT_EXPR
>   && TREE_CODE (arg1) == INTEGER_CST)
> return fold_build2_loc (loc, code, type,
> fold_convert_loc (loc, type,
>   TREE_OPERAND (arg0, 0)),
> fold_build1_loc (loc, BIT_NOT_EXPR, type,
> arg1));
>
>/* Fold (X & 1) ^ 1 as (X & 1) == 0.  */
>if (TREE_CODE (arg0) == BIT_AND_EXPR
> + && INTEGRAL_TYPE_P (type)
>   && integer_onep (TREE_OPERAND (arg0, 1))
>   && integer_onep (arg1))
> return fold_build2_loc (loc, EQ_EXPR, type, arg0,
> build_zero_cst (TREE_TYPE (arg0)));
>
>/* Fold (X & Y) ^ Y as ~X & Y.  */
>if (TREE_CODE (arg0) == BIT_AND_EXPR
>   && ope

Re: [ARM][tests] Make input and output arrays 128-bit aligned in vectorisation tests

2014-09-12 Thread Ramana Radhakrishnan



On 12/09/14 11:13, Richard Earnshaw wrote:

On 09/09/14 16:14, Kyrill Tkachov wrote:

Hi all,

As Christophe mentioned at
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00202.html
These tests fail on big-endian. The reason is that the input is not
aligned to 128 bit forcing the use of a movmisalign which we don't
support on big-endian.

A solution is to force the alignment of the arrays, allowing for the use
of normal loads and stores.
We can look into enabling misaligned loads on big-endian with the
appropriate reversal logic as a separate
piece of work...



So surely these tests *ought* to pass on big-endian without this change
and the fact that they don't is a bug.


The backend disables misaligned loads in neon for big endian. You can't 
do a vld1.8 on a movmisalignv2si and expect it to work correctly, can you?




So in that case, surely we should be (at most) XFAILing them, rather
than papering over a real problem.



I disagree - The tests are to detect vectorization of lceilf, lfloor and 
all the other functions.


So until the backend is fixed, I asked for the tests to be changed to 
test what they are supposed to - vectorizing calls to these builtin 
functions which is more useful than XFAILing these.
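
For reference, the shape the tests take after the change can be sketched as follows. This is a simplified stand-in, not one of the actual test files: the loop body uses a plain float-to-int conversion instead of the rounding builtins, and is_16_byte_aligned is a hypothetical helper added only to show what the attribute guarantees.

```c
#include <stdint.h>

#define N 32

/* Global arrays with 16-byte (128-bit) alignment, as in the patch.  */
float __attribute__((aligned(16))) input[N];
int __attribute__((aligned(16))) output[N];

/* Stand-in for the vectorizable loop.  The real tests call rounding
   builtins here; a plain conversion keeps the sketch free of libm.  */
void
foo (void)
{
  for (int i = 0; i < N; i++)
    output[i] = (int) input[i];
}

/* Return nonzero if P is a multiple of 16 bytes.  */
int
is_16_byte_aligned (const void *p)
{
  return ((uintptr_t) p % 16) == 0;
}
```

With pointer parameters, as in the old signature, the compiler cannot assume this alignment for the accesses, whereas it is known for the global arrays.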


regards
Ramana



R.


Ok for trunk?

2014-09-09  Kyrylo Tkachov  

  * gcc.target/arm/vect-lceilf_1.c: Make input and output arrays global
  and 16-byte aligned.
  * gcc.target/arm/vect-lfloorf_1.c: Likewise.
  * gcc.target/arm/vect-lroundf_1.c: Likewise.
  * gcc.target/arm/vect-rounding-btruncf.c: Likewise.
  * gcc.target/arm/vect-rounding-ceilf.c: Likewise.
  * gcc.target/arm/vect-rounding-floorf.c: Likewise.
  * gcc.target/arm/vect-rounding-roundf.c: Likewise.


arm-vect-tests-align.patch


diff --git a/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c 
b/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c
index 75705ae..5e98b74 100644
--- a/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c
+++ b/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+int __attribute__((aligned(16))) output[N];
+
  void
-foo (int *output, float *input)
+foo ()
  {
int i = 0;
/* Vectorizable.  */
diff --git a/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c 
b/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c
index 298d54e..655f437 100644
--- a/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c
+++ b/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+int __attribute__((aligned(16))) output[N];
+
  void
-foo (int *output, float *input)
+foo ()
  {
int i = 0;
/* Vectorizable.  */
diff --git a/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c 
b/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c
index 6443821..92a722e 100644
--- a/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c
+++ b/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+int __attribute__((aligned(16))) output[N];
+
  void
-foo (int *output, float *input)
+foo ()
  {
int i = 0;
/* Vectorizable.  */
diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c 
b/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c
index 5616837..372ddc5 100644
--- a/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c
+++ b/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+float __attribute__((aligned(16))) output[N];
+
  void
-foo (float *output, float *input)
+foo ()
  {
int i = 0;
/* Vectorizable.  */
diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c 
b/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c
index cb8f1d5..3c786d4 100644
--- a/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c
+++ b/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+float __attribute__((aligned(16))) output[N];
+
  void
-foo (float *output, float *input)
+foo ()
  {
int i = 0;
/* Vectorizable.  */
diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c 
b/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c
index bf68af7..eedb295 100644
--- a/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c
+++ b/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+float __attribute__((aligned(16))) output[N];
+
  void
-foo (float *output, float *input)
+foo ()
  {
int i = 0;
/* Vectorizable.  */
diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c 
b/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c
index 7c0a1b4..360b2b9 100644
--- a/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c
+++ b/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c
@@ -5,8 +5,11 @@

  #define N 32

+float __attribute__((aligned(16))) input[N];
+float __attribute__((aligned(16))) ou

Re: [PATCH] gcc parallel make check

2014-09-12 Thread Jonathan Wakely

On 12/09/14 09:47 +, VandeVondele  Joost wrote:

a newer patch (v8) I'll send soon


attached with updated changelog. Compared to the previously posted v6, only 
libstdc++-v3/testsuite/Makefile.am has been refined to split the e*/* pattern 
a little further, and two quickly running goals have been merged, in addition to 
fixing the pre-existing error in some of the patterns in that file.


The libstdc++ part is OK, thanks.



Re: [ARM][tests] Make input and output arrays 128-bit aligned in vectorisation tests

2014-09-12 Thread Richard Earnshaw
On 09/09/14 16:14, Kyrill Tkachov wrote:
> Hi all,
> 
> As Christophe mentioned at 
> https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00202.html
> These tests fail on big-endian. The reason is that the input is not 
> aligned to 128 bit forcing the use of a movmisalign which we don't 
> support on big-endian.
> 
> A solution is to force the alignment of the arrays, allowing for the use 
> of normal loads and stores.
> We can look into enabling misaligned loads on big-endian with the 
> appropriate reversal logic as a separate
> piece of work...
> 

So surely these tests *ought* to pass on big-endian without this change
and the fact that they don't is a bug.

So in that case, surely we should be (at most) XFAILing them, rather
than papering over a real problem.

R.

> Ok for trunk?
> 
> 2014-09-09  Kyrylo Tkachov  
> 
>  * gcc.target/arm/vect-lceilf_1.c: Make input and output arrays global
>  and 16-byte aligned.
>  * gcc.target/arm/vect-lfloorf_1.c: Likewise.
>  * gcc.target/arm/vect-lroundf_1.c: Likewise.
>  * gcc.target/arm/vect-rounding-btruncf.c: Likewise.
>  * gcc.target/arm/vect-rounding-ceilf.c: Likewise.
>  * gcc.target/arm/vect-rounding-floorf.c: Likewise.
>  * gcc.target/arm/vect-rounding-roundf.c: Likewise.
> 
> 
> arm-vect-tests-align.patch
> 
> 
> diff --git a/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c 
> b/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c
> index 75705ae..5e98b74 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-lceilf_1.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +int __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (int *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> diff --git a/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c 
> b/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c
> index 298d54e..655f437 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-lfloorf_1.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +int __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (int *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> diff --git a/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c 
> b/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c
> index 6443821..92a722e 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-lroundf_1.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +int __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (int *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c 
> b/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c
> index 5616837..372ddc5 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-rounding-btruncf.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +float __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (float *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c 
> b/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c
> index cb8f1d5..3c786d4 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-rounding-ceilf.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +float __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (float *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c 
> b/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c
> index bf68af7..eedb295 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-rounding-floorf.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +float __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (float *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> diff --git a/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c 
> b/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c
> index 7c0a1b4..360b2b9 100644
> --- a/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c
> +++ b/gcc/testsuite/gcc.target/arm/vect-rounding-roundf.c
> @@ -5,8 +5,11 @@
>  
>  #define N 32
>  
> +float __attribute__((aligned(16))) input[N];
> +float __attribute__((aligned(16))) output[N];
> +
>  void
> -foo (float *output, float *input)
> +foo ()
>  {
>int i = 0;
>/* Vectorizable.  */
> 




integer_each_onep + vector bug

2014-09-12 Thread Marc Glisse

Hello,

here is a new predicate which tests if a number is 1, or (for vector and 
complex) a collection of 1. The only difference with integer_onep should 
be for complex where it wants 1+i and not 1. The main use would be in the 
match branch, so I didn't waste too much time adding many uses, I just 
added a couple (so the function won't be garbage collected in some 
refactoring) though I wasn't able to create a testcase (for complex int, 
'&' or '^' are rejected and '~' means conjugate).
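
The difference between the two predicates, as described above, can be sketched on a toy constant representation. This is not GCC's tree code: toy_cst, toy_integer_onep and toy_integer_each_onep are hypothetical names, and a complex constant is stored here as the pair (real, imag).

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_SCALAR, TOY_VECTOR, TOY_COMPLEX };

/* Toy constant: N is 1 for a scalar, 2 for a complex (real, imag),
   and the number of lanes for a vector.  */
struct toy_cst
{
  enum toy_kind kind;
  size_t n;
  const long *elts;
};

/* True if every component is 1, so a complex constant must be 1+i.  */
bool
toy_integer_each_onep (const struct toy_cst *c)
{
  for (size_t i = 0; i < c->n; i++)
    if (c->elts[i] != 1)
      return false;
  return c->n > 0;
}

/* True if the constant equals the number one, so a complex constant
   must be 1+0i.  Scalars and vectors behave as in the predicate above.  */
bool
toy_integer_onep (const struct toy_cst *c)
{
  if (c->kind == TOY_COMPLEX)
    return c->n == 2 && c->elts[0] == 1 && c->elts[1] == 0;
  return toy_integer_each_onep (c);
}
```

In this model the two predicates agree on scalars and vectors and disagree only on the complex constants 1+i and 1+0i.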


While looking for potential uses for integer_each_onep, I couldn't help 
noticing a few wrong optimizations for vectors, that I am fixing at the 
same time.
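
One plausible way to see why folding (X & 1) ^ 1 into (X & 1) == 0 is not valid for vectors is to compare the two forms with the GNU C vector extensions, where a vector comparison produces -1 (all bits set) rather than 1 in each true lane. A minimal sketch, assuming a compiler that supports vector_size; the function names are hypothetical:

```c
#include <stdint.h>

typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Lane 0 of (X & 1) ^ 1, where X has X0 in lane 0.  */
int32_t
xor_form_lane0 (int32_t x0)
{
  const v4si one = { 1, 1, 1, 1 };
  v4si x = { x0, 0, 0, 0 };
  v4si r = (x & one) ^ one;
  return r[0];
}

/* Lane 0 of (X & 1) == 0, where X has X0 in lane 0.  */
int32_t
cmp_form_lane0 (int32_t x0)
{
  const v4si one = { 1, 1, 1, 1 };
  const v4si zero = { 0, 0, 0, 0 };
  v4si x = { x0, 0, 0, 0 };
  v4si r = ((x & one) == zero);
  return r[0];
}
```

For an even lane value the first form gives 1 while the second gives -1, so the two expressions are not interchangeable lane by lane.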


Bootstrap+testsuite on x86_64-linux-gnu.


2014-09-12  Marc Glisse  

gcc/
* tree.c (integer_each_onep): New function.
* tree.h (integer_each_onep): Declare it.
* fold-const.c (fold_binary_loc): Use it for ~A + 1 to -A and
-A - 1 to ~A.  Disable (X & 1) ^ 1, (X ^ 1) & 1 and ~X & 1 to
(X & 1) == 0 for vector and complex.
gcc/testsuite/
* gcc.dg/vec-andxor1.c: New file.


--
Marc Glisse

Index: fold-const.c
===
--- fold-const.c(revision 215179)
+++ fold-const.c(working copy)
@@ -10085,21 +10085,21 @@ fold_binary_loc (location_t loc,
  && (flag_sanitize & SANITIZE_SI_OVERFLOW) == 0)
return fold_build2_loc (loc, MINUS_EXPR, type,
fold_convert_loc (loc, type, arg1),
fold_convert_loc (loc, type,
  TREE_OPERAND (arg0, 0)));
 
   if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
{
  /* Convert ~A + 1 to -A.  */
  if (TREE_CODE (arg0) == BIT_NOT_EXPR
- && integer_onep (arg1))
+ && integer_each_onep (arg1))
return fold_build1_loc (loc, NEGATE_EXPR, type,
fold_convert_loc (loc, type,
  TREE_OPERAND (arg0, 0)));
 
  /* ~X + X is -1.  */
  if (TREE_CODE (arg0) == BIT_NOT_EXPR
  && !TYPE_OVERFLOW_TRAPS (type))
{
  tree tem = TREE_OPERAND (arg0, 0);
 
@@ -10612,23 +10612,22 @@ fold_binary_loc (location_t loc,
   /* (-A) - B -> (-B) - A  where B is easily negated and we can swap.  */
   if (TREE_CODE (arg0) == NEGATE_EXPR
  && negate_expr_p (arg1)
  && reorder_operands_p (arg0, arg1))
return fold_build2_loc (loc, MINUS_EXPR, type,
fold_convert_loc (loc, type,
  negate_expr (arg1)),
fold_convert_loc (loc, type,
  TREE_OPERAND (arg0, 0)));
   /* Convert -A - 1 to ~A.  */
-  if (TREE_CODE (type) != COMPLEX_TYPE
- && TREE_CODE (arg0) == NEGATE_EXPR
- && integer_onep (arg1)
+  if (TREE_CODE (arg0) == NEGATE_EXPR
+ && integer_each_onep (arg1)
  && !TYPE_OVERFLOW_TRAPS (type))
return fold_build1_loc (loc, BIT_NOT_EXPR, type,
fold_convert_loc (loc, type,
  TREE_OPERAND (arg0, 0)));
 
   /* Convert -1 - A to ~A.  */
   if (TREE_CODE (type) != COMPLEX_TYPE
  && integer_all_onesp (arg0))
return fold_build1_loc (loc, BIT_NOT_EXPR, type, op1);
 
@@ -11377,20 +11376,21 @@ fold_binary_loc (location_t loc,
   /* Convert ~X ^ C to X ^ ~C.  */
   if (TREE_CODE (arg0) == BIT_NOT_EXPR
  && TREE_CODE (arg1) == INTEGER_CST)
return fold_build2_loc (loc, code, type,
fold_convert_loc (loc, type,
  TREE_OPERAND (arg0, 0)),
fold_build1_loc (loc, BIT_NOT_EXPR, type, arg1));
 
   /* Fold (X & 1) ^ 1 as (X & 1) == 0.  */
   if (TREE_CODE (arg0) == BIT_AND_EXPR
+ && INTEGRAL_TYPE_P (type)
  && integer_onep (TREE_OPERAND (arg0, 1))
  && integer_onep (arg1))
return fold_build2_loc (loc, EQ_EXPR, type, arg0,
build_zero_cst (TREE_TYPE (arg0)));
 
   /* Fold (X & Y) ^ Y as ~X & Y.  */
   if (TREE_CODE (arg0) == BIT_AND_EXPR
  && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
{
  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
@@ -11487,33 +11487,35 @@ fold_binary_loc (location_t loc,
  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 1)))
return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 1));
   /* X & (Y | X) is (Y, X).  */
   if (TREE_CODE (arg1) == BIT_IOR_EXPR
  && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
 
   /*

Re: [wwwdocs] Buildstat update for 4.8

2014-09-12 Thread Gerald Pfeifer
On Fri, 5 Sep 2014, Tom G. Christensen wrote:
> No new testresults just a small fixup for the update from Raghunath
> Lolur.
> His aarch64 build was cross built though the testsuite seems to have
> been run using qemu (host=target).
> I'm uncertain if this qualifies for mentioning the build host in the
> second column, but for now the just posted 4.9 update adds it for his
> recent build and so does this patch.

This is a bit of a special case indeed, but I went with your
recommendation.  Good catch.

Gerald

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/buildstat.html,v
retrieving revision 1.11
diff -u -r1.11 buildstat.html
--- buildstat.html  25 Aug 2014 07:42:07 -  1.11
+++ buildstat.html  5 Sep 2014 14:54:09 -
@@ -24,9 +24,9 @@
 
 
 aarch64-unknown-linux-gnu
- 
+i686-pc-linux-gnu
 Test results:
-https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01057.html";>4.8.3
+https://gcc.gnu.org/ml/gcc-testresults/2014-07/msg01057.html";>4.8.3,
 https://gcc.gnu.org/ml/gcc-testresults/2014-03/msg02123.html";>4.8.2
 
 


RE: [PATCH] gcc parallel make check

2014-09-12 Thread VandeVondele Joost
> a newer patch (v8) I'll send soon

attached with updated changelog. Compared to the previously posted v6, only 
libstdc++-v3/testsuite/Makefile.am has been refined to split the e*/* pattern 
a little further, and two quickly running goals have been merged, in addition to 
fixing the pre-existing error in some of the patterns in that file.

Checked by comparing testsuite results before and after.

Obviously, if Jakub's patch can be made to work around the testsuite special 
cases, I believe it should be superior. If not, the attached patch is working 
as far as I can tell, and provides a significant improvement over current trunk.
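
A property worth checking for such a split is that every test directory lands in exactly one job. The following is a minimal sketch, not part of the patch: it encodes only the glob patterns visible in the Makefile.in hunk for normal0 to normal15 (the real normal0 rule may cover more) and uses POSIX fnmatch. The helper names are hypothetical, and which directories to feed it is up to the caller.

```c
#include <fnmatch.h>
#include <stddef.h>

#define MAX_PATTERNS 5

/* Patterns per job, copied from the new version of the hunk.  Unused
   slots are NULL and terminate each list.  */
static const char *const buckets[][MAX_PATTERNS] = {
  /* normal0  */ { "[013-9][0-9]_*/*" },
  /* normal1  */ { "experimental/*", "ext/[a-m]*" },
  /* normal2  */ { "28_*/a*" },
  /* normal3  */ { "23_*/[lu]*" },
  /* normal4  */ { "2[459]_*/*" },
  /* normal5  */ { "2[01]_*/*" },
  /* normal6  */ { "23_*/[m-tw-z]*" },
  /* normal7  */ { "26_*/*" },
  /* normal8  */ { "27_*/*" },
  /* normal9  */ { "22_*/*" },
  /* normal10 */ { "t*/*" },
  /* normal11 */ { "28_*/b*" },
  /* normal12 */ { "28_*/[c-z]*" },
  /* normal13 */ { "ext/[n-z]*" },
  /* normal14 */ { "de*/*", "p*/*", "[ab]*/*", "23_*/v*" },
  /* normal15 */ { "23_*/[a-k]*" },
};

#define N_BUCKETS (sizeof buckets / sizeof buckets[0])

/* Nonzero if DIR matches any pattern of job B.  FNM_PATHNAME keeps '*'
   from matching '/', like a shell glob.  */
static int
bucket_matches (size_t b, const char *dir)
{
  for (size_t i = 0; i < MAX_PATTERNS && buckets[b][i] != NULL; i++)
    if (fnmatch (buckets[b][i], dir, FNM_PATHNAME) == 0)
      return 1;
  return 0;
}

/* Number of jobs that would pick up DIR; 1 is the desired answer.  */
int
matching_buckets (const char *dir)
{
  int count = 0;
  for (size_t b = 0; b < N_BUCKETS; b++)
    count += bucket_matches (b, dir);
  return count;
}

/* Index of the first job that picks up DIR, or -1 if none does.  */
int
first_bucket (const char *dir)
{
  for (size_t b = 0; b < N_BUCKETS; b++)
    if (bucket_matches (b, dir))
      return (int) b;
  return -1;
}
```

Feeding every two-level directory name of the testsuite through matching_buckets and looking for results other than 1 would flag overlaps or gaps in the patterns.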

Joost

contrib/ChangeLog

2014-09-12  Joost VandeVondele  

* generate_tcl_patterns.sh: New file.

gcc/fortran/ChangeLog

2014-09-12  Joost VandeVondele  

* Make-lang.in (check_gfortran_parallelize): Improved parallelism.

gcc/ChangeLog

2014-09-12  Joost VandeVondele  

* Makefile.in (check_gcc_parallelize): Improved parallelism.
(check_p_numbers): Increase maximum value.
(dg_target_exps): Mention targets as separate words only.
(null,space,comma,dg_target_exps_p1,dg_target_exps_p2,
dg_target_exps_p3,dg_target_exps_p4): New variables.

gcc/cp/ChangeLog

2014-09-12  Joost VandeVondele  

* Make-lang.in (check_g++_parallelize): Improved parallelism.

libstdc++-v3/ChangeLog

2014-09-12  Joost VandeVondele  

* testsuite/Makefile.am (check_DEJAGNU_normal_targets): Add
check-DEJAGNUnormal[11-15].
(check-DEJAGNU): Split into 15 jobs for parallel testing, correct 
pattern.
* testsuite/Makefile.in: Regenerated.
Index: libstdc++-v3/testsuite/Makefile.in
===
--- libstdc++-v3/testsuite/Makefile.in	(revision 215147)
+++ libstdc++-v3/testsuite/Makefile.in	(working copy)
@@ -301,7 +301,7 @@ lists_of_files = \
 
 extract_symvers = $(glibcxx_builddir)/scripts/extract_symvers
 baseline_subdir := $(shell $(CXX) $(baseline_subdir_switch))
-check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10)
+check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
 
 # Runs the testsuite, but in compile only mode.
 # Can be used to test sources with non-GNU FE's at various warning
@@ -562,7 +562,7 @@ check-DEJAGNU $(check_DEJAGNU_normal_tar
 	if [ -z "$*$(filter-out --target_board=%, $(RUNTESTFLAGS))" ] \
 	&& [ "$(filter -j, $(MFLAGS))" = "-j" ]; then \
 	  $(MAKE) $(AM_MAKEFLAGS) $(check_DEJAGNU_normal_targets); \
-	  for idx in 0 1 2 3 4 5 6 7 8 9 10; do \
+	  for idx in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do \
 	mv -f normal$$idx/libstdc++.sum normal$$idx/libstdc++.sum.sep; \
 	mv -f normal$$idx/libstdc++.log normal$$idx/libstdc++.log.sep; \
 	  done; \
@@ -589,25 +589,35 @@ check-DEJAGNU $(check_DEJAGNU_normal_tar
 	fi; \
 	dirs="`cd $$srcdir; echo [013-9][0-9]_*/*`";; \
 	  normal1) \
-	dirs="`cd $$srcdir; echo [ab]* de* [ep]*/*`";; \
+	dirs="`cd $$srcdir; echo experimental/* ext/[a-m]*`";; \
 	  normal2) \
-	dirs="`cd $$srcdir; echo 2[01]_*/*`";; \
+	dirs="`cd $$srcdir; echo 28_*/a*`";; \
 	  normal3) \
-	dirs="`cd $$srcdir; echo 22_*/*`";; \
+	dirs="`cd $$srcdir; echo 23_*/[lu]*`";; \
 	  normal4) \
-	dirs="`cd $$srcdir; echo 23_*/[a-km-tw-z]*`";; \
+	dirs="`cd $$srcdir; echo 2[459]_*/*`";; \
 	  normal5) \
-	dirs="`cd $$srcdir; echo 23_*/[luv]*`";; \
+	dirs="`cd $$srcdir; echo 2[01]_*/*`";; \
 	  normal6) \
-	dirs="`cd $$srcdir; echo 2[459]_*/*`";; \
+	dirs="`cd $$srcdir; echo 23_*/[m-tw-z]*`";; \
 	  normal7) \
-	dirs="`cd $$srcdir; echo 26_*/* 28_*/[c-z]*`";; \
+	dirs="`cd $$srcdir; echo 26_*/*`";; \
 	  normal8) \
 	dirs="`cd $$srcdir; echo 27_*/*`";; \
 	  normal9) \
-	dirs="`cd $$srcdir; echo 28_*/[ab]*`";; \
+	dirs="`cd $$srcdir; echo 22_*/*`";; \
 	  normal10) \
 	dirs="`cd $$srcdir; echo t*/*`";; \
+	  normal11) \
+	dirs="`cd $$srcdir; echo 28_*/b*`";; \
+	  normal12) \
+	dirs="`cd $$srcdir; echo 28_*/[c-z]*`";; \
+	  normal13) \
+	dirs="`cd $$srcdir; echo ext/[n-z]*`";; \
+	  normal14) \
+	dirs="`cd $$srcdir; echo de*/* p*/* [ab]*/* 23_*/v*`";; \
+	  normal15) \
+	dirs="`cd $$srcdir; echo 23_*/[a-k]*`";; \
 	esac; \
 	if [ -n "$*" ]; then cd "$*"; fi; \
 	if $(SHELL) -c "$$runtest --version" > /dev/null 2>&1; then \
Index: libstdc++-v3/testsuite/Makefile.am
===
--- libstdc++-v3/testsuite/Makefile.am	(revision 215147)
+++ libstdc++-v3/testsuite/Makefile.am	(working copy)
@@ -101,7 +101,7 @@ new-abi-baseline:
 	@test ! -f $*/site.exp || mv $*/site.exp $*/site.bak
 	@mv $*/site.exp.tmp $*/site.exp
 
-check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10)
+check_DEJAGNU_normal_targets = $(patsubst %,check-DEJAGNUnormal%,0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1

[COMMITTED][PATCH] Fix register corruption bug in ree

2014-09-12 Thread Jiong Wang

On 11/09/14 21:10, Jeff Law wrote:


On 09/08/14 11:21, Wilco Dijkstra wrote:

Anyway here is the modified check:

Thanks.  Just needs an updated ChangeLog and it's good for the trunk.


Committed to trunk on behalf of Wilco as r215205 after it passed bootstrapping.


2014-09-12  Wilco Dijkstra  

   * gcc/ree.c (combine_reaching_defs): Ensure inserted copy don't change
   the number of hard registers.

Index: gcc/ree.c
===
--- gcc/ree.c   (revision 215204)
+++ gcc/ree.c   (working copy)
@@ -743,6 +743,14 @@
   if (!SCALAR_INT_MODE_P (GET_MODE (SET_DEST (PATTERN (cand->insn)))))
return false;
 
+  enum machine_mode dst_mode = GET_MODE (SET_DEST (PATTERN (cand->insn)));

+  rtx src_reg = get_extended_src_reg (SET_SRC (PATTERN (cand->insn)));
+
+  /* Ensure the number of hard registers of the copy match.  */
+  if (HARD_REGNO_NREGS (REGNO (src_reg), dst_mode)
+ != HARD_REGNO_NREGS (REGNO (src_reg), GET_MODE (src_reg)))
+   return false;
+
   /* There's only one reaching def.  */
   rtx_insn *def_insn = state->defs_list[0];
 
@@ -792,7 +800,7 @@

   start_sequence ();
   rtx pat = PATTERN (cand->insn);
   rtx new_dst = gen_rtx_REG (GET_MODE (SET_DEST (pat)),
- REGNO (XEXP (SET_SRC (pat), 0)));
+ REGNO (get_extended_src_reg (SET_SRC (pat))));
   rtx new_src = gen_rtx_REG (GET_MODE (SET_DEST (pat)),
  REGNO (SET_DEST (pat)));
   emit_move_insn (new_dst, new_src);
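
The check added in the first hunk can be illustrated with a small standalone model. This is only a sketch, not GCC code: `kWordBits`, `hard_regno_nregs_model` and `copy_preserves_reg_count` are hypothetical names, and the model assumes every hard register is 32 bits wide, whereas the real HARD_REGNO_NREGS is target- and register-specific.

```cpp
#include <cassert>

// Hypothetical model: every hard register holds kWordBits bits.
// The real HARD_REGNO_NREGS macro depends on the target and the register.
const unsigned kWordBits = 32;

// Number of consecutive hard registers needed for a value of mode_bits bits.
inline unsigned hard_regno_nregs_model(unsigned mode_bits) {
  assert(mode_bits > 0);
  return (mode_bits + kWordBits - 1) / kWordBits;
}

// Mirrors the shape of the new test: the copy is only allowed when the
// source register occupies the same number of hard registers in the
// destination mode as it does in its own mode.
inline bool copy_preserves_reg_count(unsigned src_mode_bits,
                                     unsigned dst_mode_bits) {
  return hard_regno_nregs_model(dst_mode_bits)
         == hard_regno_nregs_model(src_mode_bits);
}
```

In this model, extending an 8-bit value to 32 bits stays within one register and passes, while extending a 32-bit value to 64 bits would occupy a second register and is rejected, which is the kind of case the patch makes combine_reaching_defs give up on.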







[PATCH] Fix PR63237

2014-09-12 Thread Richard Biener

The following fixes a missed gimplification of strlen in the
GIMPLE strlen folding variant.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2014-09-12  Richard Biener  

PR middle-end/63237
* gimple-fold.c (get_maxval_strlen): Gimplify string length.

* g++.dg/torture/pr63237.C: New testcase.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 215203)
+++ gcc/gimple-fold.c   (working copy)
@@ -2411,6 +2411,7 @@ gimple_fold_builtin_strlen (gimple_stmt_
   tree len = get_maxval_strlen (gimple_call_arg (stmt, 0), 0);
   if (!len)
 return false;
+  len = force_gimple_operand_gsi (gsi, len, true, NULL, true, GSI_SAME_STMT);
   replace_call_with_value (gsi, len);
   return true;
 }
Index: gcc/testsuite/g++.dg/torture/pr63237.C
===
--- gcc/testsuite/g++.dg/torture/pr63237.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr63237.C  (working copy)
@@ -0,0 +1,21 @@
+// { dg-do compile }
+
+class A {
+int Length;
+public:
+A(const char *p1) { Length = __builtin_strlen(p1); }
+};
+class B {
+public:
+void m_fn1(int, A);
+};
+class C {
+public:
+B &m_fn2();
+};
+int a;
+void RewriteMacrosInInput() {
+C b;
+B &c = b.m_fn2();
+c.m_fn1(0, &""[a]);
+}


Re: [PATCH] S/390: PR62662 Fix r14 restore optimization

2014-09-12 Thread Jakub Jelinek
On Wed, Sep 10, 2014 at 10:44:39AM +0200, Andreas Krebbel wrote:
> this fixes a problem with the prologue optimization in machine
> dependent reorg.  For more details please see PR62662.
> 
> No regressions on s390 and s390x.

Are you going to commit these two patches?

Jakub


RE: [wwwdocs] Mention Cilk Plus support

2014-09-12 Thread Zamyatin, Igor
> > On Wed, 10 Sep 2014, Zamyatin, Igor wrote:
> > >> + Complete support for <a href="http://cilk.org">Cilk
> > >> +Plus</a> features was added to GCC
> >> + [2014-09-02]
> 
> features *were* added, plural.

It's "support for features" thus singular :)

> 
> >> + Contributed by Jakub Jelinek, Iyer Balaji and Igor
> >> Zamyatin.
> 
> What?  No Aldy Hernandez?  I spent the better part of a year working on
> this.  Although perhaps it's better if no one remembers that ;-).

My apologies, Aldy! :) 
Please see the new version:

Index: htdocs/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.936
diff -p -r1.936 index.html
*** htdocs/index.html   14 Aug 2014 09:04:23 -  1.936
--- htdocs/index.html   12 Sep 2014 08:21:54 -
*** mission statement.
*** 52,57 
--- 52,62 
 
  
 
+ Cilk Plus support in GCC
+ [2014-09-02]
+ Complete support for <a href="http://cilk.org">Cilk Plus</a> features was added to GCC.
+ Contributed by Jakub Jelinek, Aldy Hernandez, Iyer Balaji and Igor Zamyatin.
+
  New GCC version numbering scheme announced
  [2014-08-13]
  


Thanks,
Igor


> 
> Aldy
> >
> > can you please make this "Cilk Plus support" or "Full Cilk Plus support"
> > in the title, and then use the current title as the first part of the
> > more detailed description?  (You can use the latest AVX announcement
> > as an example.)
> >
> > Gerald


Re: [wwwdocs] Buildstat update for 4.9

2014-09-12 Thread Gerald Pfeifer
On Thu, 4 Sep 2014, Raghunath Lolur wrote:
> Please find an update of test results for 4.9 as below:

Thank you, added.

Gerald


[PING] [PATCH] longlong.h: Add prototype for udiv_w_sdiv

2014-09-12 Thread Stefan Liebler

Hi,

the patch from Andreas Krebbel 
(https://gcc.gnu.org/ml/gcc-patches/2014-02/msg00194.html) adds a 
prototype for __udiv_w_sdiv to longlong.h if needed.


This fixes a build failure of glibc on s390 31 bit.
(see "Re: [PATCH] Turn implict-function-declaration warnings into 
errors", https://www.sourceware.org/ml/libc-alpha/2014-09/msg00264.html)


Please review Andreas Krebbel's patch and give the okay to commit.

Bye
Stefan



Re: Remove LIBGCC2_HAS_?F_MODE target macros

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 3:22 AM, Joseph S. Myers
 wrote:
> This patch removes the LIBGCC2_HAS_{SF,DF,XF,TF}_MODE target macros,
> replacing them by predefines with -fbuilding-libgcc, together with a
> target hook that can influence those predefines when needed.
>
> The new default is that a floating-point mode is supported in libgcc
> if (a) it passes the scalar_mode_supported_p hook (otherwise it's not
> plausible for it to be supported in libgcc) and (b) it's one of those
> four modes (since those are the modes for which libgcc hardcodes the
> possibility of support).  The target hook can override the default
> choice (in either direction) for modes that pass
> scalar_mode_supported_p (although overriding in the direction of
> returning true when the default would return false only makes sense if
> all relevant functions are specially defined in libgcc for that
> particular target).
>
> The previous default settings depended on various settings such as
> LIBGCC2_LONG_DOUBLE_TYPE_SIZE, as well as targets defining the above
> target macros if the default wasn't correct.
>
> The default scalar_mode_supported_p only declares a floating-point
> mode to be supported if it matches one of float / double / long
> double.  This means that in most cases where a mode is only supported
> conditionally in libgcc (TFmode only supported if it's the mode of
> long double, most commonly), the default gets things right.  Overrides
> were needed in the following cases:
>
> * SFmode would always have been supported in libgcc (the condition was
>   BITS_PER_UNIT == 8, true for all current targets), but pdp11
>   defaults to 64-bit float, and in that case SFmode would fail
>   scalar_mode_supported_p.  I don't know if libgcc actually built for
>   pdp11 (and the port may well no longer be being used), but this
>   patch adds a scalar_mode_supported_p hook to it to ensure SFmode is
>   treated as supported.
>
> * Certain i386 and ia64 targets need the new hook to match the
>   existing cases for when XFmode or TFmode support is present in
>   libgcc.  For i386, the hook can always declare XFmode to be
>   supported - the cases where it's not are the cases where long double
>   is TFmode, in which case XFmode fails scalar_mode_supported_p[*] -
>   but TFmode support needs to be conditional.  (And of the targets not
>   defining LIBGCC2_HAS_TF_MODE before this patch, some defined
>   LONG_DOUBLE_TYPE_SIZE to 64, so ensuring LIBGCC2_HAS_TF_MODE would
>   always be false, while others did not define it, so allowing it to
>   be true in the -mlong-double-128 case.  This patch matches that
>   logic, although I suspect all the latter targets would have been
>   broken if you tried to enable -mlong-double-128 by default, for lack
>   of the soft-fp TFmode support in libgcc, which is separately
>   configured.)
>
>   [*] I don't know if it's deliberate not to support __float80 at all
>   with -mlong-double-128.
>
> In order to implement the default version of the new hook,
> insn-modes.h was made to contain macros such as HAVE_TFmode for each
> machine mode, so the default hook can contain conditionals on whether
> XFmode and TFmode exist (to match the hardcoding of a list of modes in
> libgcc).  This is also used in fortran/trans-types.c; previously it
> had a conditional on defined(LIBGCC2_HAS_TF_MODE) (a bit dubious,
> since it ignored the value of the macro), which is replaced by testing
> defined(HAVE_TFmode), in conjunction with requiring
> targetm.libgcc_floating_mode_supported_p.
>
> (Fortran is testing something stronger than that hook: not only is
> libgcc support required, but also libm or equivalent.  Thus, it has a
> test for ENABLE_LIBQUADMATH_SUPPORT in the case that the mode is
> TFmode and that's not the same as any of the three standard types.
> The old and new tests are intended to accept exactly the same set of
> modes for all targets.)
>
> Apart from the four target macros eliminated by this patch, it gets us
> closer to eliminating LIBGCC2_LONG_DOUBLE_TYPE_SIZE as well, though a
> few more places using that macro need changing first.
>
> Bootstrapped with no regressions on x86_64-unknown-linux-gnu; also
> built cc1 for crosses to ia64-elf and pdp11-none as a minimal test of
> changes for those targets.  OK to commit?
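
The default rule described in the quoted message (a mode is supported in libgcc if it passes scalar_mode_supported_p and is one of the four hardcoded modes) can be sketched as a small predicate. This is a minimal model, not the targhooks.c implementation: `FloatMode`, `ModePredicate` and `default_libgcc_mode_supported` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <functional>

// The four floating-point modes for which libgcc hardcodes possible support,
// plus a catch-all for anything else.
enum class FloatMode { SF, DF, XF, TF, Other };

// Hypothetical stand-in for a target's scalar_mode_supported_p hook.
using ModePredicate = std::function<bool(FloatMode)>;

// Default rule as described: supported in libgcc when the mode (a) passes
// scalar_mode_supported_p and (b) is one of SF, DF, XF, TF.
inline bool default_libgcc_mode_supported(FloatMode mode,
                                          const ModePredicate &scalar_ok) {
  assert(static_cast<bool>(scalar_ok));  // a predicate must be supplied
  if (!scalar_ok(mode))
    return false;
  switch (mode) {
    case FloatMode::SF:
    case FloatMode::DF:
    case FloatMode::XF:
    case FloatMode::TF:
      return true;
    default:
      return false;
  }
}
```

A target override, as the message describes, would replace this default for modes that pass scalar_mode_supported_p, for example to make TFmode support conditional.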

Ok.

Thanks,
Richard.

> gcc:
> 2014-09-11  Joseph Myers  
>
> * target.def (libgcc_floating_mode_supported_p): New hook.
> * targhooks.c (default_libgcc_floating_mode_supported_p): New
> function.
> * targhooks.h (default_libgcc_floating_mode_supported_p): Declare.
> * doc/tm.texi.in (LIBGCC2_HAS_DF_MODE, LIBGCC2_HAS_XF_MODE)
> (LIBGCC2_HAS_TF_MODE): Remove.
> (TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): New @hook.
> * doc/tm.texi: Regenerate.
> * genmodes.c (emit_insn_modes_h): Define HAVE_%smode for each
> machine mode.
> * system.h (LIBGCC2_HAS_SF_MODE, LIBGCC2_HAS_DF_MODE)
> (LIBGCC2_HAS_XF_MODE, LIBGCC

Re: [debug-early] reuse variable DIEs and fix their context

2014-09-12 Thread Richard Biener
On Fri, Sep 12, 2014 at 2:51 AM, Aldy Hernandez  wrote:
> On 09/09/14 02:16, Richard Biener wrote:
>>
>> On Tue, Sep 9, 2014 at 2:00 AM, Aldy Hernandez  wrote:
>>>
>>> On 09/05/14 02:00, Richard Biener wrote:
>
>
>>> What I have in mind is:
>>>
>>> 1. Move the FE specific things that come before the call to
>>> finalize_compilation_unit currently in each LANG_HOOKS_WRITE_GLOBALS
>>> into the FE proper (lang_hooks.parse_file).  This may or may not mean
>>> calling {wrapup,check}_global_declarations directly from the FEs since
>>> some FEs call these in a sufficiently different order to merit everyone
>>> doing their own thing (not sure though).
>
>
> Done.
>
>>>
>>> 2. Generate debug information by gathering the list of globals with
>>> lang_hooks.decls.getdecls (??) and then doing
>>> debug_hooks->early_global_decl() as discussed.
>>
>>
>> Or move that also to lang_hooks.parse_file?  ISTR
>> lang_hooks.decls.getdecls
>> is sort of an "alternative" hook to write_global_declarations that is only
>> used by the generic implementation of write_global_declarations.
>
>
> Done.
>
>>
>> So if we move everything else but calling debug_hooks->early_global_decl
>> ()
>> out of the write_global_declarations langhook then we could indeed
>> remove that hook and implement getdecls everywhere.
>>
>> I suppose one of the hooks should go in the end.
>>
>>> 2. Call finalize_compilation_unit() directly from compile_file().
>
>
> Done.
>
>>
>> Great!
>>
>>> 3. Call some (new) hook for C++ stuff after finalize_compilation_unit
>>> (???).
>>
>>
>> Or fix the C++ stuff to work properly in a symtab way?  I suppose as
>> an intermediate step adding a new langhook for this on the branch is ok
>> but I'd rather not get that merged into trunk.
>
>
> Done.  For now I've called it LANG_HOOK_POST_COMPILATION_PARSING_CLEANUPS,
> and it is only applicable to C++, unless some other FE acts up in the
> process and needs similar massaging.
>
>> Maybe Jason can help cleaning this up.
>
>
> Jason's not much of a beer drinker AFAICT, so I'm trying to come up with a
> suitable bribe.

I'm sure you can come up with something ;)

>>
>>> 4. FOR_EACH_DEFINED_SYMBOL (node)
>>>   debug_hooks->late_global_decl (node->decl)
>>>
>>> as suggested.
>
>
> Done.
>
> [Well... as DONE as a prototype can be :).  This is a work in progress, but
> I'd like y'all to peek at it, to make sure I'm not making obvious wrong
> turns that will have me rewriting code months from now, and hating you in
> the process.  And by you, I mean Jason *and* you.  I don't want anyone to
> feel left out by my frustration and anger.]
>
> I drafted what I want Ada, Java, Fortran, and Go to look like (as well as
> the obvious C/C++ languages).
>
> For C, guality.exp exhibits fewer failures than mainline.  I'm currently
> debugging inline virtual C++ destructors.  It seems the inliner can also
> generate debugging info (via debug_hooks->outlining_inline_function). The
> rest of the languages are tested as far as building jc1/f951/go1 with no
> warnings :-))).
>
> There are various cleanups and comments along the way.
>
> Let me know if you're "mostly" OK with this, so I can push this to the
> branch and continue iterating with you incrementally.  It seems there will
> be no shortage of weird bugs in dwarf generation due to the fact that we
> stream early.  I'm hoping to start concentrating on those...

Yeah, it looks very good.  Let's cross fingers that it'll work ;)
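
The ordering of the steps discussed in the quoted plan can be sketched as a toy trace. This is a hypothetical model, not GCC code: `Decl` and `compile_file_model` are invented for illustration, and the strings merely name the phases mentioned in the thread (parse_file, early_global_decl, finalize_compilation_unit, the post-compilation cleanup hook, late_global_decl).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a declaration: a name and whether it is defined.
struct Decl {
  std::string name;
  bool defined;
};

// Records the order of the phases discussed in the thread; not GCC code.
inline std::vector<std::string>
compile_file_model(const std::vector<Decl> &globals) {
  std::vector<std::string> trace;
  trace.push_back("parse_file");                 // FE-specific work lives here
  for (const Decl &d : globals)                  // early debug for every global
    trace.push_back("early_global_decl:" + d.name);
  trace.push_back("finalize_compilation_unit");  // called from compile_file
  trace.push_back("post_compilation_cleanups");  // C++-only hook in the draft
  for (const Decl &d : globals)                  // late debug, defined symbols only
    if (d.defined)
      trace.push_back("late_global_decl:" + d.name);
  assert(trace.size() >= 3);  // the three fixed phases are always present
  return trace;
}
```

The point of the sketch is only the sequencing: early debug information is produced for all globals before the compilation unit is finalized, and the late pass revisits only symbols that ended up defined.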

Thanks,
Richard.

> As usual, thanks.
> Aldy


Re: ptx preliminary rtl patches [3/4]

2014-09-12 Thread Richard Biener
On Thu, Sep 11, 2014 at 6:36 PM, Bernd Schmidt  wrote:
> On 09/11/2014 06:34 PM, Steven Bosscher wrote:
>>
>> On Thu, Sep 11, 2014 at 6:19 PM, Bernd Schmidt wrote:
>
> What do you expect that function to do differently? It returns the correct
> value.
>

 No different. Just that if you want to check whether DECL is a global
 variable then we have a predicate for it. So why use TREE_STATIC
 instead?

 In other words: Just trying to make/keep certain checks consistent. (A
 hopeless cause, but a noble one... ;-)
>>>
>>>
>>>
>>> You're talking about a different patch here. This one is about
>>> num_sign_bit_copies.
>>
>>
>>
>> Ah. *sigh* can't even keep two patches in my mind at any one time.
>>
>> The point about num_sign_bit_copies is that it doesn't really return
>> the correct value IMHO, if there isn't really a correct value to speak
>> of: What is the sign of TRUE or FALSE, the only two values a BImode
>> value can take?
>>
>> A 1-bit precision integer can have value 0 or -1 and in that case
>> num_sign_bit_copies should be 0. But for a BImode value, it seems to
>> me that asking for the sign bit or sign bit copies is just wrong.
>
>
> I strongly disagree. It's the same as for any other integer - there's one
> sign bit, and since there aren't any other bits, the number of sign bit
> copies is always exactly 1.

I agree about that.  But I fail to see what goes wrong with the existing
code in combine.  Maybe the code simply doesn't work for
GET_MODE_PRECISION != GET_MODE_BITSIZE?

At least your new comment doesn't explain anything to me.
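
For reference, the quantity under discussion can be modelled on plain integers. The following is a minimal sketch, assuming a value stored in the low `precision` bits of a 64-bit word; `sign_bit_copies_model` is a hypothetical helper and not GCC's num_sign_bit_copies, which works on RTL and may return conservative answers.

```cpp
#include <cassert>
#include <cstdint>

// Count how many of the top bits of a `precision`-bit value equal its sign
// bit, including the sign bit itself.
inline unsigned sign_bit_copies_model(std::uint64_t value, unsigned precision) {
  assert(precision >= 1 && precision <= 64);
  const unsigned sign = (value >> (precision - 1)) & 1u;
  unsigned copies = 1;  // the sign bit always counts as one copy
  // Walk downwards from the bit just below the sign bit.
  for (unsigned bit = precision - 1; bit-- > 0;) {
    if (((value >> bit) & 1u) != sign)
      break;
    ++copies;
  }
  return copies;
}
```

Under this model a 1-bit value yields exactly 1 for both 0 and 1, since there are no bits below the sign bit to compare against.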

Richard.


>
> Bernd
>
>


Re: Stream ODR types

2014-09-12 Thread Richard Biener
On Thu, 11 Sep 2014, Jan Hubicka wrote:

> > On Thu, 11 Sep 2014, Jan Hubicka wrote:
> > 
> > > Hi,
> > > this patch adds computation and streaming of mangled type names.  As
> > > suggested by Jason, it simply calls DECL_ASSEMBLER_NAME on all named
> > > types and lets C++ supply them.  This makes it possible to establish
> > > precise ODR type equivalency at LTO (till now we could do that only for
> > > complete class types with virtual methods attached to them).
> > > LTO type merging is then updated to register all types into the ODR
> > > type hash.  This causes warnings to be output for ODR violations.  Here
> > > are the ones output for Firefox:
> > > http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
> > >
> > > As discussed earlier, in addition to the ODR warnings, which seem useful,
> > > I would like to use it for TBAA analysis for ODR types that are not
> > > structurally equivalent to non-ODR types, so C++ programs will get
> > > better alias analysis, and for other tricks, such as more aggressively
> > > merging ODR types.
> > >
> > > I believe this makes sense (is orthogonal) with early debug info (for
> > > warnings, TBAA and devirtualization).  It can also be used to more
> > > aggressively merge debug information, as done by LLVM.
> > >
> > > The change increases LTO object files by about 2% (uncompressed by 6%)
> > > and also increases WPA memory use and streaming times by about the same
> > > percentage.  It is not small and thus I made it optional (enabled by
> > > default for now).  We could see how the benefits relate to this cost
> > > once the other three parts are implemented.
> > > Bootstrapped/regtested x86_64-linux, seems sane?
> > 
> > It looks sane, but when early debug is completed we likely will drop
> > all the elaborated types from decls.  Thus to keep the ODR type you'd
> > have to keep (and compute early as well) their DECL_ASSEMBLER_NAME?
> 
> I currently compute it in free_lang_data.  Obviously we can compute it
> earlier (in the frontend) as we see fit.
> > 
> > Can't we just store a hash of the assembler name?  From alias analysis
> > perspective false aliasing due to a hash collision is harmless, no?
> > Maybe not for ODR warnings though.  At least a hash would be way
> > cheaper than those usually very large strings
> 
> Hmm, interesting idea.  False positives are harmless for alias analysis.
> They do matter for type inheritance graph construction, but if we decide we
> will only ever care about polymorphic types, we can always use the virtual
> table name to resolve conflicts.
> 
> We will get false ODR violation warnings, but the chances would be very low.
> > 
> > You probably want to restrict ODR types to aggregates?
> 
> For ODR warnings and TBAA I think I want other types, too.  But yep, we need
> to gracefully handle component types that do not have names, and we could
> drop names of types and handle them as component types as seems fit.
> 
> OK, so if you agree, I will go ahead with this patch and we can resolve
> these details incrementally.

Yes, but please disable !record type handling for now.
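
The hash-instead-of-string idea raised in the quoted exchange can be sketched as follows. This is an illustration under stated assumptions, not the patch: `OdrTypeKey`, `make_odr_key` and `may_be_same_odr_type` are hypothetical names, and `std::hash` is used only for brevity (a real streaming format would need a hash that is stable across runs and hosts).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical record for a streamed type: only a hash of the mangled name
// is kept, not the (often very long) string itself.
struct OdrTypeKey {
  std::size_t name_hash;
};

inline OdrTypeKey make_odr_key(const std::string &mangled_name) {
  assert(!mangled_name.empty());  // unnamed types are out of scope here
  return OdrTypeKey{std::hash<std::string>{}(mangled_name)};
}

// Equal names always give equal keys; unequal names may collide.  A collision
// makes two distinct types look like the same ODR type: conservative for
// alias analysis, but a possible source of spurious ODR warnings.
inline bool may_be_same_odr_type(const OdrTypeKey &a, const OdrTypeKey &b) {
  return a.name_hash == b.name_hash;
}
```

The comparison can only answer "possibly the same", never "definitely the same", which is why the thread treats collisions as acceptable for aliasing but not for diagnostics.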

Richard.


Re: Stream ODR types

2014-09-12 Thread Richard Biener
On Thu, 11 Sep 2014, Jason Merrill wrote:

> On 09/11/2014 03:06 AM, Jan Hubicka wrote:
> > http://kam.mff.cuni.cz/~hubicka/odr-warnings-firefox.txt
> 
> > /aux/hubicka/firefox4/content/media/fmp4/ffmpeg/libav53/include/libavcodec/avcodec.h:997:0:
> > note: the first difference of corresponding definitions is field ‘data’
> >  uint8_t *data[AV_NUM_DATA_POINTERS];
> >  ^
> > /aux/hubicka/firefox4/content/media/fmp4/ffmpeg/libav54/include/libavcodec/avcodec.h:997:0:
> > note: a field of same name but different type is defined in another
> > translation unit
> >  uint8_t *data[AV_NUM_DATA_POINTERS];
> >  ^
> > /usr/include/stdint.h:49:24: note: type ‘uint8_t’ should match type
> > ‘uint8_t’
> >  typedef unsigned char  uint8_t;
> > ^
> > /usr/include/stdint.h:49:24: note: the incompatible type is defined here
> 
> Hmm, how can uint8_t be incompatible with itself?

That is one reason I was suggesting handling only record types and
arrays of records.  For fundamental types we can very easily verify
compatibility.

Richard.

Re: RFA: Add a destructor to target_ira_int

2014-09-12 Thread Richard Sandiford
Jeff Law  writes:
> On 09/08/14 09:26, Richard Sandiford wrote:
>> This patch adds a destructor to target_ira_int, so that the data structures
>> it points to are freed when the parent target_globals is freed.  It fixes
>> a memory leak with non-default subtargets.
>>
>> Tested on x86_64-linux-gnu.  OK to install?
>>
>> Thanks,
>> Richard
>>
>>
>> gcc/
>>  * ira.h (ira_finish_once): Delete.
>>  * ira-int.h (target_ira_int::~target_ira_int): Declare.
>>  (target_ira_int::free_ira_costs): Likewise.
>>  (target_ira_int::free_register_move_costs): Likewise.
>>  (ira_finish_costs_once): Delete.
>>  * ira.c (free_register_move_costs): Replace with...
>>  (target_ira_int::free_register_move_costs): ...this new function.
>>  (target_ira_int::~target_ira_int): Define.
>>  (ira_init): Call free_register_move_costs as a member function rather
>>  than a global function.
>>  (ira_finish_once): Delete.
>>  * ira-costs.c (free_ira_costs): Replace with...
>>  (target_ira_int::free_ira_costs): ...this new function.
>>  (ira_init_costs): Call free_ira_costs as a member function rather
>>  than a global function.
>>  (ira_finish_costs_once): Delete.
>>  * target-globals.c (target_globals::~target_globals): Call the
>>  target_ira_int destructor.
>>  * toplev.c: Include lra.h.
>>  (finalize): Call lra_finish_once rather than ira_finish_once.
> Consider making target_ira_int a class.  OK for the trunk.
>
> jeff

I'd prefer to keep it as a struct if that's OK.  At the moment these
target structures are just collections of variables that are accessed
directly, so it doesn't really feel like a proper OO class "yet".
Also (more minor) changing it from a struct to a class would mean
updating all references to the structure, to avoid the clang warning
about mismatched tags.  There would then be some weird-looking
inconsistencies in the target-globals code.
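
The shape being argued for, a plain struct with directly accessed members that still cleans up after itself, can be sketched as below. This is a simplified model, not the ira-int.h code: `target_costs_model` and `target_globals_model` are hypothetical names, and the use of new/delete and calloc/free is an assumption made for the example rather than a description of how GCC allocates these structures.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a target structure that owns heap data.
// It stays a struct with public members, but gains a destructor.
struct target_costs_model {
  int *move_costs = nullptr;

  target_costs_model() = default;
  target_costs_model(const target_costs_model &) = delete;
  target_costs_model &operator=(const target_costs_model &) = delete;

  void allocate(std::size_t n) {
    assert(n > 0);
    free_move_costs();  // release any previous table first
    move_costs = static_cast<int *>(std::calloc(n, sizeof(int)));
  }
  void free_move_costs() {
    std::free(move_costs);
    move_costs = nullptr;
  }
  ~target_costs_model() { free_move_costs(); }
};

// Parent that owns the sub-structure; deleting it in the destructor runs the
// child's destructor, so the owned data is released together with the parent.
struct target_globals_model {
  target_costs_model *costs = new target_costs_model;

  target_globals_model() = default;
  target_globals_model(const target_globals_model &) = delete;
  target_globals_model &operator=(const target_globals_model &) = delete;
  ~target_globals_model() { delete costs; }
};
```

Keeping the `struct` keyword also means existing `struct target_costs_model` forward declarations remain consistent; clang's -Wmismatched-tags warning fires when the same type is declared with `struct` in one place and `class` in another.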

Thanks,
Richard