Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Marc Glisse

On Wed, 2 Apr 2014, Thomas Preud'homme wrote:

> Note that as it stands the patch does not work for arrays indexed with a
> variable (such as tab[a] || (tab[a+1] << 8)) because fold_const does not
> fold (a + 1) - a.


Uh? It does fold a+1-a for me. What it doesn't do is look through the 
definition of b in b-a. Richard+GSoC will supposedly soon provide a 
function that does that.


--
Marc Glisse


Re: Fix various x86 tests for --with-arch=bdver3 --with-cpu=bdver3

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 12:27 AM, Joseph S. Myers
jos...@codesourcery.com wrote:
 When I fixed various tests in
 http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01662.html for failures
 with --with-arch=bdver3, I missed that a so-configured compiler still
 defaults to -mtune=generic.  If you override that as well with
 --with-cpu=bdver3, further failures appear, and this patch fixes some
 of them.

 Most of these changes add -mno-prefer-avx128 to AVX tests not
 expecting a -mprefer-avx128 default.  In addition, some tests have
 -mtune=generic added where the behavior tested for depends on some
 tuning parameter that I identified: X86_TUNE_EXT_80387_CONSTANTS or
 X86_TUNE_SSE_LOAD0_BY_PXOR.
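
As a rough illustration of the kind of change described above, here is a
hypothetical test in the style of the gcc.target/i386 AVX tests. It is only
a sketch: the function name and the use of generic vector extensions are
assumptions, and whether the scan pattern matches depends on the compiler
and tuning in use. The point is the dg-options line, which pins
-mno-prefer-avx128 so that a configured -mprefer-avx128 default cannot
change the expected 256-bit code.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -mavx2 -mno-prefer-avx128" } */
/* { dg-final { scan-assembler "vpand\[ \\t\]+\[^\n\]*%ymm\[0-9\]" } } */

/* Generic 256-bit integer vector; not tied to any intrinsic header.  */
typedef long long v4di __attribute__ ((vector_size (32)));

/* Bitwise AND of two 256-bit vectors, passed by pointer so the
   function signature does not depend on the vector ABI.  */
void
and256 (const v4di *a, const v4di *b, v4di *out)
{
  *out = *a & *b;
}
```

The dg- directives are ordinary comments to the compiler, so the file also
builds outside the testsuite harness.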

 Tested x86_64-linux-gnu.  OK to commit?


RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
 From: Marc Glisse [mailto:marc.gli...@inria.fr]
 
 Uh? It does fold a+1-a for me. What it doesn't do is look through the
 definition of b in b-a. Richard+GSoC will supposedly soon provide a
 function that does that.

Oh right, it's a bit more complex here since the array index is converted
to an offset first. So the operation is more like:  ((a+1)*cst) - (a*cst).
Any chances this might be handled at some point? Note that this might
not be very frequent so it's not very important for this patch.
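
To make the offset arithmetic concrete, here is a minimal C sketch. The
helper names load16_le and byte_offset_delta are illustrative assumptions
and do not appear in the patch. For an element size cst, the two byte
offsets are (a+1)*cst and a*cst, and their difference is cst for any a;
that is the fact the pass would have to establish to see the two loads as
adjacent.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian 16-bit load built from two adjacent bytes at a
   variable index, i.e. the source pattern under discussion.  */
uint16_t load16_le(const unsigned char *tab, size_t a)
{
    return (uint16_t)(tab[a] | (tab[a + 1] << 8));
}

/* Difference between the byte offsets of elements a+1 and a for an
   element size of cst bytes.  Unsigned arithmetic wraps, so the
   result is cst for every a.  */
size_t byte_offset_delta(size_t a, size_t cst)
{
    return (a + 1) * cst - a * cst;
}
```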

Thanks for the comment.

Best regards,

Thomas




RFA: RL78: Fix handling of (SUBREG (SYMBOL_REF))

2014-04-02 Thread Nick Clifton
Hi DJ,

  The patch below is to fix a snafu I made whilst fixing some problems
  with the RL78 port a while ago.  GCC was generating
  (SUBREG (SYMBOL_REF) n) which made no sense to me, so I had the
  movqi expander just fail when it encountered them.  Now that I have
  more idea about why they are created - installing symbolic values into
  bitfields or packed structure fields - I have found that it is
  necessary to support them.  Failure is not an option as GCC will just
  silently omit generating any code at all.
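
  For readers unfamiliar with the situation, the following is a minimal
  sketch of source code that stores a symbolic value into a packed
  structure field.  The names are hypothetical, and whether this exact
  source makes the RL78 port produce (SUBREG (SYMBOL_REF)) is an
  assumption; it only shows the kind of store being described.

```c
/* Hypothetical object whose address is the symbolic value.  */
int target_object;

/* Packing removes the padding after 'tag', so 'ptr' is misaligned
   and a byte-oriented target may store it one piece at a time.  */
struct __attribute__ ((packed)) record {
    char tag;
    int *ptr;
};

/* Store the address of a symbol into the packed field.  */
void set_ptr(struct record *r)
{
    r->ptr = &target_object;
}
```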

  Tested with an rl78-elf toolchain without any regressions.  OK to
  apply ?

Cheers
  Nick

gcc/ChangeLog
2014-04-01  Nick Clifton  ni...@redhat.com

* config/rl78/rl78-expand.md (movqi): Handle (SUBREG (SYMBOL_REF))
properly.

Index: gcc/config/rl78/rl78-expand.md
===
--- gcc/config/rl78/rl78-expand.md  (revision 209009)
+++ gcc/config/rl78/rl78-expand.md  (working copy)
@@ -30,18 +30,23 @@
     if (rl78_far_p (operands[0]) && rl78_far_p (operands[1]))
       operands[1] = copy_to_mode_reg (QImode, operands[1]);
 
-    /* FIXME: Not sure how GCC can generate (SUBREG (SYMBOL_REF)),
-       but it does.  Since this makes no sense, reject it here.  */
+    /* GCC can generate (SUBREG (SYMBOL_REF)) when it has to store a symbol
+       into a bitfield, or a packed ordinary field.  We can handle this
+       provided that the destination is a register.  If not, then load the
+       source into a register first.  */
     if (GET_CODE (operands[1]) == SUBREG
-        && GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF)
-      FAIL;
+        && GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF
+        && ! REG_P (operands[0]))
+        operands[1] = copy_to_mode_reg (QImode, operands[1]);
+
     /* Similarly for (SUBREG (CONST (PLUS (SYMBOL_REF)))).
        cf. g++.dg/abi/packed.C.  */
     if (GET_CODE (operands[1]) == SUBREG
         && GET_CODE (XEXP (operands[1], 0)) == CONST
         && GET_CODE (XEXP (XEXP (operands[1], 0), 0)) == PLUS
-        && GET_CODE (XEXP (XEXP (XEXP (operands[1], 0), 0), 0)) == SYMBOL_REF)
-      FAIL;
+        && GET_CODE (XEXP (XEXP (XEXP (operands[1], 0), 0), 0)) == SYMBOL_REF
+        && ! REG_P (operands[0]))
+        operands[1] = copy_to_mode_reg (QImode, operands[1]);
 
     if (CONST_INT_P (operands[1]) && ! IN_RANGE (INTVAL (operands[1]), (-1 << 8) + 1, (1 << 8) - 1))
       FAIL;


Re: [gomp4] Add tables generation

2014-04-02 Thread Thomas Schwinge
Hi!

On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt ber...@codesourcery.com 
wrote:
 This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
 adds functionality to build function/variable tables that will allow 
 libgomp to look up offload target code based on the address of the 
 corresponding host function. There are two alternatives, one based on 
 named sections, and one based on a target hook when named sections are 
 unavailable (as on ptx).
 
 Committed on gomp-4_0-branch.

I see regressions in the libgomp testsuite for configurations where
offloading is not enabled:

spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ 
[...]/source/libgomp/testsuite/libgomp.c/for-3.c 
-B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ 
-B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs 
-I[...]/build/x86_64-unknown-linux-gnu/./libgomp 
-I[...]/source/libgomp/testsuite/.. -fmessage-length=0 
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 
-fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o 
./for-3.exe
/tmp/ccGnT0ei.o: In function `main':
for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
collect2: error: ld returned 1 exit status

I suppose that's because even if...

 --- gcc/configure.ac  (revision 208715)
 +++ gcc/configure.ac  (working copy)
 @@ -887,6 +887,10 @@ AC_SUBST(enable_accelerator)
  offload_targets=`echo $offload_targets | sed -e 's#,#:#'`
  AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, $offload_targets,
   [Define to hold the list of target names suitable for offloading.])
 +if test x$offload_targets != x; then
 +  AC_DEFINE(ENABLE_OFFLOADING, 1,
 +[Define this to enable support for offloading.])
 +fi

... offloading is not enabled, this...

 --- gcc/omp-low.c (revision 208706)
 +++ gcc/omp-low.c (working copy)
 @@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
  }
  
    gimple g;
 -  /* FIXME: This will be address of
 -     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
 -     symbol, as soon as the linker plugin is able to create it for us.  */
 -  tree openmp_target = build_zero_cst (ptr_type_node);
 +  tree openmp_target
 +    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
 +                  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
 +  TREE_PUBLIC (openmp_target) = 1;
 +  DECL_EXTERNAL (openmp_target) = 1;
if (kind == GF_OMP_TARGET_KIND_REGION)
  {
tree fnaddr = build_fold_addr_expr (child_fn);
 -  g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
 -  device, fnaddr, openmp_target, t1, t2, t3, t4);
 +  g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
 +  fnaddr, build_fold_addr_expr (openmp_target),
 +  t1, t2, t3, t4);
  }
else
 -g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
 -device, openmp_target, t1, t2, t3, t4);
 +g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
 +build_fold_addr_expr (openmp_target),
 +t1, t2, t3, t4);

... will now cause a reference to __OPENMP_TARGET__, but...

 --- libgcc/crtstuff.c (revision 208706)
 +++ libgcc/crtstuff.c (working copy)
 @@ -311,6 +311,15 @@ register_tm_clones (void)
  }
  #endif /* USE_TM_CLONE_REGISTRY */
  
 +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
 +void *_omp_func_table[0]
 +  __attribute__ ((__used__, visibility ("protected"),
 +                  section (".offload_func_table_section"))) = { };
 +void *_omp_var_table[0]
 +  __attribute__ ((__used__, visibility ("protected"),
 +                  section (".offload_var_table_section"))) = { };
 +#endif
 +
  #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
  
  #ifdef OBJECT_FORMAT_ELF
 @@ -752,6 +761,23 @@ __do_global_ctors (void)
  #error What are you doing with crtstuff.c, then?
  #endif
  
 +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
 +void *_omp_funcs_end[0]
 +  __attribute__ ((__used__, visibility ("protected"),
 +                  section (".offload_func_table_section"))) = { };
 +void *_omp_vars_end[0]
 +  __attribute__ ((__used__, visibility ("protected"),
 +                  section (".offload_var_table_section"))) = { };
 +extern void *_omp_func_table[];
 +extern void *_omp_var_table[];
 +void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
 +{
 +  _omp_func_table, _omp_funcs_end,
 +  _omp_var_table, _omp_vars_end
 +};
 +#endif

... __OPENMP_TARGET__ is not being defined here for the
!ENABLE_OFFLOADING case.  In
http://news.gmane.org/find-root.php?message_id=%3C20130905082455.GH23437%40tucnak.redhat.com%3E,
Jakub had suggested this to be a weak symbol, so we'd get NULL in this
case, which would be what's needed here, I think?
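
As a minimal sketch of the weak-symbol idea (the name offload_table_demo is
a hypothetical stand-in for __OPENMP_TARGET__, and the behaviour shown
assumes an ELF target and a GNU-compatible compiler): a weak reference that
no object file defines still links, and its address evaluates to NULL at
run time.

```c
#include <stddef.h>

/* Weak reference: if no object file defines this symbol, the link
   still succeeds and the symbol's address is NULL (ELF behaviour).  */
extern void *offload_table_demo[] __attribute__ ((weak));

/* Nonzero only when some object file provided the table.  */
int offload_table_present(void)
{
    return offload_table_demo != NULL;
}
```

With such a declaration, code built without offloading support could pass
the (NULL) address through unchanged instead of failing at link time.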


Also, I'd suggest to rename __OPENMP_TARGET__ (and similar ones) to
__GNU_OFFLOAD__ (or similar).  As we're using this offloading stuff for

Re: [PATCH] Guard special installs in install-driver

2014-04-02 Thread Richard Biener
On Tue, 1 Apr 2014, Mike Stump wrote:

 On Mar 31, 2014, at 4:50 AM, Richard Biener rguent...@suse.de wrote:
  -$(INSTALL_PROGRAM) xgcc$(exeext) $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext)
  !   -rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext)
  !   -( cd $(DESTDIR)$(bindir) && \
  !      $(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-$(version)$(exeext) )
  !   -if [ ! -f gcc-cross$(exeext) ] ; then \
    rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-tmp$(exeext); \
    ( cd $(DESTDIR)$(bindir) && \
      $(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-tmp$(exeext) && \
  --- 3205,3217 ----
   install-driver: installdirs xgcc$(exeext)
  -rm -f $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext)
  -$(INSTALL_PROGRAM) xgcc$(exeext) $(DESTDIR)$(bindir)/$(GCC_INSTALL_NAME)$(exeext)
  !   -if [ $(GCC_INSTALL_NAME) != $(target_noncanonical)-gcc-$(version) ]; then \
  !     -rm -f $(DESTDIR)$(bindir)/$(target_noncanonical)-gcc-$(version)$(exeext) \
  !     -( cd $(DESTDIR)$(bindir) && \
  !        $(LN) $(GCC_INSTALL_NAME)$(exeext) $(target_noncanonical)-gcc-$(version)$(exeext) ) \
  !   fi
 
 Certainly safer for release like this, but, gotta wonder if we can avoid 
 the ignoring of errors with the added check…

No idea ;)  For my case I ended up without an installed driver as
the rm of course succeeded but the rest not ...

 I’d have to work out why 
 they did that in the first place and run a build and play a bit to be as 
 sure as I’d like to be… but, a cross and a native build I think should 
 test it adequately.

Work out why we install _two_ additional variants!  (or rather why we
install any additional variants to GCC_INSTALL_NAME at all ...).

Anyway, I now committed the patch.  We can always followup with
cleanups to this area later, possibly in stage1.

Richard.

Re: Fix various x86 tests for --with-arch=bdver3 --with-cpu=bdver3

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 12:27 AM, Joseph S. Myers
jos...@codesourcery.com wrote:

 There are other failures this patch does not resolve in a
 --with-arch=bdver3 --with-cpu=bdver3 configuration.  Some of these are
 AVX tests whose failures are not resolved by adding -mno-prefer-avx128
 (and so this patch does not add -mno-prefer-avx128 to those tests);
 others may be cases where -mtune=generic is appropriate but I haven't
 identified the specific tuning parameter that shows code generation
 differences depending on tuning are correct and so a -mtune= option
 should be used.

 FAIL: gcc.target/i386/avx2-vpand-1.c scan-assembler vpand[ \\t]+[^\n]*%ymm[0-9]
 FAIL: gcc.target/i386/avx2-vpand-3.c scan-assembler-times vpand[ \\t]+[^\n]*%ymm[0-9] 1
 FAIL: gcc.target/i386/avx2-vpandn-1.c scan-assembler vpandn[ \\t]+[^\n]*%ymm[0-9]
 FAIL: gcc.target/i386/avx2-vpor-1.c scan-assembler vpor[ \\t]+[^\n]*%ymm[0-9]
 FAIL: gcc.target/i386/avx2-vpxor-1.c scan-assembler vpxor[ \\t]+[^\n]*%ymm[0-9]
 FAIL: gcc.target/i386/avx256-unaligned-load-2.c scan-assembler (sse2_loaddqu|vmovdqu[^\n\r]*movv16qi_internal)
 FAIL: gcc.target/i386/avx256-unaligned-load-2.c scan-assembler vinsert.128
 FAIL: gcc.target/i386/avx512f-vec-init.c scan-assembler-times vmovdqa64[ \\t]+%zmm 2
 FAIL: gcc.target/i386/avx512f-vmovdqu32-1.c scan-assembler-times vmovdqu[36][24][ \\t]+[^\n]*\\)[^\n]*%zmm[0-9][^{] 1
 FAIL: gcc.target/i386/avx512f-vmovupd-1.c scan-assembler-times vmovupd[ \\t]+[^\n]*\\)[^\n]*%zmm[0-9][^{] 1
 FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ \\t]+[^\n]*%zmm[0-9][^{] 4
 FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpandd-1.c scan-assembler-times vpandd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ \\t]+[^\n]*%zmm[0-9][^{] 4
 FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpandnd-1.c scan-assembler-times vpandnd[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ \\t]+[^\n]*%zmm[0-9][^{] 3
 FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpandnq-1.c scan-assembler-times vpandnq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ \\t]+[^\n]*%zmm[0-9][^{] 3
 FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpandq-1.c scan-assembler-times vpandq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ \\t]+[^\n]*%zmm[0-9][^{] 4
 FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpord-1.c scan-assembler-times vpord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ \\t]+[^\n]*%zmm[0-9][^{] 3
 FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vporq-1.c scan-assembler-times vporq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ \\t]+[^\n]*%zmm[0-9][^{] 4
 FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpxord-1.c scan-assembler-times vpxord[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ \\t]+[^\n]*%zmm[0-9][^{] 3
 FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}[^{] 1
 FAIL: gcc.target/i386/avx512f-vpxorq-1.c scan-assembler-times vpxorq[ \\t]+[^\n]*%zmm[0-9]{%k[1-7]}{z} 1
 FAIL: gcc.target/i386/pr49002-1.c scan-assembler vmovapd[\t ]*[^,]*,[\t ]*%xmm
 FAIL: gcc.target/i386/pr53712.c scan-assembler-times movdqu 1
 FAIL: gcc.target/i386/pr53907.c scan-assembler movdqa
 FAIL: gcc.target/i386/pr59539-1.c scan-assembler-times vmovdqu 1
 FAIL: gcc.target/i386/pr59539-2.c scan-assembler-times vmovdqu 1

These are due to the TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL tuning flag.
Currently, this flag applies to all vector sizes (128, 256 and 512
bits), but I guess it is effective only for 128-bit vectors. Can you
please review the usage of this flag in i386/sse.md?

Thanks,
Uros.


Re: [4.8, PATCH 9/26] Backport Power8 and LE support: ABI call support

2014-04-02 Thread Richard Biener
On Wed, 19 Mar 2014, Bill Schmidt wrote:

 Hi,
 
 This patch (diff-abi-calls) backports fixes to common code to support
 the new ELFv2 ABI.  Copying Richard and Jakub for these bits.

Ok.

Thanks,
Richard.

 Thanks,
 Bill
 
 
 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
   Backport from mainline r204798:
 
   2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
   Alan Modra  amo...@gmail.com
 
   * function.c (assign_parms): Use all.reg_parm_stack_space instead
   of re-evaluating REG_PARM_STACK_SPACE target macro.
   (locate_and_pad_parm): New parameter REG_PARM_STACK_SPACE.  Use it
   instead of evaluating target macro REG_PARM_STACK_SPACE every time.
   (assign_parm_find_entry_rtl): Update call.
   * calls.c (initialize_argument_information): Update call.
   (emit_library_call_value_1): Likewise.
   * expr.h (locate_and_pad_parm): Update prototype.
 
   Backport from mainline r204797:
 
   2013-11-14  Ulrich Weigand  ulrich.weig...@de.ibm.com
 
   * calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
   arguments.
 
   Backport from mainline r197003:
 
   2013-03-23  Eric Botcazou  ebotca...@adacore.com
 
   * calls.c (expand_call): Add missing guard to code handling return
   of non-BLKmode structures in MSB.
   * function.c (expand_function_end): Likewise.
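
 The function.c entries above replace repeated evaluation of the
 REG_PARM_STACK_SPACE macro with a value computed once and carried in a
 structure.  A minimal sketch of that pattern and of the size arithmetic
 follows; the struct and function names are hypothetical and the helpers
 are simplified stand-ins, not GCC's own definitions.

```c
#include <stddef.h>

/* Hypothetical carrier for a value that is computed once up front
   instead of being re-evaluated from a target macro at each use.  */
struct parm_data_all {
    size_t reg_parm_stack_space;
};

static size_t max_size(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Round value up to a multiple of align (align: power of two).  */
static size_t ceil_round(size_t value, size_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Incoming argument size: at least the reserved register-parameter
   area, then rounded up to the parameter boundary in bytes.  */
size_t incoming_args_size(const struct parm_data_all *all,
                          size_t args_size, size_t boundary_bytes)
{
    size_t size = max_size(args_size, all->reg_parm_stack_space);
    return ceil_round(size, boundary_bytes);
}
```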
 
 
 Index: gcc-4_8-branch/gcc/calls.c
 ===
 --- gcc-4_8-branch.orig/gcc/calls.c   2013-12-28 17:41:32.056627059 +0100
 +++ gcc-4_8-branch/gcc/calls.c2013-12-28 17:50:43.356356135 +0100
 @@ -983,6 +983,7 @@ store_unaligned_arguments_into_pseudos (
  
      for (i = 0; i < num_actuals; i++)
        if (args[i].reg != 0 && ! args[i].pass_on_stack
 +          && GET_CODE (args[i].reg) != PARALLEL
            && args[i].mode == BLKmode
            && MEM_P (args[i].value)
            && (MEM_ALIGN (args[i].value)
 @@ -1327,6 +1328,7 @@ initialize_argument_information (int num
  #else
args[i].reg != 0,
  #endif
 +  reg_parm_stack_space,
args[i].pass_on_stack ? 0 : args[i].partial,
fndecl, args_size, args[i].locate);
  #ifdef BLOCK_REG_PADDING
 @@ -3171,7 +3173,9 @@ expand_call (tree exp, rtx target, int i
group load/store machinery below.  */
      if (!structure_value_addr
          && !pcc_struct_value
 +        && TYPE_MODE (rettype) != VOIDmode
          && TYPE_MODE (rettype) != BLKmode
 +        && REG_P (valreg)
          && targetm.calls.return_in_msb (rettype))
   {
 if (shift_return_value (TYPE_MODE (rettype), false, valreg))
 @@ -3734,7 +3738,8 @@ emit_library_call_value_1 (int retval, r
  #else
  argvec[count].reg != 0,
  #endif
 -0, NULL_TREE, args_size, argvec[count].locate);
 +reg_parm_stack_space, 0,
 +NULL_TREE, args_size, argvec[count].locate);
  
      if (argvec[count].reg == 0 || argvec[count].partial != 0
          || reg_parm_stack_space > 0)
 @@ -3821,7 +3826,7 @@ emit_library_call_value_1 (int retval, r
  #else
  argvec[count].reg != 0,
  #endif
 -argvec[count].partial,
 +reg_parm_stack_space, argvec[count].partial,
  NULL_TREE, args_size, argvec[count].locate);
 args_size.constant += argvec[count].locate.size.constant;
 gcc_assert (!argvec[count].locate.size.var);
 Index: gcc-4_8-branch/gcc/function.c
 ===
 --- gcc-4_8-branch.orig/gcc/function.c2013-12-28 17:41:32.056627059 
 +0100
 +++ gcc-4_8-branch/gcc/function.c 2013-12-28 17:50:43.362356165 +0100
 @@ -2507,6 +2507,7 @@ assign_parm_find_entry_rtl (struct assig
  }
  
    locate_and_pad_parm (data->promoted_mode, data->passed_type, in_regs,
 +                       all->reg_parm_stack_space,
                         entry_parm ? data->partial : 0, current_function_decl,
                         &all->stack_args_size, &data->locate);
  
 @@ -3485,11 +3486,7 @@ assign_parms (tree fndecl)
/* Adjust function incoming argument size for alignment and
   minimum length.  */
  
 -#ifdef REG_PARM_STACK_SPACE
 -  crtl->args.size = MAX (crtl->args.size,
 -                         REG_PARM_STACK_SPACE (fndecl));
 -#endif
 -
 +  crtl->args.size = MAX (crtl->args.size, all.reg_parm_stack_space);
    crtl->args.size = CEIL_ROUND (crtl->args.size,
                                  PARM_BOUNDARY / BITS_PER_UNIT);
  
 @@ -3693,6 +3690,9 @@ gimplify_parameters (void)
 IN_REGS is nonzero if the argument will be passed in registers.  It will
 never be set if REG_PARM_STACK_SPACE is not defined.
  
 +   REG_PARM_STACK_SPACE is the number of bytes of stack space reserved
 +   for arguments which are passed in registers.
 +
   

Re: [4.8, PATCH 15/26] Backport Power8 and LE support: PR54537

2014-04-02 Thread Richard Biener
On Wed, 19 Mar 2014, Bill Schmidt wrote:

 Hi,
 
 This patch (diff-pr54537) backports a fix for PR54537 which is unrelated
 but necessary.  Copying Richard and Jakub for the common code.

Ok.

Thanks,
Richard.

 Thanks,
 Bill
 
 
 [libstdc++-v3]
 
 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
 Backport from mainline
   2013-08-01  Fabien Chêne  fab...@gcc.gnu.org
 
   PR c++/54537
   * include/tr1/cmath: Remove pow(double,double) overload, remove a
   duplicated comment about DR 550. Add a comment to explain the issue.
   * testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc: New.
 
 [gcc/cp]
 
 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
   Back port from mainline
   2013-08-01  Fabien Chêne  fab...@gcc.gnu.org
 
   PR c++/54537
   * cp-tree.h: Check OVL_USED with OVERLOAD_CHECK.
   * name-lookup.c (do_nonmember_using_decl): Make sure we have an
   OVERLOAD before calling OVL_USED. Call diagnose_name_conflict
   instead of issuing an error without mentioning the conflicting
   declaration.
 
 [gcc/testsuite]
 
 2014-03-29  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
   Back port from mainline
   2013-08-01  Fabien Chêne  fab...@gcc.gnu.org
   Peter Bergner  berg...@vnet.ibm.com
 
   PR c++/54537
   * g++.dg/overload/using3.C: New.
   * g++.dg/overload/using2.C: Adjust.
   * g++.dg/lookup/using9.C: Likewise.
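
 The cp-tree.h and name-lookup.c entries above combine two ideas: the
 accessor checks the node kind, and the caller tests the kind before using
 the accessor.  The following is a minimal sketch of that pattern with
 hypothetical types and names (node_check, OVL_USED_DEMO); it is not the
 front end's actual tree machinery.

```c
#include <stdbool.h>
#include <stdlib.h>

enum node_code { NODE_FUNCTION, NODE_OVERLOAD };

struct node {
    enum node_code code;
    bool used_flag;
};

/* Abort unless n has the expected code, in the spirit of a
   tree-checking macro.  */
static struct node *node_check(struct node *n, enum node_code c)
{
    if (n->code != c)
        abort();
    return n;
}

/* Checked accessor: only valid on NODE_OVERLOAD nodes.  */
#define OVL_USED_DEMO(n) (node_check((n), NODE_OVERLOAD)->used_flag)

/* Test the node kind first, so the checked accessor is never
   applied to a plain function node.  */
bool is_using_decl(struct node *n)
{
    return n->code == NODE_OVERLOAD && OVL_USED_DEMO(n);
}
```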
 
 
 Index: gcc-4_8-test/gcc/cp/cp-tree.h
 ===
 --- gcc-4_8-test.orig/gcc/cp/cp-tree.h
 +++ gcc-4_8-test/gcc/cp/cp-tree.h
 @@ -331,7 +331,7 @@ typedef struct ptrmem_cst * ptrmem_cst_t
  /* If set, this was imported in a using declaration.
 This is not to confuse with being used somewhere, which
 is not important for this node.  */
 -#define OVL_USED(NODE)   TREE_USED (NODE)
 +#define OVL_USED(NODE)   TREE_USED (OVERLOAD_CHECK (NODE))
  /* If set, this OVERLOAD was created for argument-dependent lookup
 and can be freed afterward.  */
  #define OVL_ARG_DEPENDENT(NODE) TREE_LANG_FLAG_0 (OVERLOAD_CHECK
 (NODE))
 Index: gcc-4_8-test/gcc/cp/name-lookup.c
 ===
 --- gcc-4_8-test.orig/gcc/cp/name-lookup.c
 +++ gcc-4_8-test/gcc/cp/name-lookup.c
 @@ -2286,8 +2286,7 @@ push_overloaded_decl_1 (tree decl, int f
               && compparms (TYPE_ARG_TYPES (TREE_TYPE (fn)),
                             TYPE_ARG_TYPES (TREE_TYPE (decl)))
               && ! decls_match (fn, decl))
 -           error ("%q#D conflicts with previous using declaration %q#D",
 -                  decl, fn);
 +           diagnose_name_conflict (decl, fn);
  
 dup = duplicate_decls (decl, fn, is_friend);
 /* If DECL was a redeclaration of FN -- even an invalid
 @@ -2519,7 +2518,7 @@ do_nonmember_using_decl (tree scope, tre
 if (new_fn == old_fn)
   /* The function already exists in the current namespace.  */
             break;
 -         else if (OVL_USED (tmp1))
 +         else if (TREE_CODE (tmp1) == OVERLOAD && OVL_USED (tmp1))
             continue; /* this is a using decl */
           else if (compparms (TYPE_ARG_TYPES (TREE_TYPE (new_fn)),
                               TYPE_ARG_TYPES (TREE_TYPE (old_fn))))
 @@ -2534,7 +2533,7 @@ do_nonmember_using_decl (tree scope, tre
             break;
           else
             {
 -             error ("%qD is already declared in this scope", name);
 +             diagnose_name_conflict (new_fn, old_fn);
 break;
   }
   }
 Index: gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
 ===
 --- gcc-4_8-test.orig/gcc/testsuite/g++.dg/lookup/using9.C
 +++ gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
 @@ -21,11 +21,11 @@ void h()
f('h');
    f(1); // { dg-error "ambiguous" }
    // { dg-message "candidate" "candidate note" { target *-*-* } 22 }
 -  void f(int);  // { dg-error "previous using declaration" }
 +  void f(int);  // { dg-error "previous declaration" }
  }
  
  void m()
  {
    void f(int);
 -  using B::f;   // { dg-error "already declared" }
 +  using B::f;   // { dg-error "previous declaration" }
  }
 Index: gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
 ===
 --- gcc-4_8-test.orig/gcc/testsuite/g++.dg/overload/using2.C
 +++ gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
 @@ -45,7 +45,7 @@ using std::C1;
    extern "C" void exit (int) throw ();
    extern "C" void *malloc (__SIZE_TYPE__) throw () __attribute__((malloc));
  
 -  void abort (void) throw ();
 +  void abort (void) throw (); // { dg-message "previous" }
void _exit (int) throw (); // { dg-error conflicts 

Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 2:54 AM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 I took the lack of answer for this patch as an indication that the patch is
 too big. This is the first patch in a series of three. Its purpose is to
 create new effective targets for architectures having byte swap
 instructions and make use of them in the existing byte swap tests. One
 effective target is created for each size (16, 32 and 64) as not all
 architectures support byte swap of all sizes.

Sorry, I simply queued it in my review queue for stage1 ... it's definitely
something that was high on my wish-list (including of also using
general vector shuffles if available to support even more patterns).

Still on the queue, stay tuned ;)

Richard.

 Here is the gcc/testsuite/ChangeLog entry:

 2014-04-01  Thomas Preud'homme  thomas.preudho...@arm.com

 * lib/target-supports.exp: New effective targets for architectures
 capable of performing byte swap.
 * gcc.dg/optimize-bswapdi-1.c: Convert to new bswap target.
 * gcc.dg/optimize-bswapdi-2.c: Likewise.
 * gcc.dg/optimize-bswapsi-1.c: Likewise.
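
 For context, the tests named above exercise hand-written byte swaps that
 the compiler is expected to recognise.  A minimal sketch of a 32-bit swap
 in that style follows; the function name is an assumption, and no
 effective-target directive is shown because the new target names are not
 given in this message.

```c
#include <stdint.h>

/* Open-coded 32-bit byte swap: each byte is masked out and moved
   to the mirrored position.  */
uint32_t swap32(uint32_t in)
{
    return ((in & 0x000000ffu) << 24)
         | ((in & 0x0000ff00u) << 8)
         | ((in & 0x00ff0000u) >> 8)
         | ((in & 0xff000000u) >> 24);
}
```

 On a target with a suitable instruction, the whole expression can be
 replaced by a single byte swap; on others it stays as shifts and masks.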

 The patch is attached to this email. Is this ok for stage1?

 Best regards,

 Thomas


Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 9:04 AM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 From: Marc Glisse [mailto:marc.gli...@inria.fr]

 Uh? It does fold a+1-a for me. What it doesn't do is look through the
 definition of b in b-a. Richard+GSoC will supposedly soon provide a
 function that does that.

 Oh right, it's a bit more complex here since the array index is converted
 to an offset first. So the operation is more like:  ((a+1)*cst) - (a*cst).
 Any chances this might be handled at some point? Note that this might
 not be very frequent so it's not very important for this patch.

"More like" isn't enough to answer this - do you have a testcase?  (Usually
these end up in undefined-overflow and/or conversion-to-sizetype issues.)

Richard.

 Thanks for the comment.

 Best regards,

 Thomas




RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 
 Sorry, I simply queued it in my review queue for stage1 ... it's definitely
 something that was high on my wish-list (including of also using
 general vector shuffles if available to support even more patterns).

Oh great. Anyway, having it split in 3 parts will ease the review for you.

Thanks.

Thomas





Re: [gomp4] Add tables generation

2014-04-02 Thread Thomas Schwinge
Hi!

On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt ber...@codesourcery.com 
wrote:
 This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
 adds functionality to build function/variable tables that will allow 
 libgomp to look up offload target code based on the address of the 
 corresponding host function. There are two alternatives, one based on 
 named sections, and one based on a target hook when named sections are 
 unavailable (as on ptx).
 
 Committed on gomp-4_0-branch.

 --- gcc/omp-low.c (revision 208706)
 +++ gcc/omp-low.c (working copy)
 @@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
  }
  
gimple g;
 -  /* FIXME: This will be address of
 - extern char __OPENMP_TARGET__[] __attribute__((visibility (hidden)))
 - symbol, as soon as the linker plugin is able to create it for us.  */
 -  tree openmp_target = build_zero_cst (ptr_type_node);
 +  tree openmp_target
 += build_decl (UNKNOWN_LOCATION, VAR_DECL,
 +   get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
 +  TREE_PUBLIC (openmp_target) = 1;
 +  DECL_EXTERNAL (openmp_target) = 1;
if (kind == GF_OMP_TARGET_KIND_REGION)
  {
tree fnaddr = build_fold_addr_expr (child_fn);
 -  g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
 -  device, fnaddr, openmp_target, t1, t2, t3, t4);
 +  g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
 +  fnaddr, build_fold_addr_expr (openmp_target),
 +  t1, t2, t3, t4);
  }
else
 -g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
 -device, openmp_target, t1, t2, t3, t4);
 +g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
 +build_fold_addr_expr (openmp_target),
 +t1, t2, t3, t4);

Committed in r209013:

commit 1f54e08135bd8be59438977b4edbc102e7cef2d7
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Apr 2 08:28:54 2014 +

Handle __OPENMP_TARGET__ symbol for OpenACC offloading functions, too.

gcc/
* omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
symbol.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209013 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  5 +
 gcc/omp-low.c  | 14 --
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1d35b58..8983632 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-04-02  Thomas Schwinge  tho...@codesourcery.com
+
+   * omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
+   symbol.
+
 2014-03-20  Thomas Schwinge  tho...@codesourcery.com
 
* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_OACC_LOOP.
diff --git gcc/omp-low.c gcc/omp-low.c
index a7b93bc..01eda9d 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -5138,13 +5138,15 @@ expand_oacc_offload (struct omp_region *region)
 }
 
   gimple g;
-  /* FIXME: This will be address of
- extern char __OPENMP_TARGET__[] __attribute__((visibility (hidden)))
- symbol, as soon as the linker plugin is able to create it for us.  */
-  tree openmp_target = build_zero_cst (ptr_type_node);
+  tree openmp_target
+= build_decl (UNKNOWN_LOCATION, VAR_DECL,
+ get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
+  TREE_PUBLIC (openmp_target) = 1;
+  DECL_EXTERNAL (openmp_target) = 1;
   tree fnaddr = build_fold_addr_expr (child_fn);
-  g = gimple_build_call (builtin_decl_explicit (start_ix),
-10, device, fnaddr, openmp_target, t1, t2, t3, t4,
+  g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device,
+fnaddr, build_fold_addr_expr (openmp_target),
+t1, t2, t3, t4,
 t_num_gangs, t_num_workers, t_vector_length);
   gimple_set_location (g, gimple_location (entry_stmt));
   gsi_insert_before (gsi, g, GSI_SAME_STMT);


 +/* Create new symbol containing (address, size) pairs for omp-marked
 +   functions and global variables.  */
 +void
 +omp_finish_file (void)
 +{
 +  struct cgraph_node *node;
 +  struct varpool_node *vnode;
 +  const char *funcs_section_name = ".offload_func_table_section";
 +  const char *vars_section_name = ".offload_var_table_section";
 +  vec<tree, va_gc> *v_funcs, *v_vars;
 +
 +  vec_alloc (v_vars, 0);
 +  vec_alloc (v_funcs, 0);
 +
 +  [...]
 +  unsigned num_vars = vec_safe_length (v_vars);
 +  unsigned num_funcs = vec_safe_length (v_funcs);
 +  [...]
 +  if (targetm_common.have_named_sections)
 +{
 +  [...]
 +   }
 +  else
 +{
 +  for (unsigned i = 0; i < num_funcs; i++)
 + {
 +   tree it = (*v_funcs)[i];
 +   targetm.record_offload_symbol (it);
 + }
 +  for (unsigned i = 0; i < num_funcs; i++)
 + {
 +   tree it = 

Re: [gomp4] Add tables generation

2014-04-02 Thread Thomas Schwinge
Hi!

On Wed, 02 Apr 2014 09:34:29 +0200, I wrote:
 On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt ber...@codesourcery.com 
 wrote:
  This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
  adds functionality to build function/variable tables that will allow 
  libgomp to look up offload target code based on the address of the 
  corresponding host function. There are two alternatives, one based on 
  named sections, and one based on a target hook when named sections are 
  unavailable (as on ptx).
  
  Committed on gomp-4_0-branch.
 
 I see regressions in the libgomp testsuite for configurations where
 offloading is not enabled:
 
 spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ 
 [...]/source/libgomp/testsuite/libgomp.c/for-3.c 
 -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ 
 -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs 
 -I[...]/build/x86_64-unknown-linux-gnu/./libgomp 
 -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 
 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 
 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o 
 ./for-3.exe
 /tmp/ccGnT0ei.o: In function `main':
 for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
 collect2: error: ld returned 1 exit status
 
 I suppose that's because [...]

Workaround committed in r209015:

commit 6a015f81a5fafe32cf45656e3de121f4088dbf41
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Apr 2 08:29:17 2014 +

Work around __OPENMP_TARGET__ not being defined for !ENABLE_OFFLOADING.

libgcc/
* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
NULL.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209015 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgcc/ChangeLog.gomp | 10 ++
 libgcc/crtstuff.c |  2 ++
 2 files changed, 12 insertions(+)

diff --git libgcc/ChangeLog.gomp libgcc/ChangeLog.gomp
new file mode 100644
index 000..7d08efa
--- /dev/null
+++ libgcc/ChangeLog.gomp
@@ -0,0 +1,10 @@
+2014-04-02  Thomas Schwinge  tho...@codesourcery.com
+
+   * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
+   NULL.
+
+Copyright (C) 2014 Free Software Foundation, Inc.
+
+Copying and distribution of this file, with or without modification,
+are permitted in any medium without royalty provided the copyright
+notice and this notice are preserved.
diff --git libgcc/crtstuff.c libgcc/crtstuff.c
index cda0bae..79af7f0 100644
--- libgcc/crtstuff.c
+++ libgcc/crtstuff.c
@@ -775,6 +775,8 @@ void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
   _omp_func_table, _omp_funcs_end,
   _omp_var_table, _omp_vars_end
 };
+#else
+void **__OPENMP_TARGET__ __attribute__ ((__visibility__ ("protected"))) = NULL;
 #endif
 


 Also, I'd suggest to rename __OPENMP_TARGET__ (and similar ones) to
 __GNU_OFFLOAD__ (or similar).  As we're using this offloading stuff for
 both OpenACC and OpenMP target, it makes sense to me to use a generic
 name; we still have the chance to do so now while this stuff is not yet
 in trunk.
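
Editorial illustration, not part of the original message: a minimal C
sketch of how a consumer might cope with both forms of the symbol, the
four-entry table and the NULL placeholder from the workaround.  The
function name offload_func_count and the assumed layout
{ func_begin, func_end, var_begin, var_end } are hypothetical; the
thread does not show how libgomp actually reads the descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* desc points at the descriptor object.  In the assumed table form,
   desc[0] and desc[1] delimit the function table.  In the placeholder
   form, the first word is NULL and nothing is registered.  */
static size_t offload_func_count(void *const *desc)
{
    if (desc == NULL || desc[0] == NULL)
        return 0;                       /* offloading not enabled */

    void *const *begin = (void *const *)desc[0];
    void *const *end = (void *const *)desc[1];
    assert(begin <= end);
    return (size_t)(end - begin);
}
```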


Grüße,
 Thomas




Re: [PATCH][LTO] Rework -flto-partition=, add =one case

2014-04-02 Thread Richard Biener
On Tue, 1 Apr 2014, Jan Hubicka wrote:

  
  This reworks the option to use the Enum support we have now and
  adds a =one case (to eventually get rid of one LTO operation mode,
  =none ...).  I was tempted to support -flto-partition=number
  and get rid of --param lto-partitions (thereby also supporting =1),
 
 Yep, I prefer to have one switch to choose the algorithm and another to set
 its parameter, as you do now. At the moment partitioning is quite a non-issue
 since the only important IPA passes work on the whole thing, but that may
 change and we may want to play with different partitionings.
 (I have plans for that for incremental compilation and other things)

Well, partitioning is important to get a parallel build.

  but that param specifies the maximum number of partitions and
  still uses the balanced algorithm, thus the result would be
  confusing (and of little use I suppose, as opposed to =1 which should
  give you the same answer as =none).
 
 =none still seems somewhat useful - for setups where you do multiple parallel
 compilations it will be faster than WHOPR and it helps developing IPA passes
 since you do not need to worry about WHOPR complexities at start.

True, but as it ends up eating more memory your multiple parallel
compilations may in the end be slower if they run into swap ;)

And you can do simple IPA passes just where IPA-PTA sits now - at LTRANS
level.

 But with the code to bring in function bodies on demand, this is less important.
 I believe with the pass manager being a bit more flexible, the code paths can be
 almost completely shared. I have a few patches on this and a pass queue reorg
 for next stage1, so I will try to push them out.

Yeah, it would be nice to make the flow of compilation somewhat more
obvious than it is now ...

Richard.


[PATCH] Remove stale declaration

2014-04-02 Thread Marek Polacek
I noticed that we declare this function, but its definition was
removed in 2009 by P. Bonzini, thus the decl serves no purpose.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2014-04-02  Marek Polacek  pola...@redhat.com

* c-common.h (c_expand_expr): Remove declaration.

diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 1099b10..24959d8 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -928,8 +928,6 @@ extern bool vector_targets_convertible_p (const_tree t1, 
const_tree t2);
 extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool 
emit_lax_note);
 extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = true);
 
-extern rtx c_expand_expr (tree, rtx, enum machine_mode, int, rtx *);
-
 extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);

Marek


Re: [PATCH] Remove stale declaration

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 12:36 PM, Marek Polacek pola...@redhat.com wrote:
 I noticed that we declare this function, but its definition was
 removed in 2009 by P. Bonzini, thus the decl serves no purpose.

 Regtested/bootstrapped on x86_64-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2014-04-02  Marek Polacek  pola...@redhat.com

 * c-common.h (c_expand_expr): Remove declaration.

 diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
 index 1099b10..24959d8 100644
 --- gcc/c-family/c-common.h
 +++ gcc/c-family/c-common.h
 @@ -928,8 +928,6 @@ extern bool vector_targets_convertible_p (const_tree t1, 
 const_tree t2);
  extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool 
 emit_lax_note);
  extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = 
 true);

 -extern rtx c_expand_expr (tree, rtx, enum machine_mode, int, rtx *);
 -
  extern void init_c_lex (void);

  extern void c_cpp_builtins (cpp_reader *);

 Marek


Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3

2014-04-02 Thread Rainer Orth
domi...@lps.ens.fr (Dominique Dhumieres) writes:

 r...@cebitec.uni-bielefeld.de (Rainer Orth) wrote:
 Sure, patch preapproved.

 Committed as r208983:

 2014-04-01  Dominique d'Humieres domi...@lps.ens.fr
 Rainer Orth  r...@cebitec.uni-bielefeld.de

 PR libgcj/55637
 * testsuite/libjava.lang/sourcelocation.xfail: New file.

Btw, the customary format for such a ChangeLog entry is

2014-04-01  Dominique d'Humieres domi...@lps.ens.fr

Backport from mainline
2014-02-20  Rainer Orth  r...@cebitec.uni-bielefeld.de

PR libgcj/55637
* testsuite/libjava.lang/sourcelocation.xfail: New file.

This way, you can easily see when the original went in.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, ARM] Enable tail call optimization for long call

2014-04-02 Thread Jiong Wang


On 25/03/14 15:44, Richard Earnshaw wrote:

On 24/03/14 11:26, Jiong Wang wrote:

This patch enables tail call optimization for long calls on ARM.

Previously the check in arm_function_ok_for_sibcall was too strict, and the
sibcall/sibcall_value expanders lacked support for long calls, so tail call
opportunities for long calls were lost.

OK for next stage 1?


I think this is OK for EABI targets (since we can rely on the linker
generating the right form of interworking veneer), but I'm less certain
about other systems (do we still support COFF).

I think I'd prefer the patch to factor in TARGET_AAPCS_BASED and to
assume that if that is true then arbitrary tail-calls are safe.


Hi Richard,

 IMHO, this is actually a tail call optimization; we just need to make 
sure the register which holds the address is caller-saved, and then it will be OK.


 Updated the change log to fix that aarch64 typo.  The patch itself is 
unchanged, but I have enclosed it in this reply for completeness.


 So, is it ok for next stage-1?

 Thanks.

--
Jiong


gcc/
   * config/arm/predicates.md (call_insn_operand): Add long_call check.
   * config/arm/arm.md (sibcall, sibcall_value): Force the address to reg for 
long_call.
   * config/arm/arm.c (arm_function_ok_for_sibcall): Remove long_call 
restriction.

gcc/testsuite
   gcc.target/arm/tail-long-call.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d5f9ff3..8dcdfa8 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6087,11 +6087,6 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
   if (TARGET_VXWORKS_RTP && flag_pic && !targetm.binds_local_p (decl))
 return false;
 
-  /* Cannot tail-call to long calls, since these are out of range of
- a branch instruction.  */
-  if (decl && arm_is_long_call_p (decl))
-return false;
-
   /* If we are interworking and the function is not declared static
  then we can't tail-call it unless we know that it exists in this
  compilation unit (since it might be a Thumb routine).  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 2ddda02..fe285f0 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -9444,8 +9444,10 @@
   TARGET_32BIT
   
   {
-    if (!REG_P (XEXP (operands[0], 0))
-        && (GET_CODE (XEXP (operands[0], 0)) != SYMBOL_REF))
+    if ((!REG_P (XEXP (operands[0], 0))
+         && GET_CODE (XEXP (operands[0], 0)) != SYMBOL_REF)
+        || (GET_CODE (XEXP (operands[0], 0)) == SYMBOL_REF
+            && arm_is_long_call_p (SYMBOL_REF_DECL (XEXP (operands[0], 0)))))
  XEXP (operands[0], 0) = force_reg (SImode, XEXP (operands[0], 0));
 
 if (operands[2] == NULL_RTX)
@@ -9462,8 +9464,10 @@
   TARGET_32BIT
   
   {
-    if (!REG_P (XEXP (operands[1], 0))
-        && (GET_CODE (XEXP (operands[1],0)) != SYMBOL_REF))
+    if ((!REG_P (XEXP (operands[1], 0))
+         && GET_CODE (XEXP (operands[1], 0)) != SYMBOL_REF)
+        || (GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF
+            && arm_is_long_call_p (SYMBOL_REF_DECL (XEXP (operands[1], 0)))))
  XEXP (operands[1], 0) = force_reg (SImode, XEXP (operands[1], 0));
 
 if (operands[3] == NULL_RTX)
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index ce5c9a8..3673343 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -673,5 +673,6 @@
        (match_code "reg" "0")))
 
 (define_predicate "call_insn_operand"
-  (ior (match_code "symbol_ref")
+  (ior (and (match_code "symbol_ref")
+            (match_test "!arm_is_long_call_p (SYMBOL_REF_DECL (op))"))
        (match_operand 0 "s_register_operand")))
diff --git a/gcc/testsuite/gcc.target/arm/tail-long-call.c b/gcc/testsuite/gcc.target/arm/tail-long-call.c
new file mode 100644
index 000..9b27468
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/tail-long-call.c
@@ -0,0 +1,12 @@
+/* { dg-skip-if "need at least armv5te" { *-*-* } { "-march=armv[234]*" "-mthumb" } { "" } } */
+/* { dg-options "-O2 -march=armv5te -marm" } */
+/* { dg-final { scan-assembler "bx" } } */
+/* { dg-final { scan-assembler-not "blx" } } */
+
+int lcal (int) __attribute__ ((long_call));
+
+int
+dec (int a)
+{
+  return lcal (a);
+}
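
Editorial illustration, not part of the original patch: a minimal C
model of the condition the patch adds to the sibcall and sibcall_value
expanders.  The types addr_kind and call_addr and the function
must_force_to_reg are hypothetical stand-ins for the RTL checks
(REG_P, GET_CODE == SYMBOL_REF, arm_is_long_call_p); the sketch only
shows the boolean logic, not real GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of a call address operand.  */
enum addr_kind { ADDR_REG, ADDR_SYMBOL_REF, ADDR_OTHER };

struct call_addr {
    enum addr_kind kind;
    bool long_call;     /* meaningful only for ADDR_SYMBOL_REF */
};

/* The address is forced into a register when it is neither a register
   nor a symbol reference (the old rule), or when it is a symbol
   reference to a long_call function (the case the patch adds).  */
static bool must_force_to_reg(struct call_addr addr)
{
    assert(addr.kind == ADDR_REG || addr.kind == ADDR_SYMBOL_REF
           || addr.kind == ADDR_OTHER);
    return (addr.kind != ADDR_REG && addr.kind != ADDR_SYMBOL_REF)
           || (addr.kind == ADDR_SYMBOL_REF && addr.long_call);
}
```

Under this model a plain symbol can still be reached with a direct
branch, while a long_call symbol goes through a register, which matches
the test above expecting bx and no blx.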

Re: [PATCH][AARCH64] Support tail indirect function call

2014-04-02 Thread Jiong Wang

^Ping...

Regards,
Jiong

On 18/03/14 14:13, Jiong Wang wrote:

Currently, indirect function calls prevent tail-call optimization on AArch64.

This patch adapts the fix for PR arm/19599 to AArch64.

Is it ok for next stage 1?

Thanks.

-- Jiong

gcc/

  * config/aarch64/predicates.md (aarch64_call_insn_operand): New
predicate.
  * config/aarch64/constraints.md (Ucs, Usf):  New constraints.
  * config/aarch64/aarch64.md (*sibcall_insn, *sibcall_value_insn):
Adjust for
  tailcalling through registers.
  * config/aarch64/aarch64.h (enum reg_class): New caller save
register class.
  (REG_CLASS_NAMES): Likewise.
  (REG_CLASS_CONTENTS): Likewise.
  * config/aarch64/aarch64.c (aarch64_function_ok_for_sibcall): Allow
tailcalling
  without decls.

gcc/testsuite

  *gcc.target/aarch64/tail-indirect-call.c: New test.
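
Editorial illustration, not the contents of the new test file: a
minimal C example of the kind of code the patch targets.  The names
unary_fn, add_one and dispatch are hypothetical.  The call through the
function pointer is in tail position, so with the patch it becomes a
candidate for a sibling call through a register.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*unary_fn)(int);

static int add_one(int x)
{
    return x + 1;
}

/* The indirect call is the last action and its result is returned
   unchanged, so it is in tail position.  */
int dispatch(unary_fn f, int x)
{
    assert(f != NULL);
    return f(x);
}
```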


--
Jiong




[PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Markus Trippelsdorf
It is a common mistake to enable both -flto and -fprofile-generate when
building projects. This is not a good idea, because memory use will
skyrocket due to instrumentation. So just warn the user.

OK for next stage1?

2014-04-02  Markus Trippelsdorf  mar...@trippelsdorf.de

* common.opt (fprofile-generate): Add flag.
* opts.c (finish_options): Add new warning.
(common_handle_option): Set flag.

diff --git a/gcc/common.opt b/gcc/common.opt
index 62c72f0d2fbf..61e9adfa0df5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input
 
 fprofile-generate
-Common
+Common Var(flag_profile_generate)
 Enable common options for generating profile info for profile feedback 
directed optimizations
 
 fprofile-generate=
diff --git a/gcc/opts.c b/gcc/opts.c
index fdc903f9271a..b62a0d626d94 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
        error_at (loc, "only one -flto-partition value can be specified");
     }
 
+  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
+    warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a bad idea.");
+
   /* We initialize opts->x_flag_split_stack to -1 so that targets can set a
      default value if they choose based on other options.  */
   if (opts->x_flag_split_stack == -1)
@@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
 
     case OPT_fprofile_generate_:
       opts->x_profile_data_prefix = xstrdup (arg);
+      opts->x_flag_profile_generate = true;
       value = true;
       /* No break here - do -fprofile-generate processing. */
     case OPT_fprofile_generate:
-- 
Markus
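
Editorial illustration, not part of the original patch: a minimal C
model of the check added to finish_options.  The struct options_model
and the function name are hypothetical, reduced stand-ins for
struct gcc_options and its x_flag_generate_lto and
x_flag_profile_generate fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct gcc_options.  */
struct options_model {
    bool generate_lto;      /* models x_flag_generate_lto */
    bool profile_generate;  /* models x_flag_profile_generate, set for
                               both -fprofile-generate and
                               -fprofile-generate=path */
};

/* Warn only when both options are active at the same time.  */
static bool should_warn_lto_with_profile_generate(const struct options_model *opts)
{
    assert(opts != NULL);
    return opts->generate_lto && opts->profile_generate;
}
```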


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Marek Polacek
On Wed, Apr 02, 2014 at 01:50:31PM +0200, Markus Trippelsdorf wrote:
 +  if (opts-x_flag_generate_lto  opts-x_flag_profile_generate)
 +warning_at (loc, 0, Enabling both -fprofile-generate and -flto is a bad 
 idea.);

s/Enabling/enabling/ + no dot at the end.

Marek


[PATCHv2][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Markus Trippelsdorf
It is a common mistake to enable both -flto and -fprofile-generate when
building projects. This is not a good idea, because memory use will
skyrocket due to instrumentation. So just warn the user.

OK for next stage1?

2014-04-02  Markus Trippelsdorf  mar...@trippelsdorf.de

* common.opt (fprofile-generate): Add flag.
* opts.c (finish_options): Add new warning.
(common_handle_option): Set flag.

diff --git a/gcc/common.opt b/gcc/common.opt
index 62c72f0d2fbf..61e9adfa0df5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
 Enable correction of flow inconsistent profile data input
 
 fprofile-generate
-Common
+Common Var(flag_profile_generate)
 Enable common options for generating profile info for profile feedback 
directed optimizations
 
 fprofile-generate=
diff --git a/gcc/opts.c b/gcc/opts.c
index fdc903f9271a..581d2e948483 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
        error_at (loc, "only one -flto-partition value can be specified");
     }
 
+  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
+    warning_at (loc, 0, "enabling both -fprofile-generate and -flto is a bad idea");
+
   /* We initialize opts->x_flag_split_stack to -1 so that targets can set a
      default value if they choose based on other options.  */
   if (opts->x_flag_split_stack == -1)
@@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
 
     case OPT_fprofile_generate_:
       opts->x_profile_data_prefix = xstrdup (arg);
+      opts->x_flag_profile_generate = true;
       value = true;
       /* No break here - do -fprofile-generate processing. */
     case OPT_fprofile_generate:
-- 
Markus


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 1:50 PM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:
 It is a common mistake to enable both -flto and -fprofile-generate when
 building projects. This is not a good idea, because memory use will
 skyrocket due to instrumentation. So just warn the user.

 OK for next stage1?

I'd rather see if we can fix the underlying issue.  For example as we
are now instrumenting as IPA pass we can allocate a single
counter array (if the number of global vars is the issue).  Basically
split analysis and instrumentation into two phases for that.

Or even better, do profile instrumentation as real IPA pass.

Richard.
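
Editorial illustration, not part of the original message: a minimal C
sketch of the "single counter array" idea, assuming the number of
counters per function is known after a separate analysis phase.  The
names counters, F_BASE, G_BASE and count_edge are hypothetical, and the
counts (2 for one function, 3 for another) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Phase 1 (analysis) is assumed to have produced the per-function
   counter counts: 2 for f and 3 for g.  */
#define TOTAL_COUNTERS 5

/* Phase 2 (instrumentation) uses one shared array instead of one
   global variable per function.  */
static uint64_t counters[TOTAL_COUNTERS];

/* Each function owns a contiguous slice identified by its base.  */
enum { F_BASE = 0, G_BASE = 2 };

static void count_edge(size_t base, size_t edge)
{
    assert(base + edge < TOTAL_COUNTERS);
    counters[base + edge]++;
}
```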

 2014-04-02  Markus Trippelsdorf  mar...@trippelsdorf.de

 * common.opt (fprofile-generate): Add flag.
 * opts.c (finish_options): Add new warning.
 (common_handle_option): Set flag.

 diff --git a/gcc/common.opt b/gcc/common.opt
 index 62c72f0d2fbf..61e9adfa0df5 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
  Enable correction of flow inconsistent profile data input

  fprofile-generate
 -Common
 +Common Var(flag_profile_generate)
  Enable common options for generating profile info for profile feedback 
 directed optimizations

  fprofile-generate=
 diff --git a/gcc/opts.c b/gcc/opts.c
 index fdc903f9271a..b62a0d626d94 100644
 --- a/gcc/opts.c
 +++ b/gcc/opts.c
 @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
 gcc_options *opts_set,
 error_at (loc, only one -flto-partition value can be specified);
  }

 +  if (opts-x_flag_generate_lto  opts-x_flag_profile_generate)
 +warning_at (loc, 0, Enabling both -fprofile-generate and -flto is a bad 
 idea.);
 +
/* We initialize opts-x_flag_split_stack to -1 so that targets can set a
   default value if they choose based on other options.  */
if (opts-x_flag_split_stack == -1)
 @@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,

  case OPT_fprofile_generate_:
opts-x_profile_data_prefix = xstrdup (arg);
 +  opts-x_flag_profile_generate = true;
value = true;
/* No break here - do -fprofile-generate processing. */
  case OPT_fprofile_generate:
 --
 Markus


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Richard Biener
On Wed, Apr 2, 2014 at 2:07 PM, Richard Biener
richard.guent...@gmail.com wrote:
 On Wed, Apr 2, 2014 at 1:50 PM, Markus Trippelsdorf
 mar...@trippelsdorf.de wrote:
 It is a common mistake to enable both -flto and -fprofile-generate when
 building projects. This is not a good idea, because memory use will
 skyrocket due to instrumentation. So just warn the user.

 OK for next stage1?

 I'd rather see if we can fix the underlying issue.  For example as we
 are now instrumenting as IPA pass we can allocate a single
 counter array (if the number of global vars is the issue).  Basically
 split analysis and instrumentation into two phases for that.

 Or even better, do profile instrumentation as real IPA pass.

So isn't -coverage also facing the same issue?  That is, is it
really -fprofile-arcs already, or only one of the value profiling pieces?

Richard.

 Richard.

 2014-04-02  Markus Trippelsdorf  mar...@trippelsdorf.de

 * common.opt (fprofile-generate): Add flag.
 * opts.c (finish_options): Add new warning.
 (common_handle_option): Set flag.

 diff --git a/gcc/common.opt b/gcc/common.opt
 index 62c72f0d2fbf..61e9adfa0df5 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
  Enable correction of flow inconsistent profile data input

  fprofile-generate
 -Common
 +Common Var(flag_profile_generate)
  Enable common options for generating profile info for profile feedback 
 directed optimizations

  fprofile-generate=
 diff --git a/gcc/opts.c b/gcc/opts.c
 index fdc903f9271a..b62a0d626d94 100644
 --- a/gcc/opts.c
 +++ b/gcc/opts.c
 @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
 gcc_options *opts_set,
 error_at (loc, only one -flto-partition value can be specified);
  }

 +  if (opts-x_flag_generate_lto  opts-x_flag_profile_generate)
 +warning_at (loc, 0, Enabling both -fprofile-generate and -flto is a 
 bad idea.);
 +
/* We initialize opts-x_flag_split_stack to -1 so that targets can set a
   default value if they choose based on other options.  */
if (opts-x_flag_split_stack == -1)
 @@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,

  case OPT_fprofile_generate_:
opts-x_profile_data_prefix = xstrdup (arg);
 +  opts-x_flag_profile_generate = true;
value = true;
/* No break here - do -fprofile-generate processing. */
  case OPT_fprofile_generate:
 --
 Markus


Re: [PATCHv2][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Rainer Orth
Markus Trippelsdorf mar...@trippelsdorf.de writes:

 diff --git a/gcc/opts.c b/gcc/opts.c
 index fdc903f9271a..581d2e948483 100644
 --- a/gcc/opts.c
 +++ b/gcc/opts.c
 @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct 
 gcc_options *opts_set,
   error_at (loc, only one -flto-partition value can be specified);
  }
  
 +  if (opts-x_flag_generate_lto  opts-x_flag_profile_generate)
 +warning_at (loc, 0, enabling both -fprofile-generate and -flto is a bad 
 idea);

This warning is not very helpful in this form.  Rather say something
like `causes excessive memory consumption' if this is the problem.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Martin Jambor
Hi,

recently I've been looking into a number of bugs involving
symtab_remove_unreachable_nodes in one way or another and I have
always started by applying the hunk below.  I did this because
distinguishing different symbol nodes only according to their names is
just so inconvenient, especially when compiling C++.  The risk is
minimal and therefore I'd like to propose it to trunk even at this
late stage, although I can of course wait until the next stage1.

The other hunk is something that I think is also useful when looking
into all failures of ipcp_verify_propagated_values like e.g. PR 60727.

I included the patch in a recent bootstrap and testing and it of
course passes.  OK for trunk now?  Or later?

Thanks,

Martin


2014-04-01  Martin Jambor  mjam...@suse.cz

* ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
mention gcc_unreachable before failing.
* ipa.c (symtab_remove_unreachable_nodes): Also print order of
removed symbols.

Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
{
  if (dump_file)
{
+ dump_symtab (dump_file);
  fprintf (dump_file, "\nIPA lattices after constant "
-  "propagation:\n");
+  "propagation, before gcc_unreachable:\n");
  print_all_lattices (dump_file, true, false);
}
 
Index: src/gcc/ipa.c
===
--- src.orig/gcc/ipa.c
+++ src/gcc/ipa.c
@@ -469,7 +469,7 @@ symtab_remove_unreachable_nodes (bool be
   if (!node->aux)
 {
  if (file)
-   fprintf (file, " %s", node->name ());
+   fprintf (file, " %s/%i", node->name (), node->order);
  cgraph_remove_node (node);
  changed = true;
 }
@@ -483,7 +483,7 @@ symtab_remove_unreachable_nodes (bool be
  if (node->definition)
 {
  if (file)
-   fprintf (file, " %s", node->name ());
+   fprintf (file, " %s/%i", node->name (), node->order);
  node->body_removed = true;
  node->analyzed = false;
  node->definition = false;
@@ -531,7 +531,7 @@ symtab_remove_unreachable_nodes (bool be
   && (!flag_ltrans || !DECL_EXTERNAL (vnode->decl)))
 {
  if (file)
-   fprintf (file, " %s", vnode->name ());
+   fprintf (file, " %s/%i", vnode->name (), vnode->order);
  varpool_remove_node (vnode);
  changed = true;
}
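
Editorial illustration, not part of the original patch: a minimal C
sketch of why printing name/order helps.  The struct symbol_model and
the function format_symbol are hypothetical; the format string is the
one the patch uses.  Two symbols that share a name, which is common with
C++ clones and overloads, become distinguishable in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a symbol table node.  */
struct symbol_model {
    const char *name;
    int order;      /* unique per node */
};

/* Formats a node the way the patched dump does: " name/order".
   Returns the value of snprintf.  */
static int format_symbol(char *buf, size_t size, const struct symbol_model *sym)
{
    assert(buf != NULL && sym != NULL);
    return snprintf(buf, size, " %s/%i", sym->name, sym->order);
}
```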


[PATCH] Disable IPA-SRA for always_inline functions

2014-04-02 Thread Martin Jambor
Hi,

when dealing with a PR yesterday I have noticed that IPA-SRA was
modifying an always_inline function which is useless work since the
function must then be inlined anyway.  Thus I'd like to propose the
following simple change disabling it in such cases.

Included in a bootstrap and testing on x86_64-linux.  OK for trunk now
or in the next stage1?

Thanks,

Martin


2014-04-01  Martin Jambor  mjam...@suse.cz

* tree-sra.c (ipa_sra_preliminary_function_checks): Skip
always_inline functions.

Index: src/gcc/tree-sra.c
===
--- src.orig/gcc/tree-sra.c
+++ src/gcc/tree-sra.c
@@ -4960,6 +4960,15 @@ ipa_sra_preliminary_function_checks (str
   if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
 return false;
 
+  if (lookup_attribute ("always_inline",
+   DECL_ATTRIBUTES (node->decl)) != NULL)
+{
+  if (dump_file)
+   fprintf (dump_file, "Allways inline function will be inlined "
+"anyway. \n");
+  return false;
+}
+
   return true;
 }
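
Editorial illustration, not part of the original patch: a minimal C
example of the kind of function the new check skips.  The names
scale_and_add and use_pair are hypothetical, and the attribute syntax
is the GCC/Clang extension.  Because scale_and_add must be inlined into
every caller, rewriting its parameter list (for example, passing the
two values instead of the pointer) would be wasted work.

```c
#include <assert.h>
#include <stddef.h>

/* Always inlined, so no out-of-line copy with its own parameter list
   needs to be optimized.  */
static inline __attribute__((always_inline)) int
scale_and_add(const int *pair, int factor)
{
    assert(pair != NULL);
    return pair[0] * factor + pair[1];
}

int use_pair(int a, int b)
{
    int pair[2] = { a, b };
    return scale_and_add(pair, 3);
}
```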
 


Re: [PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Richard Biener
On Wed, 2 Apr 2014, Martin Jambor wrote:

 Hi,
 
 recently I've been looking into a number of bugs involving
 symtab_remove_unreachable_nodes in one way or another and I have
 always started by applying the hunk below.  I did this because
 distinguishing different symbol nodes only according to their names is
 just so inconvenient, especially when compiling C++.  The risk is
 minimal and therefore I'd like to propose it to trunk even at this
 late stage, although I can of course wait until the next stage1.
 
 The other hunk is something that I think is also useful when looking
 into all failures of ipcp_verify_propagated_values like e.g. PR 60727.
 
 I included the patch in a recent bootstrap and testing and it of
 course passes.  OK for trunk now?  Or later?

I'll leave the actual changes for review by Honza, it's fine at this
stage if he thinks the changes make sense and are consistent.

Thanks,
Richard.

 Thanks,
 
 Martin
 
 
 2014-04-01  Martin Jambor  mjam...@suse.cz
 
   * ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
   mention gcc_unreachable before failing.
   * ipa.c (symtab_remove_unreachable_nodes): Also print order of
   removed symbols.
 
 Index: src/gcc/ipa-cp.c
 ===
 --- src.orig/gcc/ipa-cp.c
 +++ src/gcc/ipa-cp.c
 @@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
   {
 if (dump_file)
   {
 +   dump_symtab (dump_file);
 fprintf (dump_file, \nIPA lattices after constant 
 -propagation:\n);
 +propagation, before gcc_unreachable:\n);
 print_all_lattices (dump_file, true, false);
   }
  
 Index: src/gcc/ipa.c
 ===
 --- src.orig/gcc/ipa.c
 +++ src/gcc/ipa.c
 @@ -469,7 +469,7 @@ symtab_remove_unreachable_nodes (bool be
if (!node-aux)
   {
 if (file)
 - fprintf (file,  %s, node-name ());
 + fprintf (file,  %s/%i, node-name (), node-order);
 cgraph_remove_node (node);
 changed = true;
   }
 @@ -483,7 +483,7 @@ symtab_remove_unreachable_nodes (bool be
 if (node-definition)
   {
 if (file)
 - fprintf (file,  %s, node-name ());
 + fprintf (file,  %s/%i, node-name (), node-order);
 node-body_removed = true;
 node-analyzed = false;
 node-definition = false;
 @@ -531,7 +531,7 @@ symtab_remove_unreachable_nodes (bool be
  (!flag_ltrans || !DECL_EXTERNAL (vnode-decl)))
   {
 if (file)
 - fprintf (file,  %s, vnode-name ());
 + fprintf (file,  %s/%i, vnode-name (), vnode-order);
 varpool_remove_node (vnode);
 changed = true;
   }
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: [PATCH] Disable IPA-SRA for always_inline functions

2014-04-02 Thread Richard Biener
On Wed, 2 Apr 2014, Martin Jambor wrote:

 Hi,
 
 when dealing with a PR yesterday I have noticed that IPA-SRA was
 modifying an always_inline function which is useless work since the
 function must then be inlined anyway.  Thus I'd like to propose the
 following simple change disabling it in such cases.
 
 Included in a bootstrap and testing on x86_64-linux.  OK for trunk now
 or in the next stage1?

Ok for next stage1, but please short-cut the lookup_attribute
with a DECL_DISREGARD_INLINE_LIMITS () check.  Maybe even
abstract this away into a predicate on the cgraph node.
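
A rough sketch of what such a predicate could look like, using toy
stand-ins rather than the real tree and cgraph_node types.  All toy_*
names are hypothetical, and the sketch assumes the cheap flag is set
whenever the always_inline attribute is present, so that the attribute
list is only searched when the flag test passes:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a declaration and its call-graph node;
   these are NOT the real GCC tree or cgraph_node types.  */
struct toy_decl
{
  bool disregard_inline_limits;  /* models DECL_DISREGARD_INLINE_LIMITS */
  const char *const *attributes; /* NULL-terminated list of names, or NULL */
};

struct toy_node
{
  struct toy_decl *decl;
};

/* Models lookup_attribute: a linear search through the attribute list.  */
static bool
toy_has_attribute (const struct toy_decl *decl, const char *name)
{
  if (decl->attributes == NULL)
    return false;
  for (const char *const *p = decl->attributes; *p != NULL; p++)
    if (strcmp (*p, name) == 0)
      return true;
  return false;
}

/* Hypothetical predicate on the node: test the cheap flag first and only
   search the attribute list when the flag is set.  */
static bool
toy_node_always_inline_p (const struct toy_node *node)
{
  return (node->decl->disregard_inline_limits
          && toy_has_attribute (node->decl, "always_inline"));
}
```

A caller in the preliminary checks would then test the predicate once
and return false for such nodes, instead of spelling out the attribute
lookup at every use.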

Thanks,
Richard.

 Thanks,
 
 Martin
 
 
 2014-04-01  Martin Jambor  mjam...@suse.cz
 
   * tree-sra.c (ipa_sra_preliminary_function_checks): Skip
   always_inline functions.
 
 Index: src/gcc/tree-sra.c
 ===
 --- src.orig/gcc/tree-sra.c
 +++ src/gcc/tree-sra.c
 @@ -4960,6 +4960,15 @@ ipa_sra_preliminary_function_checks (str
    if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
      return false;
  
 +  if (lookup_attribute ("always_inline",
 +                        DECL_ATTRIBUTES (node->decl)) != NULL)
 +    {
 +      if (dump_file)
 +        fprintf (dump_file, "Always inline function will be inlined "
 +                 "anyway. \n");
 +      return false;
 +    }
 +
    return true;
  }
  
 
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: [PATCH][ARM] Handle simple SImode PLUS and MINUS operations in rtx costs

2014-04-02 Thread Kyrill Tkachov
Pinging this for stage1, otherwise I'll forget about it and it'll fall through 
the cracks...


http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01276.html

Thanks,
Kyrill

On 24/03/14 17:21, Kyrill Tkachov wrote:

Hi all,

I noticed that we don't handle simple reg-to-reg arithmetic operations in the
arm rtx cost functions. We should be adding the cost of alu.arith to the costs
of the operands. This patch does that. Since we don't have any cost tables yet
that have a non-zero value for that field it shouldn't affect code-gen for any
current cores.
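
To illustrate the idea only (this is a toy model, not the code in
arm_new_rtx_costs): the cost of a reg-to-reg PLUS or MINUS becomes the
alu arith cost plus the costs of both operands.  The toy_* names are
hypothetical, and a plain register operand is assumed to cost nothing:

```c
/* Toy cost table; only models the single arith field discussed above.  */
struct toy_alu_costs
{
  int arith;
};

enum toy_code { TOY_REG, TOY_PLUS, TOY_MINUS };

/* Toy expression node; not GCC's rtx.  */
struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *op0;
  const struct toy_rtx *op1;
};

/* Cost of X: PLUS and MINUS add the arith cost to the operand costs;
   a register is assumed to be free.  */
static int
toy_rtx_cost (const struct toy_rtx *x, const struct toy_alu_costs *costs)
{
  switch (x->code)
    {
    case TOY_PLUS:
    case TOY_MINUS:
      return (costs->arith
              + toy_rtx_cost (x->op0, costs)
              + toy_rtx_cost (x->op1, costs));
    case TOY_REG:
    default:
      return 0;
    }
}
```

With arith equal to zero the result is unchanged from ignoring the
operation altogether, which is why no current core should see a code-gen
difference.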

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for next stage1?

Thanks,
Kyrill

2014-03-24  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/arm/arm.c (arm_new_rtx_costs): Handle reg-to-reg PLUS
  and MINUS RTXs.





[PATCH] [ARM] [RFC] Fix longstanding push_minipool_fix ICE (PR49423, lp1296601)

2014-04-02 Thread Charles Baylis
Hi

This patch fixes the push_minipool_fix ICE, which occurs when the ARM
backend encounters a zero/sign extending load from a constant pool.

I don't have a current test case for trunk, lp1296601 has a test case
which affects the linaro-4.8 branch. As far as I know, there has been
no fix for this on trunk.

The approach taken in this patch is to extend each pattern where this
can occur, so that it triggers a define_split to synthesise a
constant move instead. Some, but not all, extend patterns previously
had pool_range attributes added to work around this problem; this
patch removes those and also fixes the remaining patterns. Some
patterns have slightly more complex workarounds, which I have not yet
analysed, but it seems worth posting the patch at this stage to get
feedback on the general approach.

Tested on arm-unknown-linux-gnueabihf (qemu), bootstrap in progress.

If this looks good, I'll clean it up for a more detailed review.

Thanks
Charles


0001-initial-attempt-at-fixing-push_minipool_fix-ICE.patch
Description: application/download


Re: [Patch, AArch64] Fix shuffle for big-endian.

2014-04-02 Thread Tejas Belagod

Richard Henderson wrote:

On 02/21/2014 08:30 AM, Tejas Belagod wrote:

+  /* If two vectors, we end up with a weird mixed-endian mode on NEON.  */
+  if (BYTES_BIG_ENDIAN)
+    {
+      if (!d->one_vector_p && d->perm[i] > nunits)
+        {
+          /* Extract the offset.  */
+          elt = d->perm[i] & (nunits - 1);
+          /* Reverse the top half.  */
+          elt = nunits - 1 - elt;
+          /* Offset it by the bottom half.  */
+          elt += nunits;
+        }
+      else
+        elt = nunits - 1 - d->perm[i];
+    }


Isn't this just

  elt = d->perm[i] ^ (nunits - 1);

all the time?  I.e. invert the index within the word,
but leave the word index (nunits) unchanged.
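
A small standalone check of that claim, as a sketch under stated
assumptions: nunits is a power of two, every index lies in
[0, 2 * nunits), and second-vector indices are those with
perm >= nunits.  The function names are made up for the illustration:

```c
#include <stdbool.h>

/* Branchy remapping, modelled on the quoted hunk.  Assumption: indices
   into the second vector are those with PERM >= NUNITS.  */
static int
remap_branchy (int perm, int nunits, bool one_vector_p)
{
  if (!one_vector_p && perm >= nunits)
    {
      int elt = perm & (nunits - 1); /* extract the offset */
      elt = nunits - 1 - elt;        /* reverse within the top half */
      return elt + nunits;           /* offset by the bottom half */
    }
  return nunits - 1 - perm;
}

/* Proposed replacement: flip the low bits, keep the vector-select bit.  */
static int
remap_xor (int perm, int nunits)
{
  return perm ^ (nunits - 1);
}

/* Return true if both forms agree for every valid index.  */
static bool
remaps_agree (int nunits)
{
  /* Single-vector case: indices stay below NUNITS.  */
  for (int perm = 0; perm < nunits; perm++)
    if (remap_branchy (perm, nunits, true) != remap_xor (perm, nunits))
      return false;
  /* Two-vector case: indices cover both halves.  */
  for (int perm = 0; perm < 2 * nunits; perm++)
    if (remap_branchy (perm, nunits, false) != remap_xor (perm, nunits))
      return false;
  return true;
}
```

For a power-of-two nunits, subtracting from nunits - 1 and XOR-ing with
nunits - 1 both flip exactly the low bits, which is why the two forms
coincide.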



Here is a revised patch. OK for stage-1?

Thanks
Tejas.

2014-04-02  Tejas Belagod  tejas.bela...@yahoo.com

gcc/

* config/aarch64/aarch64.c (aarch64_evpc_tbl): Reverse order of elements
for big-endian.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e839539..d30b79c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8129,7 +8129,15 @@ aarch64_evpc_tbl (struct expand_vec_perm_d *d)
     return false;
 
   for (i = 0; i < nelt; ++i)
-    rperm[i] = GEN_INT (d->perm[i]);
+    {
+      int nunits = GET_MODE_NUNITS (vmode);
+
+      /* If big-endian and two vectors we end up with a weird mixed-endian
+         mode on NEON.  Reverse the index within each word but not the word
+         itself.  */
+      rperm[i] = GEN_INT (BYTES_BIG_ENDIAN ? d->perm[i] ^ (nunits - 1)
+                          : d->perm[i]);
+    }
   sel = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rperm));
   sel = force_reg (vmode, sel);
 

Re: [PATCH] aarch64 suuport for libitm

2014-04-02 Thread Richard Henderson
On 04/01/2014 03:41 PM, Andrew Pinski wrote:
 On Tue, Apr 1, 2014 at 3:24 PM, Richard Henderson r...@redhat.com wrote:
 Comments?  If approved, should this go in for 4.9, or wait for stage1?
 Certainly it's self-contained...
 
 On Cavium's thunder processor the cache line size is going to be
 bigger than 64 bytes, what is your solution to improve performance on
 targets like Thunder?

We can expand the number reasonably.  The only thing it controls is layout of
some of the internal data structures to attempt to put different locks on
different lines.
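
As a sketch of that layout idea (not the actual libitm data structures):
aligning each lock slot to HW_CACHELINE_SIZE pads it to a full line, so
adjacent slots never share one.  The value 128 and the padded_lock and
lock_table names are assumptions made for the example:

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumed value for the illustration only.  */
#define HW_CACHELINE_SIZE 128

/* Hypothetical lock slot, aligned (and therefore padded) to a full line
   so that neighbouring slots never share a cache line.  */
struct padded_lock
{
  alignas (HW_CACHELINE_SIZE) volatile int locked;
};

/* Hypothetical lock table; each element starts on its own line.  */
static struct padded_lock lock_table[4];

/* Byte distance between two adjacent slots.  */
static size_t
lock_stride (void)
{
  return (size_t) ((char *) &lock_table[1] - (char *) &lock_table[0]);
}
```

Raising the constant only costs some padding per slot; picking it too
small allows false sharing between locks but does not affect
correctness.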

Is 128 big enough for Thunder?  Honestly, I may well not even have it right for
the processor we have in house.  I didn't bother trying to track down docs to
find out.

 Also I think the default page size for most Linux distros is going to
 be 64k on aarch64 including Redhat Linux so it makes sense not to
 define FIXED_PAGE_SIZE.

Heh.  It turns out these page size defines aren't used any more at all.  During
one of the rewrites we must have deleted the bits that used them.  I'll get rid
of all of them so as to be less confusing.

 I will implement the ILP32 version of this patch once it goes in;
 it needs a few changes in gtm_jmpbuf because long and pointers are
 32-bit but the assembly always stores 64 bits.

I can minimize those changes now by using unsigned long long...


r~



Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Joseph S. Myers
On Wed, 2 Apr 2014, Thomas Preud'homme wrote:

 +   if { [is-effective-target bswap]
 +     && ![istarget x86_64-*-*] } {

That x86_64-*-* test is wrong.  x86_64-*-* and i?86-*-* should always be 
handled the same (if you then want to distinguish 32-bit and 64-bit 
multilibs, you check the appropriate effective-target there, depending on 
whether the condition is one on the ABI or which register size is being 
used, which affects how x32 should be counted).


-- 
Joseph S. Myers
jos...@codesourcery.com


[4.8, PATCH 27/26] Backport Power8 and LE support: Fixes for AIX test failures

2014-04-02 Thread Bill Schmidt
Hi,

This patch (diff-aix) adds to the 4.8 PowerPC backport patch series with
a few backported fixes from trunk that repair test failures on AIX.

Thanks,
Bill


[gcc]

2014-04-02  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline r205308
2013-11-23  David Edelsohn  dje@gmail.com

* config/rs6000/rs6000.c (IN_NAMED_SECTION): New macro.
(rs6000_xcoff_select_section): Place decls with stricter alignment
into named sections.
(rs6000_xcoff_unique_section): Allow unique sections for
uninitialized data with strict alignment.

[gcc/testsuite]

2014-04-02  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-04-05  David Edelsohn  dje@gmail.com

* gcc.target/powerpc/sd-vsx.c: Skip on AIX.
* gcc.target/powerpc/sd-pwr6.c: Same.


Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test2/gcc/config/rs6000/rs6000.c
@@ -29165,10 +29165,23 @@ rs6000_xcoff_asm_named_section (const ch
            name, suffix[smclass], flags & SECTION_ENTSIZE);
 }
 
+#define IN_NAMED_SECTION(DECL) \
+  ((TREE_CODE (DECL) == FUNCTION_DECL || TREE_CODE (DECL) == VAR_DECL) \
+   && DECL_SECTION_NAME (DECL) != NULL_TREE)
+
 static section *
 rs6000_xcoff_select_section (tree decl, int reloc,
-                             unsigned HOST_WIDE_INT align ATTRIBUTE_UNUSED)
+                             unsigned HOST_WIDE_INT align)
 {
+  /* Place variables with alignment stricter than BIGGEST_ALIGNMENT into
+     named section.  */
+  if (align > BIGGEST_ALIGNMENT)
+    {
+      resolve_unique_section (decl, reloc, true);
+      if (IN_NAMED_SECTION (decl))
+        return get_named_section (decl, NULL, reloc);
+    }
+
   if (decl_readonly_section (decl, reloc))
     {
       if (TREE_PUBLIC (decl))
@@ -29206,10 +29219,12 @@ rs6000_xcoff_unique_section (tree decl,
 {
   const char *name;
 
-  /* Use select_section for private and uninitialized data.  */
+  /* Use select_section for private data and uninitialized data with
+     alignment <= BIGGEST_ALIGNMENT.  */
   if (!TREE_PUBLIC (decl)
       || DECL_COMMON (decl)
-      || DECL_INITIAL (decl) == NULL_TREE
+      || (DECL_INITIAL (decl) == NULL_TREE
+          && DECL_ALIGN (decl) <= BIGGEST_ALIGNMENT)
       || DECL_INITIAL (decl) == error_mark_node
       || (flag_zero_initialized_in_bss
           && initializer_zerop (DECL_INITIAL (decl))))
Index: gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
===
--- gcc-4_8-test2.orig/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
+++ gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-ibm-aix* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mcpu=power6 -mhard-dfp" } */
 /* { dg-final { scan-assembler-not   "lfiwzx"   } } */
Index: gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
===
--- gcc-4_8-test2.orig/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
+++ gcc-4_8-test2/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* powerpc-ibm-aix* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mcpu=power7 -mhard-dfp" } */
 /* { dg-final { scan-assembler-times "lfiwzx" 2 } } */




[Patch C++] PR57958 RFC

2014-04-02 Thread Dinar Temirbulatov
Hi,
The following change fixes GIMPLE generation for lambda functions. In the
patch I assumed that constructing a COMPOUND_EXPR for the return value
of an auto-typed function whose type resolves to a CLASS_TYPE_P type is
wrong. Tested on x86_64-pc-linux-gnu by applying to trunk, with no new
regressions.
Thanks, Dinar.


fix1.patch
Description: Binary data


Re: [PATCH] aarch64 suuport for libitm

2014-04-02 Thread pinskia


 On Apr 2, 2014, at 7:37 AM, Richard Henderson r...@redhat.com wrote:
 
 On 04/01/2014 03:41 PM, Andrew Pinski wrote:
 On Tue, Apr 1, 2014 at 3:24 PM, Richard Henderson r...@redhat.com wrote:
 Comments?  If approved, should this go in for 4.9, or wait for stage1?
 Certainly it's self-contained...
 
 On Cavium's thunder processor the cache line size is going to be
 bigger than 64 bytes, what is your solution to improve performance on
 targets like Thunder?
 
 We can expand the number reasonably.  The only thing it controls is layout of
 some of the internal data structures to attempt to put different locks on
 different lines.
 
 Is 128 big enough for Thunder?  Honestly, I may well not even have it right for
 the processor we have in house.  I didn't bother trying to track down docs to
 find out.

Yes 128 should be enough. 

Thanks,
Andrew

 
 Also I think the default page size for most Linux distros is going to
 be 64k on aarch64 including Redhat Linux so it makes sense not to
 define FIXED_PAGE_SIZE.
 
 Heh.  It turns out these page size defines aren't used any more at all.  During
 one of the rewrites we must have deleted the bits that used them.  I'll get rid
 of all of them so as to be less confusing.
 
 I will implement the ILP32 version of this patch once it goes in;
 it needs a few changes in gtm_jmpbuf because long and pointers are
 32-bit but the assembly always stores 64 bits.
 
 I can minimize those changes now by using unsigned long long...
 
 
 r~
 


Re: RFA: Fix PR rtl-optimization/60651

2014-04-02 Thread Joern Rennecke
On 28 March 2014 10:20, Eric Botcazou ebotca...@adacore.com wrote:
 However, the first call is for blocks with incoming abnormal edges.
 If these are empty, the change as I wrote it yesterday is fine, but not
 when they are non-empty; in that case, we should indeed insert before the
 first instruction in that block.

 OK, so the issue is specific to empty basic blocks and boils down to inserting
 instructions in a FIFO manner into them.

Actually, the issue also applies to abnormal edges where lcm did leave a set -
but these are rare, and my last patch should handle these properly in any event,
by no longer using the NOTE_INSN_BASIC_BLOCK itself unless the block is
empty.
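
To make the ordering problem concrete, here is a toy model (plain linked
list, not GCC's insn chain; all toy_* names are invented).  Repeatedly
inserting right after the block head reverses the emission order, while
inserting after the current end of an initially empty block keeps it:

```c
#include <stddef.h>

/* Toy instruction chain; not GCC's rtx representation.  */
struct toy_insn
{
  int id;
  struct toy_insn *next;
};

/* Insert INSN directly after AFTER.  */
static void
toy_insert_after (struct toy_insn *after, struct toy_insn *insn)
{
  insn->next = after->next;
  after->next = insn;
}

/* Return the last element of the chain starting at HEAD.  */
static struct toy_insn *
toy_last (struct toy_insn *head)
{
  while (head->next != NULL)
    head = head->next;
  return head;
}

/* Always inserting after the block head reverses emission order.  */
static void
toy_emit_after_head (struct toy_insn *head, struct toy_insn *insn)
{
  toy_insert_after (head, insn);
}

/* Inserting after the current block end keeps emission order (FIFO).  */
static void
toy_emit_at_end (struct toy_insn *head, struct toy_insn *insn)
{
  toy_insert_after (toy_last (head), insn);
}
```

Emitting insns 1 and 2 through toy_emit_after_head leaves them as 2, 1
behind the head; toy_emit_at_end leaves them as 1, 2, which is the
order a dependent mode set needs.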

 This can be achieved by finding an insert-before position using NEXT_INSN
 on the basic block head; this amounts to the very same insertion place
 as inserting after the basic block head.  Also, we will continue to set no
 location, and use the same bb, because both add_insn_before and
 add_insn_after (in contradiction to its block comment) will infer the basic
 block from the insn given (in the case of add_insn_before, I assume
 that the basic block doesn't start with a BARRIER - that would be invalid -
 and that the insn it starts with has a valid BLOCK_FOR_INSN setting the
 same way the basic block head has).

 This looks reasonable, but I think that we need more commentary because it's
 not straightforward to understand, so I would:

   1. explicitly state that we enforce an order on the entities in addition to
 the order on priority, both in the code (for example create a 4th paragraph in
 the comment at the top of the file, before More details ...) and in the doc
 as you already did, but ordering the two orders for the sake of clarity:
 first the order on priority, then, for the same priority, the order on the
 entities.

Actually, all the patch provides is a partial order, just as I stated.
Providing the strict order you describe would require adding another
loop nesting to the entity/basic block/seginfo loop, and it wouldn't
really be useful for targets.
To order by entity first, then by priority, could be useful for some targets,
so that they can express a dependency chain of mode switching events
to be computed in a single lcm pass without inflating the mode count
(which determines how often we have to invoke the lcm machinery).
However, that would require having separate buckets for each entity for
each  insert_insn_on_edge point.

For epiphany, EPIPHANY_MSW_ENTITY_FPU_OMNIBUS (for -O0) and
EPIPHANY_MSW_ENTITY_ROUND_KNOWN (used when optimizing)
depend on EPIPHANY_MSW_ENTITY_AND, EPIPHANY_MSW_ENTITY_OR and
EPIPHANY_MSW_ENTITY_CONFIG.
The latter three only have two modes, and the former two use the
enum attr_fp_mode values, the first of which is FP_MODE_ROUND_UNKNOWN.
That value does not actually appear as a needed mode for these entities, hence
the partial order is sufficient.

EPIPHANY_MSW_ENTITY_FPU_OMNIBUS also depends on EPIPHANY_MSW_ENTITY_OR.

   2. add a line in the head comment of new_seginfo saying that INSN may not be
 a NOTE_BASIC_BLOCK, unless BB is empty.

   3. add a comment above the trick in optimize_mode_switching saying that it
 is both required to implement the FIFO insertion and valid because we know
 that the basic block was initially empty.

Done.

 It's not clear to me whether this is a regression or not, so you'll also need
 to run it by the RMs.

I don't think it's a regression.
2014-04-02  Joern Rennecke  joern.renne...@embecosm.com

gcc:
PR rtl-optimization/60651
* mode-switching.c (optimize_mode_switching): Make sure to emit
sets of a lower numbered entity before sets of a higher numbered
entity to a mode of the same or lower priority.
(new_seginfo): Document and enforce requirement that
NOTE_INSN_BASIC_BLOCK only appears for empty blocks.
* doc/tm.texi.in: Document ordering constraint for emitted mode sets.
* doc/tm.texi: Regenerate.
gcc/testsuite:
PR rtl-optimization/60651
* gcc.target/epiphany/mode-switch.c: New test.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f7024a7..b8ca17e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9778,6 +9778,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at the point where
 the insn(s) are to be inserted.
+Sets of a lower numbered entity will be emitted before sets of a higher
+numbered entity to a mode of the same or lower priority.
 @end defmac
 
 @node Target Attributes
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 6dcbde4..d793d26 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7447,6 +7447,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at 

Re: [PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Jan Hubicka
 On Wed, 2 Apr 2014, Martin Jambor wrote:
 
  Hi,
  
  recently I've been looking into a number of bugs involving
  symtab_remove_unreachable_nodes in one way or another and I have
  always started by applying the hunk below.  I did this because
  distinguishing different symbol nodes only according to their names is
  just so inconvenient, especially when compiling C++.  The risk is
  minimal and therefore I'd like to propose it to trunk even at this
  late stage, although I can of course wait until the next stage1.
  
  The other hunk is something that I think is also useful when looking
  into all failures of ipcp_verify_propagated_values like e.g. PR 60727.
  
  I included the patch in a recent bootstrap and testing and it of
  course passes.  OK for trunk now?  Or later?
 
 I'll leave the actual changes for review by Honza, it's fine at this
 stage if he things the changes make sense and are consistent.

It seems fine to me...
 
 Thanks,
 Richard.
 
  Thanks,
  
  Martin
  
  
  2014-04-01  Martin Jambor  mjam...@suse.cz
  
  * ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
  mention gcc_unreachable before failing.
  * ipa.c (symtab_remove_unreachable_nodes): Also print order of
  removed symbols.
  
  Index: src/gcc/ipa-cp.c
  ===
  --- src.orig/gcc/ipa-cp.c
  +++ src/gcc/ipa-cp.c
  @@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
         {
           if (dump_file)
             {
  +            dump_symtab (dump_file);
               fprintf (dump_file, "\nIPA lattices after constant "
  -                     "propagation:\n");
  +                     "propagation, before gcc_unreachable:\n");
This means before symtab_remove_unreachable_nodes?

Honza
print_all_lattices (dump_file, true, false);
  }
   
  Index: src/gcc/ipa.c
  ===
  --- src.orig/gcc/ipa.c
  +++ src/gcc/ipa.c
  @@ -469,7 +469,7 @@ symtab_remove_unreachable_nodes (bool be
         if (!node->aux)
           {
             if (file)
  -            fprintf (file, " %s", node->name ());
  +            fprintf (file, " %s/%i", node->name (), node->order);
             cgraph_remove_node (node);
             changed = true;
           }
  @@ -483,7 +483,7 @@ symtab_remove_unreachable_nodes (bool be
             if (node->definition)
               {
                 if (file)
  -                fprintf (file, " %s", node->name ());
  +                fprintf (file, " %s/%i", node->name (), node->order);
                 node->body_removed = true;
                 node->analyzed = false;
                 node->definition = false;
  @@ -531,7 +531,7 @@ symtab_remove_unreachable_nodes (bool be
             && (!flag_ltrans || !DECL_EXTERNAL (vnode->decl)))
           {
             if (file)
  -            fprintf (file, " %s", vnode->name ());
  +            fprintf (file, " %s/%i", vnode->name (), vnode->order);
             varpool_remove_node (vnode);
             changed = true;
           }
  
  
 
 -- 
 Richard Biener rguent...@suse.de
 SUSE / SUSE Labs
 SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
 GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: [PATCH] Simple enhancements to dumping in ipa.c and ipa-cp.c

2014-04-02 Thread Martin Jambor
Hi,

On Wed, Apr 02, 2014 at 06:08:27PM +0200, Jan Hubicka wrote:
  On Wed, 2 Apr 2014, Martin Jambor wrote:
  
   Hi,
   
   recently I've been looking into a number of bugs involving
   symtab_remove_unreachable_nodes in one way or another and I have
   always started by applying the hunk below.  I did this because
   distinguishing different symbol nodes only according to their names is
   just so inconvenient, especially when compiling C++.  The risk is
   minimal and therefore I'd like to propose it to trunk even at this
   late stage, although I can of course wait until the next stage1.
   
   The other hunk is something that I think is also useful when looking
   into all failures of ipcp_verify_propagated_values like e.g. PR 60727.
   
   I included the patch in a recent bootstrap and testing and it of
   course passes.  OK for trunk now?  Or later?
  
  I'll leave the actual changes for review by Honza, it's fine at this
  stage if he things the changes make sense and are consistent.
 
 It seems fine to me...

Thanks, I will commit it shortly then.

  
  Thanks,
  Richard.
  
   Thanks,
   
   Martin
   
   
   2014-04-01  Martin Jambor  mjam...@suse.cz
   
 * ipa-cp.c (ipcp_verify_propagated_values): Also dump symtab and
 mention gcc_unreachable before failing.
 * ipa.c (symtab_remove_unreachable_nodes): Also print order of
 removed symbols.
   
   Index: src/gcc/ipa-cp.c
   ===
   --- src.orig/gcc/ipa-cp.c
   +++ src/gcc/ipa-cp.c
   @@ -884,8 +884,9 @@ ipcp_verify_propagated_values (void)
          {
            if (dump_file)
              {
   +            dump_symtab (dump_file);
                fprintf (dump_file, "\nIPA lattices after constant "
   -                     "propagation:\n");
   +                     "propagation, before gcc_unreachable:\n");
 
 This means before symtab_remove_unreachable_nodes?

No, there is literally a call to gcc_unreachable just below this
dumping.  I added this to grep for it easily when I have a number of
dumps lying around because there is the same string in normal dumps
too.

Thanks,

Martin


Re: RFA: Fix PR rtl-optimization/60651

2014-04-02 Thread Joern Rennecke
Hmm, the sanity check in new_seginfo caused a bootstrap failure
building libjava on x86.
There was a block with CODE_LABEL as basic block head, otherwise empty.


Skip some gcc.target/i386 tests for conflicting -march= options

2014-04-02 Thread Joseph S. Myers
If you test an x86_64 toolchain with -march=bdver3 in the multilib
options, as noted in
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01662.html various test
failures arise from tests whose own -march= in dg-options is
overridden.  This patch adds dg-skip-if to those tests to skip them
for conflicting -march= options, as has been done before for other
tests (obviously, if the option ordering is changed in future in
DejaGnu, such skips may become obsolete or could be conditioned on
DejaGnu version).  (No doubt other -march= options would show up
further tests needing such changes.)

Tested x86_64-linux-gnu.  OK to commit?

2014-04-02  Joseph Myers  jos...@codesourcery.com

* gcc.target/i386/funcspec-2.c, gcc.target/i386/funcspec-3.c,
gcc.target/i386/funcspec-9.c, gcc.target/i386/isa-1.c,
gcc.target/i386/memcpy-strategy-1.c,
gcc.target/i386/memcpy-strategy-2.c,
gcc.target/i386/memcpy-vector_loop-1.c,
gcc.target/i386/memcpy-vector_loop-2.c,
gcc.target/i386/memset-vector_loop-1.c,
gcc.target/i386/memset-vector_loop-2.c,
gcc.target/i386/sse2-init-v2di-2.c, gcc.target/i386/ssetype-1.c,
gcc.target/i386/ssetype-2.c, gcc.target/i386/ssetype-5.c: Skip for
-march= options different from those in dg-options.

Index: gcc/testsuite/gcc.target/i386/memcpy-vector_loop-2.c
===
--- gcc/testsuite/gcc.target/i386/memcpy-vector_loop-2.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/memcpy-vector_loop-2.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=atom" } } */
 /* { dg-options "-O2 -march=atom -minline-all-stringops -mstringop-strategy=vector_loop" } */
 /* { dg-final { scan-assembler-times "movdqa" 4} } */
 
Index: gcc/testsuite/gcc.target/i386/ssetype-1.c
===
--- gcc/testsuite/gcc.target/i386/ssetype-1.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/ssetype-1.c   (working copy)
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* This test checks for absolute memory operands.  */
 /* { dg-require-effective-target nonpic } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -msse2 -march=k8" } */
 /* { dg-final { scan-assembler "andpd\[^\\n\]*magic" } } */
 /* { dg-final { scan-assembler "andnpd\[^\\n\]*magic" } } */
Index: gcc/testsuite/gcc.target/i386/ssetype-5.c
===
--- gcc/testsuite/gcc.target/i386/ssetype-5.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/ssetype-5.c   (working copy)
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* This test checks for absolute memory operands.  */
 /* { dg-require-effective-target nonpic } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -msse2 -march=k8" } */
 /* { dg-final { scan-assembler "pand\[^\\n\]*magic" } } */
 /* { dg-final { scan-assembler "pandn\[^\\n\]*magic" } } */
Index: gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c
===
--- gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/memset-vector_loop-2.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=atom" } } */
 /* { dg-options "-O2 -march=atom -minline-all-stringops -mstringop-strategy=vector_loop" } */
 /* { dg-final { scan-assembler-times "movdqa" 4} } */
 
Index: gcc/testsuite/gcc.target/i386/ssetype-2.c
===
--- gcc/testsuite/gcc.target/i386/ssetype-2.c   (revision 209023)
+++ gcc/testsuite/gcc.target/i386/ssetype-2.c   (working copy)
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -msse2 -march=k8" } */
 /* { dg-final { scan-assembler "andpd" } } */
 /* { dg-final { scan-assembler "andnpd" } } */
Index: gcc/testsuite/gcc.target/i386/funcspec-9.c
===
--- gcc/testsuite/gcc.target/i386/funcspec-9.c  (revision 209023)
+++ gcc/testsuite/gcc.target/i386/funcspec-9.c  (working copy)
@@ -1,5 +1,6 @@
 /* Test whether using target specific options, we can generate FMA4 code.  */
 /* { dg-do compile } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=k8" } } */
 /* { dg-options "-O2 -march=k8 -mfpmath=sse -msse2" } */
 
 extern void exit (int);
Index: gcc/testsuite/gcc.target/i386/funcspec-2.c
===
--- gcc/testsuite/gcc.target/i386/funcspec-2.c  (revision 209023)
+++ gcc/testsuite/gcc.target/i386/funcspec-2.c  (working copy)
@@ -1,5 +1,6 @@
 /* Test whether using target specific options, we 

Re: [PATCH] [ARM] [RFC] Fix longstanding push_minipool_fix ICE (PR49423, lp1296601)

2014-04-02 Thread Charles Baylis
On 2 April 2014 14:29, Charles Baylis charles.bay...@linaro.org wrote:
 Tested on arm-unknown-linux-gnueabihf (qemu), bootstrap in progress.

bootstrapped successfully on a Chromebook arm-unknown-linux-gnueabihf.


Re: RFA: Fix PR rtl-optimization/60651

2014-04-02 Thread Joern Rennecke
On 2 April 2014 17:34, Joern Rennecke joern.renne...@embecosm.com wrote:
 Hmm, the sanity check in new_seginfo caused a bootstrap failure
 building libjava on x86.
 There was a block with CODE_LABEL as basic block head, otherwise empty.

I've added the testcase - and a bit more detail on this issue - in the PR.

I've attached an updated patch, which skips past the CODE_LABEL.
And this one bootstraps on i686-pc-linux-gnu.
2014-04-02  Joern Rennecke  joern.renne...@embecosm.com

gcc:
PR rtl-optimization/60651
* mode-switching.c (optimize_mode_switching): Make sure to emit
sets of a lower numbered entity before sets of a higher numbered
entity to a mode of the same or lower priority.
When creating a seginfo for a basic block that starts with a code
label, move the insertion point past the code label.
(new_seginfo): Document and enforce requirement that
NOTE_INSN_BASIC_BLOCK only appears for empty blocks.
* doc/tm.texi.in: Document ordering constraint for emitted mode sets.
* doc/tm.texi: Regenerate.
gcc/testsuite:
PR rtl-optimization/60651
* gcc.target/epiphany/mode-switch.c: New test.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f7024a7..b8ca17e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -9778,6 +9778,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at the point where
 the insn(s) are to be inserted.
+Sets of a lower numbered entity will be emitted before sets of a higher
+numbered entity to a mode of the same or lower priority.
 @end defmac
 
 @node Target Attributes
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 6dcbde4..d793d26 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7447,6 +7447,8 @@ for @var{entity}.  For any fixed @var{entity}, @code{mode_priority_to_mode}
 Generate one or more insns to set @var{entity} to @var{mode}.
 @var{hard_reg_live} is the set of hard registers live at the point where
 the insn(s) are to be inserted.
+Sets of a lower numbered entity will be emitted before sets of a higher
+numbered entity to a mode of the same or lower priority.
 @end defmac
 
 @node Target Attributes
diff --git a/gcc/mode-switching.c b/gcc/mode-switching.c
index 88543b2..088156c 100644
--- a/gcc/mode-switching.c
+++ b/gcc/mode-switching.c
@@ -96,12 +96,18 @@ static void make_preds_opaque (basic_block, int);
 
 
 /* This function will allocate a new BBINFO structure, initialized
-   with the MODE, INSN, and basic block BB parameters.  */
+   with the MODE, INSN, and basic block BB parameters.
+   INSN may not be a NOTE_INSN_BASIC_BLOCK, unless it is an empty
+   basic block; that allows us later to insert instructions in a FIFO-like
+   manner.  */
 
 static struct seginfo *
 new_seginfo (int mode, rtx insn, int bb, HARD_REG_SET regs_live)
 {
   struct seginfo *ptr;
+
+  gcc_assert (!NOTE_INSN_BASIC_BLOCK_P (insn)
+ || insn == BB_END (NOTE_BASIC_BLOCK (insn)));
   ptr = XNEW (struct seginfo);
   ptr->mode = mode;
   ptr->insn_ptr = insn;
@@ -534,7 +540,13 @@ optimize_mode_switching (void)
break;
if (e)
  {
-   ptr = new_seginfo (no_mode, BB_HEAD (bb), bb->index, live_now);
+   rtx ins_pos = BB_HEAD (bb);
+   if (LABEL_P (ins_pos))
+     ins_pos = NEXT_INSN (ins_pos);
+   gcc_assert (NOTE_INSN_BASIC_BLOCK_P (ins_pos));
+   if (ins_pos != BB_END (bb))
+     ins_pos = NEXT_INSN (ins_pos);
+   ptr = new_seginfo (no_mode, ins_pos, bb->index, live_now);
    add_seginfo (info + bb->index, ptr);
    bitmap_clear_bit (transp[bb->index], j);
  }
@@ -733,7 +745,15 @@ optimize_mode_switching (void)
{
  emitted = true;
   if (NOTE_INSN_BASIC_BLOCK_P (ptr->insn_ptr))
-   emit_insn_after (mode_set, ptr->insn_ptr);
+   /* We need to emit the insns in a FIFO-like manner,
+      i.e. the first to be emitted at our insertion
+      point ends up first in the instruction stream.
+      Because we made sure that NOTE_INSN_BASIC_BLOCK is
+      only used for initially empty basic blocks, we
+      can achieve this by appending at the end of
+      the block.  */
+   emit_insn_after
+     (mode_set, BB_END (NOTE_BASIC_BLOCK (ptr->insn_ptr)));
   else
     emit_insn_before (mode_set, ptr->insn_ptr);
}
--- /dev/null   2014-03-19 18:18:19.244212660 +
+++ b/gcc/testsuite/gcc.target/epiphany/mode-switch.c   2014-03-25 13:31:41.186140611 +
@@ -0,0 +1,12 @@
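
The ordering concern in the patch comment (insertions at one point must come out first-in, first-out) can be illustrated outside GCC. The following is a minimal sketch that assumes a toy singly linked list instead of GCC's insn chain; the names `node`, `insert_after`, and `demo_order` are hypothetical and do not exist in GCC. Inserting repeatedly after the same anchor reverses the order, whereas inserting after the current last node preserves it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an insn chain; this is not GCC's rtx representation.  */
struct node
{
  int id;
  struct node *next;
};

/* Link N directly after POS and return N.  */
struct node *
insert_after (struct node *pos, struct node *n)
{
  assert (pos != NULL && n != NULL);
  n->next = pos->next;
  pos->next = n;
  return n;
}

/* Insert ids 1, 2, 3 into a chain that initially holds only an anchor.
   If APPEND_AT_END is zero, each node goes right after the anchor
   (the result is in reverse order); otherwise each node goes after the
   current last node (the result is in insertion order).  The resulting
   ids are written to OUT[0..2].  */
void
demo_order (int append_at_end, int out[3])
{
  struct node anchor = { 0, NULL };
  struct node nodes[3] = { { 1, NULL }, { 2, NULL }, { 3, NULL } };
  struct node *end = &anchor;
  struct node *p;
  int i;

  for (i = 0; i < 3; i++)
    {
      struct node *pos = append_at_end ? end : &anchor;
      struct node *n = insert_after (pos, &nodes[i]);
      /* Only an insertion after the last node creates a new last node.  */
      if (pos == end)
        end = n;
    }

  for (i = 0, p = anchor.next; i < 3 && p != NULL; i++, p = p->next)
    out[i] = p->id;
}
```

Calling `demo_order (0, out)` models repeated insertion after a fixed note, and `demo_order (1, out)` models appending at the end of an initially empty block, which is the approach the patch comment describes.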

[PATCH, libitm] Remove unused PAGE_SIZE macros

2014-04-02 Thread Richard Henderson
As recently pointed out in a thread porting libitm to aarch64, the PAGE_SIZE
and FIXED_PAGE_SIZE macros are unused.  Indeed, not all of the ports actually
defined them at all.

Removed, lest they cause further confusion.


r~
* config/alpha/target.h (PAGE_SIZE, FIXED_PAGE_SIZE): Remove.
* config/arm/target.h, config/sh/target.h: Likewise.
* config/sparc/target.h, config/x86/target.h: Likewise.


diff --git a/libitm/config/alpha/target.h b/libitm/config/alpha/target.h
index 5e23c53..e33f1e1 100644
--- a/libitm/config/alpha/target.h
+++ b/libitm/config/alpha/target.h
@@ -32,10 +32,6 @@ typedef struct gtm_jmpbuf
   unsigned long f[8];
 } gtm_jmpbuf;
 
-/* Alpha generally uses a fixed page size of 8K.  */
-#define PAGE_SIZE  8192
-#define FIXED_PAGE_SIZE 1
-
 /* The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 64
 
diff --git a/libitm/config/arm/target.h b/libitm/config/arm/target.h
index 6a1458e..a909e14 100644
--- a/libitm/config/arm/target.h
+++ b/libitm/config/arm/target.h
@@ -33,10 +33,6 @@ typedef struct gtm_jmpbuf
   unsigned long pc;
 } gtm_jmpbuf;
 
-/* ARM generally uses a fixed page size of 4K.  */
-#define PAGE_SIZE  4096
-#define FIXED_PAGE_SIZE 1
-
 /* ??? The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 64
 
diff --git a/libitm/config/sh/target.h b/libitm/config/sh/target.h
index 6f6ae5f..fbc804c 100644
--- a/libitm/config/sh/target.h
+++ b/libitm/config/sh/target.h
@@ -35,10 +35,6 @@ typedef struct gtm_jmpbuf
 #endif
 } gtm_jmpbuf;
 
-/* SH generally uses a fixed page size of 4K.  */
-#define PAGE_SIZE  4096
-#define FIXED_PAGE_SIZE 1
-
 /* ??? The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 32
 
diff --git a/libitm/config/sparc/target.h b/libitm/config/sparc/target.h
index b127fa4..309dac1 100644
--- a/libitm/config/sparc/target.h
+++ b/libitm/config/sparc/target.h
@@ -29,10 +29,6 @@ typedef struct gtm_jmpbuf
   unsigned long pc;
 } gtm_jmpbuf;
 
-/* UltraSPARC processors generally use a fixed page size of 8K.  */
-#define PAGE_SIZE  8192
-#define FIXED_PAGE_SIZE 1
-
 /* The size of one line in hardware caches (in bytes).  We use the primary
cache line size documented for the UltraSPARC T1/T2.  */
 #define HW_CACHELINE_SIZE 16
diff --git a/libitm/config/x86/target.h b/libitm/config/x86/target.h
index 392db48..78a58e7 100644
--- a/libitm/config/x86/target.h
+++ b/libitm/config/x86/target.h
@@ -52,10 +52,6 @@ typedef struct gtm_jmpbuf
 /* x86 doesn't require strict alignment for the basic types.  */
 #define STRICT_ALIGNMENT 0
 
-/* x86 uses a fixed page size of 4K.  */
-#define PAGE_SIZE   4096
-#define FIXED_PAGE_SIZE 1
-
 /* The size of one line in hardware caches (in bytes). */
 #define HW_CACHELINE_SIZE 64
 


[commit, spu] Fix regression (ICE) in g++.dg/torture/pr57499.C

2014-04-02 Thread Ulrich Weigand
Hello,

this fixes the following testsuite regression on spu-elf:
FAIL: g++.dg/torture/pr57499.C  -O1  (internal compiler error)

which was caused by a code path in pad_bb that would simply crash
if the very last active insn in a function happened to be a
blockage.

Tested on spu-elf, committed to mainline.

Bye,
Ulrich


ChangeLog:

* config/spu/spu.c (pad_bb): Do not crash when the last
insn is CODE_FOR_blockage.

Index: gcc/config/spu/spu.c
===
*** gcc/config/spu/spu.c(revision 208964)
--- gcc/config/spu/spu.c(working copy)
*** pad_bb(void)
*** 2064,2070 
}
  hbr_insn = insn;
}
!   if (INSN_CODE (insn) == CODE_FOR_blockage)
{
  if (GET_MODE (insn) == TImode)
PUT_MODE (next_insn, TImode);
--- 2064,2070 
}
  hbr_insn = insn;
}
!   if (INSN_CODE (insn) == CODE_FOR_blockage && next_insn)
{
  if (GET_MODE (insn) == TImode)
PUT_MODE (next_insn, TImode);
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



[commit, spu] Fix regression (ICE) in gcc.dg/pr48335-2.c

2014-04-02 Thread Ulrich Weigand
Hello,

this fixes the following regressions on spu-elf:
FAIL: gcc.dg/pr48335-2.c (internal compiler error)
FAIL: gcc.dg/pr48335-3.c (internal compiler error)

which are caused by common code calling the insv pattern with a
combination of bitoffset/bitsize that lies partially outside the
underlying target mode, causing an assertion failure in
spu_expand_insv.

The original reason for the bad offset is that the test case
actually has undefined behavior due to storing partially outside
a struct via a misaligned pointer.

Still, the compiler should not ICE, so I've fixed this similarly
to what was done on s390, by just rejecting this case in the insv
expander and falling back to common code.
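
The shape of that check can be sketched in plain C. This is only an illustration under stated assumptions: a 32-bit container, bit 0 taken as the least significant bit, and a hypothetical helper `insert_bits` that is not part of GCC or the SPU back end. A request whose offset plus size runs past the container is rejected and left to the caller's fallback path.

```c
#include <assert.h>
#include <stdint.h>

#define CONTAINER_BITS 32

/* Insert the low SIZE bits of VALUE into *WORD starting at bit OFFSET
   (bit 0 is the least significant bit).  Returns 1 on success, or 0
   without touching *WORD when the field is empty or does not fit
   entirely inside the container; the caller is then expected to use
   some other strategy.  */
int
insert_bits (uint32_t *word, unsigned offset, unsigned size, uint32_t value)
{
  uint32_t mask;

  assert (word != 0);
  if (size == 0 || offset >= CONTAINER_BITS || size > CONTAINER_BITS - offset)
    return 0;

  /* Build a SIZE-bit mask without shifting by the full width, which
     would be undefined behavior.  */
  mask = (size == CONTAINER_BITS) ? UINT32_MAX : ((UINT32_C (1) << size) - 1);
  *word = (*word & ~(mask << offset)) | ((value & mask) << offset);
  return 1;
}
```

Returning a failure code here plays the role that FAIL plays in the expander: the fast path declines the request instead of asserting.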

Tested on spu-elf, committed to mainline.

Bye,
Ulrich

ChangeLog:

* config/spu/spu.md (insv): Fail if bitoffset+bitsize
lies outside the target mode.

Index: gcc/config/spu/spu.md
===
*** gcc/config/spu/spu.md   (revision 208964)
--- gcc/config/spu/spu.md   (working copy)
***
*** 2851,2857 
  (match_operand:SI 2 "const_int_operand" ""))
(match_operand 3 "nonmemory_operand" ""))]
  ""
!   { spu_expand_insv(operands); DONE; })
  
  ;; Simplify a number of patterns that get generated by extv, extzv,
  ;; insv, and loads.
--- 2851,2863 
  (match_operand:SI 2 "const_int_operand" ""))
(match_operand 3 "nonmemory_operand" ""))]
  ""
!   {
! if (INTVAL (operands[1]) + INTVAL (operands[2])
!     > GET_MODE_BITSIZE (GET_MODE (operands[0])))
!   FAIL;
! spu_expand_insv(operands);
! DONE;
!   })
  
  ;; Simplify a number of patterns that get generated by extv, extzv,
  ;; insv, and loads.
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [patch] Fix texinfo warnings for doc/gcc.texi [was: Re: doc bugs]

2014-04-02 Thread Tobias Burnus

*PING*

Tobias Burnus wrote:

H.J. Lu wrote:
On Fri, Mar 28, 2014 at 12:41 PM, Mike Stump mikest...@comcast.net 
wrote:

Since we are nearing release, I thought I'd mention I see:
../../gcc/gcc/doc/invoke.texi:1114: warning: node next `Overall 
Options' in menu `C Dialect Options' and in sectioning `Invoking 
G++' differ

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59055


I think one reason that there are (and were) so many warnings is 
that texinfo only recently gained support for diagnosing these issues. 
(Or maybe not that recently, but distributions were slow in adopting 
newer texinfo versions.)


Attached is a warning-removal patch.
OK for the trunk?

Regarding invoke.texi: It had (nearly) the same @menu twice, once 
under @chapter, where it belongs, and once under a @section, where it 
doesn't.


Tobias




RFA: PATCH to add -fno-gnu-unique for c++/60731

2014-04-02 Thread Jason Merrill
Use of STB_GNU_UNIQUE to avoid problems with variable symbols shared 
between two RTLD_LOCAL plugins and a common library dependency causes 
problems with libraries that depend on dlclose/dlopen to reinitialize 
state.  This patch adds a -fno-gnu-unique flag that such libraries can use.


Tested x86_64-pc-linux-gnu.  OK for trunk?
commit e9f123743831274cff1c135cf65bb222507bab32
Author: Jason Merrill ja...@redhat.com
Date:   Wed Apr 2 15:10:32 2014 -0400

	PR c++/60731
	* common.opt (-fno-gnu-unique): Add.
	* config/elfos.h (USE_GNU_UNIQUE_OBJECT): Check it.

diff --git a/gcc/common.opt b/gcc/common.opt
index 62c72f0..2259f29 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1267,6 +1267,10 @@ fgnu-tm
 Common Report Var(flag_tm)
 Enable support for GNU transactional memory
 
+fgnu-unique
+Common Report Var(flag_gnu_unique) Init(1)
+Use STB_GNU_UNIQUE if supported by the assembler
+
 floop-flatten
 Common Ignore
 Does nothing. Preserved for backward compatibility.
diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
index 1fce701..c1d5553 100644
--- a/gcc/config/elfos.h
+++ b/gcc/config/elfos.h
@@ -287,7 +287,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 /* Write the extra assembler code needed to declare an object properly.  */
 
 #ifdef HAVE_GAS_GNU_UNIQUE_OBJECT
-#define USE_GNU_UNIQUE_OBJECT 1
+#define USE_GNU_UNIQUE_OBJECT flag_gnu_unique
 #else
 #define USE_GNU_UNIQUE_OBJECT 0
 #endif
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index eca4e8f..2e78b8b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1070,6 +1070,7 @@ See S/390 and zSeries Options.
 -ffixed-@var{reg}  -fexceptions @gol
 -fnon-call-exceptions  -fdelete-dead-exceptions  -funwind-tables @gol
 -fasynchronous-unwind-tables @gol
+-fno-gnu-unique @gol
 -finhibit-size-directive  -finstrument-functions @gol
 -finstrument-functions-exclude-function-list=@var{sym},@var{sym},@dots{} @gol
 -finstrument-functions-exclude-file-list=@var{file},@var{file},@dots{} @gol
@@ -22015,6 +22016,20 @@ Generate unwind table in DWARF 2 format, if supported by target machine.  The
 table is exact at each instruction boundary, so it can be used for stack
 unwinding from asynchronous events (such as debugger or garbage collector).
 
+@item -fno-gnu-unique
+@opindex fno-gnu-unique
+On systems with recent GNU assembler and C library, the C++ compiler
+uses the @code{STB_GNU_UNIQUE} binding to make sure that definitions
+of template static data members and static local variables in inline
+functions are unique even in the presence of @code{RTLD_LOCAL}; this
+is necessary to avoid problems with a library used by two different
+@code{RTLD_LOCAL} plugins depending on a definition in one of them and
+therefore disagreeing with the other one about the binding of the
+symbol.  But this causes @code{dlclose} to be ignored for affected
+DSOs; if your program relies on reinitialization of a DSO via
+@code{dlclose} and @code{dlopen}, you can use
+@option{-fno-gnu-unique}.
+
 @item -fpcc-struct-return
 @opindex fpcc-struct-return
 Return ``short'' @code{struct} and @code{union} values in memory like


Re: Skip some gcc.target/i386 tests for conflicting -march= options

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 6:36 PM, Joseph S. Myers jos...@codesourcery.com wrote:

 If you test an x86_64 toolchain with -march=bdver3 in the multilib
 options, as noted in
 http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01662.html various test
 failures arise from tests whose own -march= in dg-options is
 overridden.  This patch adds dg-skip-if to those tests to skip them
 for conflicting -march= options, as has been done before for other
 tests (obviously, if the option ordering is changed in future in
 DejaGnu, such skips may become obsolete or could be conditioned on
 DejaGnu version).  (No doubt other -march= options would show up
 further tests needing such changes.)

 Tested x86_64-linux-gnu.  OK to commit?

 2014-04-02  Joseph Myers  jos...@codesourcery.com

 * gcc.target/i386/funcspec-2.c, gcc.target/i386/funcspec-3.c,
 gcc.target/i386/funcspec-9.c, gcc.target/i386/isa-1.c,
 gcc.target/i386/memcpy-strategy-1.c,
 gcc.target/i386/memcpy-strategy-2.c,
 gcc.target/i386/memcpy-vector_loop-1.c,
 gcc.target/i386/memcpy-vector_loop-2.c,
 gcc.target/i386/memset-vector_loop-1.c,
 gcc.target/i386/memset-vector_loop-2.c,
 gcc.target/i386/sse2-init-v2di-2.c, gcc.target/i386/ssetype-1.c,
 gcc.target/i386/ssetype-2.c, gcc.target/i386/ssetype-5.c: Skip for
 -march= options different from those in dg-options.

OK.

Thanks,
Uros.


Use -mno-prefer-avx128 in two more tests

2014-04-02 Thread Joseph S. Myers
Two of the tests I noted in
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00036.html as not being
fixed for --with-arch=bdver3 --with-cpu=bdver3 by adding
-mno-prefer-avx128 in fact also show failures for --with-arch=btver2
--with-tune=btver2, and in that case *are* fixed by adding
-mno-prefer-avx128.  Thus, while in those cases there may still be
other tuning issues as noted in
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00052.html (btver2
doesn't enable the flag in question), I think it *is* correct to use
-mno-prefer-avx128 for these two tests, and this patch adds it.

Tested x86_64-linux-gnu.  OK to commit?

2014-04-02  Joseph Myers  jos...@codesourcery.com

* gcc.target/i386/avx2-vpand-3.c,
gcc.target/i386/avx256-unaligned-load-2.c: Use -mno-prefer-avx128.

Index: gcc/testsuite/gcc.target/i386/avx2-vpand-3.c
===
--- gcc/testsuite/gcc.target/i386/avx2-vpand-3.c(revision 209023)
+++ gcc/testsuite/gcc.target/i386/avx2-vpand-3.c(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-mavx2 -O2 -ftree-vectorize -save-temps" } */
+/* { dg-options "-mavx2 -mno-prefer-avx128 -O2 -ftree-vectorize -save-temps" } */
 /* { dg-require-effective-target avx2 } */
 
 
Index: gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c
===
--- gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c (revision 209023)
+++ gcc/testsuite/gcc.target/i386/avx256-unaligned-load-2.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { ! ia32 } } } */
-/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-load" } */
+/* { dg-options "-O3 -dp -mavx -mavx256-split-unaligned-load -mno-prefer-avx128" } */
 
 void
 avx_test (char **cp, char **ep)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Use -mno-prefer-avx128 in two more tests

2014-04-02 Thread Uros Bizjak
On Wed, Apr 2, 2014 at 10:09 PM, Joseph S. Myers
jos...@codesourcery.com wrote:

 Two of the tests I noted in
 http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00036.html did not get
 fixed for --with-arch=bdver3 --with-cpu=bdver3 by adding
 -mno-prefer-avx128 in fact also show failures for --with-arch=btver2
 --with-tune=btver2, and in that case *are* fixed by adding
 -mno-prefer-avx128.  Thus, while in those cases there may still be
 other tuning issues as noted in
 http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00052.html (btver2
 doesn't enable the flag in question) I think it *is* correct to use
 -mno-prefer-avx128 for these two tests, and this patch adds it.

 Tested x86_64-linux-gnu.  OK to commit?

 2014-04-02  Joseph Myers  jos...@codesourcery.com

 * gcc.target/i386/avx2-vpand-3.c,
 gcc.target/i386/avx256-unaligned-load-2.c: Use -mno-prefer-avx128.

OK.

Thanks,
Uros.


Re: RFA: RL78: Fix handling of (SUBREG (SYMBOL_REF))

2014-04-02 Thread DJ Delorie

This is OK.  Thanks!


Re: [C++ patch] for C++/52369

2014-04-02 Thread Fabien Chêne
2014-03-31 23:48 GMT+02:00 Jason Merrill ja...@redhat.com:
[...]
 if (permerror (input_location,
                "default argument given for parameter "
                "%d of %q#D", i, newdecl))
   permerror (DECL_SOURCE_LOCATION (olddecl),
              "previous specification in %q#D here",
              olddecl);

 should the second permerror be a note instead ?


 Yes.

OK to commit the attached patch ?
Tested x86_64 linux, though this piece of code does not seem to be
covered by the testsuite.
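
The pattern under discussion (emit the follow-up note only if the primary diagnostic was actually emitted) can be sketched in isolation. The functions below are hypothetical stand-ins, not GCC's `permerror` and `inform`; the only property assumed from the thread is that the first call reports whether it emitted anything.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for a compiler's diagnostic machinery; these
   are not GCC's permerror and inform.  */
static int permissive_errors_suppressed;
static int notes_emitted;

/* Report a "permissive" error.  Returns 1 if the diagnostic was
   actually emitted, 0 if it was suppressed.  */
int
report_permissive_error (const char *msg)
{
  assert (msg != NULL);
  if (permissive_errors_suppressed)
    return 0;
  fprintf (stderr, "error: %s\n", msg);
  return 1;
}

/* Emit a supplementary note attached to an earlier diagnostic.  */
void
report_note (const char *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "note: %s\n", msg);
  notes_emitted++;
}

/* Diagnose a repeated default argument: the note pointing at the
   previous declaration is emitted only when the primary diagnostic
   was.  Returns the number of notes emitted by this call.  */
int
diagnose_duplicate_default (int suppress)
{
  int before = notes_emitted;

  permissive_errors_suppressed = suppress;
  if (report_permissive_error ("default argument given for parameter"))
    report_note ("previous specification here");
  return notes_emitted - before;
}
```

Guarding the note this way avoids an orphaned "previous specification" message when the primary diagnostic is suppressed.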

2014-04-02  Fabien Chêne  fab...@gcc.gnu.org

* cp/decl.c (duplicate_decls): Check for the return of
permerror before emitting a note.

-- 
Fabien
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(revision 208997)
+++ gcc/cp/decl.c	(working copy)
@@ -1737,9 +1737,9 @@ duplicate_decls (tree newdecl, tree oldd
 			if (permerror (input_location,
 				       "default argument given for parameter "
 				       "%d of %q#D", i, newdecl))
-			  permerror (DECL_SOURCE_LOCATION (olddecl),
-				     "previous specification in %q#D here",
-				     olddecl);
+			  inform (DECL_SOURCE_LOCATION (olddecl),
+				  "previous specification in %q#D here",
+				  olddecl);
 		  }
 		else
 		  {


[BUILD] Ping for Jakub's --with-build-config=bootstrap-asan / bootstrap-ubsan patches

2014-04-02 Thread Tobias Burnus
I would like to ping the following two patches of Jakub. As he wrote in 
PR60667:


The http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01370.html fix is still 
waiting for review, you need that for both 
--with-build-config=bootstrap-ubsan and --with-build-config=bootstrap-asan.


For --with-build-config=bootstrap-asan, the 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01433.html patch is also needed, 
plus --with-build-config=bootstrap-asan will only work with 
--disable-werror for now (fix for that expected only in stage1).



Tobias


[Patch, fortran] PR60717 - Wrong code with recursive procedure with unlimited polymorphic dummy argument

2014-04-02 Thread Paul Richard Thomas
Dear All,

This fix, of itself, is quite obvious.  The offset was being set to
zero for array segments, rather than that required for unity valued
lvalues.
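
To make the role of the offset concrete, here is a minimal sketch of descriptor-style addressing. It assumes a simplified rank-1 descriptor whose layout is illustrative only (it is not libgfortran's actual descriptor type), and it reads the sentence above as referring to a lower bound of one; the names `desc1`, `offset_for_lbound`, `desc1_get`, and `make_section` are hypothetical. An element is addressed as `data[offset + index * stride]`, so a section presented with lower bound 1 needs an offset of `-stride`, not zero.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified rank-1 array descriptor.  The layout is illustrative
   only and is not libgfortran's actual descriptor type.  */
struct desc1
{
  int *data;         /* address of the first element of the section */
  ptrdiff_t offset;  /* added to index * stride when addressing */
  ptrdiff_t lbound;  /* lower bound seen by the callee */
  ptrdiff_t stride;  /* distance between elements, in elements */
};

/* Offset that makes index LBOUND refer to data[0].  */
ptrdiff_t
offset_for_lbound (ptrdiff_t lbound, ptrdiff_t stride)
{
  return -lbound * stride;
}

/* Fetch element INDEX using the descriptor's addressing formula.  */
int
desc1_get (const struct desc1 *d, ptrdiff_t index)
{
  assert (d != NULL);
  return d->data[d->offset + index * d->stride];
}

/* Describe the section starting at BASE[START] with lower bound 1 and
   unit stride.  If USE_ZERO_OFFSET is nonzero the offset is left at
   zero, which makes index 1 land one element too far.  */
struct desc1
make_section (int *base, ptrdiff_t start, int use_zero_offset)
{
  struct desc1 d;

  d.data = base + start;
  d.lbound = 1;
  d.stride = 1;
  d.offset = use_zero_offset ? 0 : offset_for_lbound (d.lbound, d.stride);
  return d;
}
```

With a zero offset, a callee that indexes from 1 reads the element after the one it meant, which is the kind of wrong-code symptom a stale offset would produce.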

I think that the fix could be used to clean up:

trans-expr.c(gfc_trans_alloc_subarray_assign)
trans-expr.c(gfc_trans_pointer_assign)
trans-expr.c(fncall_realloc_result)
trans-array.c(trans_associate_var)

each of which contains calculation of the offset. However, I do not
think that this is the stage to fix things that are not broken!

I propose to keep the PR open as a reminder to look into this.

Bootstrapped and regtested on X86_64/FC17 - OK for trunk and backporting to 4.8?

Paul

 2014-04-12  Paul Thomas  pa...@gcc.gnu.org

PR fortran/58771
* trans.h : Add 'use_offset' bitfield to gfc_se.
* trans-array.c (gfc_conv_expr_descriptor) : Use 'use_offset'
as a trigger to unconditionally recalculate the offset.
* trans-expr.c (gfc_conv_intrinsic_to_class) : Use it.
(gfc_conv_procedure_call) : Ditto.

2014-04-02  Paul Thomas  pa...@gcc.gnu.org

PR fortran/58771
* gfortran.dg/unlimited_polymorphic_17.f90 : New test
Index: gcc/fortran/trans-array.c
===
*** gcc/fortran/trans-array.c   (revision 208997)
--- gcc/fortran/trans-array.c   (working copy)
*** gfc_conv_expr_descriptor (gfc_se *se, gf
*** 6807,6813 
  
/* Set offset for assignments to pointer only to zero if it is not
   the full array.  */
!   if (se->direct_byref
   && info->ref && info->ref->u.ar.type != AR_FULL)
base = gfc_index_zero_node;
else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
--- 6807,6813 
  
/* Set offset for assignments to pointer only to zero if it is not
   the full array.  */
!   if ((se->direct_byref || se->use_offset)
   && info->ref && info->ref->u.ar.type != AR_FULL)
base = gfc_index_zero_node;
else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
*** gfc_conv_expr_descriptor (gfc_se *se, gf
*** 6899,6905 
  base = fold_build2_loc (input_location, MINUS_EXPR,
  TREE_TYPE (base), base, stride);
}
! else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
{
  tmp = gfc_conv_array_lbound (desc, n);
  tmp = fold_build2_loc (input_location, MINUS_EXPR,
--- 6899,6905 
  base = fold_build2_loc (input_location, MINUS_EXPR,
  TREE_TYPE (base), base, stride);
}
! else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)) || se->use_offset)
{
  tmp = gfc_conv_array_lbound (desc, n);
  tmp = fold_build2_loc (input_location, MINUS_EXPR,
*** gfc_conv_expr_descriptor (gfc_se *se, gf
*** 6935,6942 
gfc_get_dataptr_offset (&loop.pre, parm, desc, offset,
subref_array_target, expr);
  
!   if ((se->direct_byref || GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
!   && !se->data_not_needed)
{
  /* Set the offset.  */
  gfc_conv_descriptor_offset_set (&loop.pre, parm, base);
--- 6935,6943 
gfc_get_dataptr_offset (&loop.pre, parm, desc, offset,
subref_array_target, expr);
  
!   if (((se->direct_byref || GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
!    && !se->data_not_needed)
+ || (se->use_offset && base != NULL_TREE))
{
  /* Set the offset.  */
  gfc_conv_descriptor_offset_set (&loop.pre, parm, base);
Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c(revision 208997)
--- gcc/fortran/trans-expr.c(working copy)
*** gfc_conv_intrinsic_to_class (gfc_se *par
*** 593,598 
--- 593,599 
else
{
  parmse->ss = ss;
+ parmse->use_offset = 1;
  gfc_conv_expr_descriptor (parmse, e);
  gfc_add_modify (&parmse->pre, ctree, parmse->expr);
}
*** gfc_conv_procedure_call (gfc_se * se, gf
*** 4378,4383 
--- 4379,4385 
|| CLASS_DATA (fsym)-attr.codimension))
{
  /* Pass a class array.  */
+ parmse.use_offset = 1;
  gfc_conv_expr_descriptor (&parmse, e);
  
  /* If an ALLOCATABLE dummy argument has INTENT(OUT) and is
Index: gcc/fortran/trans.h
===
*** gcc/fortran/trans.h (revision 208997)
--- gcc/fortran/trans.h (working copy)
*** typedef struct gfc_se
*** 87,92 
--- 87,96 
   args alias.  */
unsigned force_tmp:1;
  
+   /* Unconditionally calculate offset for array segments in
+   gfc_conv_expr_descriptor.  */
+   unsigned use_offset:1;
+ 
unsigned want_coarray:1;
  
/* Scalarization parameters.  */
Index: 

Re: [C++ patch] for C++/52369

2014-04-02 Thread Jason Merrill

On 04/02/2014 04:21 PM, Fabien Chêne wrote:

 * cp/decl.c (duplicate_decls): Check for the return of
 permerror before emitting a note.


You don't need cp/ within cp/ChangeLog.  OK with that change.

Jason



one more patch to fix PR60650

2014-04-02 Thread Vladimir Makarov

  The following patch fixes the PR for a new set of options.

The details of the problem can be found at

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60650

  The patch affects a sensitive part of LRA.  Therefore I bootstrapped 
and tested it on x86-64, aarch64, arm, s390, and ppc64.  The results 
look ok.


  x86/x86-64 SPEC2000 testing shows no visible effect on performance 
and code size.


  Committed as rev. 209038.

2014-04-02  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/60650
* lra-constraints.c (process_alt_operands): Decrease reject for
earlyclobber matching.

2014-04-02  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/60650
* gcc.target/arm/pr60650-2.c: New.
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 208989)
+++ lra-constraints.c   (working copy)
@@ -1747,12 +1747,27 @@ process_alt_operands (int only_alternati
  [GET_MODE (*curr_id->operand_loc[m])]);
  }
 
-   /* We prefer no matching alternatives because
-  it gives more freedom in RA.  */
-   if (operand_reg[nop] == NULL_RTX
-   || (find_regno_note (curr_insn, REG_DEAD,
-REGNO (operand_reg[nop]))
-== NULL_RTX))
+   /* Prefer matching earlyclobber alternative as
+  it results in less hard regs required for
+  the insn than a non-matching earlyclobber
+  alternative.  */
+   if (curr_static_id->operand[m].early_clobber)
+ {
+   if (lra_dump_file != NULL)
+ fprintf
+   (lra_dump_file,
+"            %d Matching earlyclobber alt:"
+" reject--\n",
+nop);
+   reject--;
+ }
+   /* Otherwise we prefer no matching
+  alternatives because it gives more freedom
+  in RA.  */
+   else if (operand_reg[nop] == NULL_RTX
+|| (find_regno_note (curr_insn, REG_DEAD,
+ REGNO (operand_reg[nop]))
+== NULL_RTX))
  {
if (lra_dump_file != NULL)
  fprintf
@@ -2143,7 +2158,7 @@ process_alt_operands (int only_alternati
}
  /* If the operand is dying, has a matching constraint,
 and satisfies constraints of the matched operand
-which failed to satisfy the own constraints, probably
+which failed to satisfy the own constraints, most probably
 the reload for this operand will be gone.  */
  if (this_alternative_matches >= 0
   && !curr_alt_win[this_alternative_matches]
Index: testsuite/gcc.target/arm/pr60650-2.c
===
--- testsuite/gcc.target/arm/pr60650-2.c(revision 0)
+++ testsuite/gcc.target/arm/pr60650-2.c(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-omit-frame-pointer -march=armv7-a" } */
+
+int a, h, j;
+long long d, e, i;
+int f;
+fn1 (void *p1, int p2)
+{
+switch (p2)
+case 8:
+{
+register b = *(long long *) p1, c asm ("r2");
+asm ("%0": "=r" (a), "=r" (c):"r" (b), "r" (0));
+*(long long *) p1 = c;
+}
+}
+
+fn2 ()
+{
+int k;
+k = f;
+while (1)
+{
+fn1 (&i, sizeof i);
+e = d + k;
+switch (d)
+case 0:
+(
+{
+register l asm ("r4");
+register m asm ("r0");
+asm ("  .err  .endif\n\t": "=r" (h), "=r" (j):"r" (m),
+"r"
+(l));;
+});
+}
+}


[PATCH, committed] Fix PR60733

2014-04-02 Thread Bill Schmidt
PR60733 identifies a case where straight-line strength reduction
produces code that doesn't satisfy SSA verification.  For a PHI
candidate, the insertion of an initializer for a stride calculation
along an incoming arc was specified to be at the point of the feeding
definition of the PHI along that arc.  This is wrong and can place the
initializer far earlier than its operands are guaranteed to be
available.  In this case, the initializer was placed earlier in the
block than the definition of one of its operands.

In fact, the initializer is only needed at the end of the feeding block
for the PHI argument, and its operands are guaranteed to be available at
that point.  This patch changes the placement of the initializer to this
location for PHI candidates.  The nearest common dominator algorithm may
still place the initializer at an earlier point, but only if it is safe
to do so.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions; committed.

Thanks,
Bill


[gcc]

2014-04-02  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/60733
* gimple-ssa-strength-reduction.c (ncd_with_phi): Change required
insertion point for PHI candidates to be the end of the feeding
block for the PHI argument.

[gcc/testsuite]

2014-04-02  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/60733
* gcc.dg/torture/pr60733.c:  New test.


Index: gcc/testsuite/gcc.dg/torture/pr60733.c
===
--- gcc/testsuite/gcc.dg/torture/pr60733.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr60733.c  (revision 0)
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+
+int a, d, e, f, g, h, i, j, k;
+unsigned short b;
+
+short
+fn1 (int p1, int p2)
+{
+  return p1 * p2;
+}
+
+int
+main ()
+{
+  for (; a; a--)
+{
+  int l = 0;
+  if (f = 0)
+   {
+ for (; h;)
+   e = 0;
+ for (; l != -6; l--)
+   {
+ j = fn1 (b--, d);
+ for (g = 0; g; g = 1)
+   ;
+ k = e ? 2 : 0;
+   }
+ i = 0;
+ for (;;)
+   ;
+   }
+}
+  d = 0;
+  return 0;
+}
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 209023)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -3001,10 +3001,10 @@ ncd_with_phi (slsr_cand_t c, double_int incr, gimp
{
  slsr_cand_t arg_cand = base_cand_from_table (arg);
  double_int diff = arg_cand->index - basis->index;
+ basic_block pred = gimple_phi_arg_edge (phi, i)->src;
 
  if ((incr == diff) || (!address_arithmetic_p && incr == -diff))
-   ncd = ncd_for_two_cands (ncd, gimple_bb (arg_cand->cand_stmt),
-*where, arg_cand, where);
+   ncd = ncd_for_two_cands (ncd, pred, *where, NULL, where);
}
}
 }




Re: [PATCH] Disable IPA-SRA for always_inline functions

2014-04-02 Thread Jan Hubicka
 Hi,
 
 when dealing with a PR yesterday I have noticed that IPA-SRA was
 modifying an always_inline function which is useless work since the
 function must then be inlined anyway.  Thus I'd like to propose the
 following simple change disabling it in such cases.
 
 Included in a bootstrap and testing on x86_64-linux.  OK for trunk now
 or in the next stage1?

Actually are the attributes copied to the clone?
The patch looks OK to me, even at this stage.

Honza
 
 Thanks,
 
 Martin
 
 
 2014-04-01  Martin Jambor  mjam...@suse.cz
 
   * tree-sra.c (ipa_sra_preliminary_function_checks): Skip
   always_inline functions.
 
 Index: src/gcc/tree-sra.c
 ===
 --- src.orig/gcc/tree-sra.c
 +++ src/gcc/tree-sra.c
 @@ -4960,6 +4960,15 @@ ipa_sra_preliminary_function_checks (str
if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
  return false;
  
 +  if (lookup_attribute ("always_inline",
 + DECL_ATTRIBUTES (node->decl)) != NULL)
 +{
 +  if (dump_file)
 + fprintf (dump_file, "Always inline function will be inlined "
 +  "anyway. \n");
 +  return false;
 +}
 +
return true;
  }
  


Re: [PATCH][LTO/PGO] Warn when both -flto and -fprofile-generate are enabled

2014-04-02 Thread Jan Hubicka
 On Wed, Apr 2, 2014 at 2:07 PM, Richard Biener
 richard.guent...@gmail.com wrote:
  On Wed, Apr 2, 2014 at 1:50 PM, Markus Trippelsdorf
  mar...@trippelsdorf.de wrote:
  It is a common mistake to enable both -flto and -fprofile-generate when
  building projects. This is not a good idea, because memory use will
  skyrocket due to instrumentation. So just warn the user.
 
  OK for next stage1?
 
  I'd rather see if we can fix the underlying issue.  For example as we
  are now instrumenting as IPA pass we can allocate a single
  counter array (if the number of global vars is the issue).  Basically
  split analysis and instrumentation into two phases for that.
 
  Or even better, do profile instrumentation as real IPA pass.
 
 Thus, isn't -coverage also facing the same issue?  Thus, is it
 really -fprofile-arcs already or only one of the value profiling pieces?

Yep, -fprofile-arcs will cause similar issues.
Implementing instrumentation as a real IPA pass is on my TODO list, but pretty
low, since it is quite some work: we need to stream the CFG into summaries and
make the instrumentation code independent of function bodies, which needs quite
some reorg (at the moment we have no way to load the CFG alone).

Note that -fprofile-generate -flto gives you slightly more precise profiles
than -fprofile-generate alone, because of COMDAT functions from static
libraries that may be lost in the first case.

Honza
 
 Richard.
 
  Richard.
 
  2014-04-02  Markus Trippelsdorf  mar...@trippelsdorf.de
 
  * common.opt (fprofile-generate): Add flag.
  * opts.c (finish_options): Add new warning.
  (common_handle_option): Set flag.
 
  diff --git a/gcc/common.opt b/gcc/common.opt
  index 62c72f0d2fbf..61e9adfa0df5 100644
  --- a/gcc/common.opt
  +++ b/gcc/common.opt
  @@ -1689,7 +1689,7 @@ Common Report Var(flag_profile_correction)
   Enable correction of flow inconsistent profile data input
 
   fprofile-generate
  -Common
  +Common Var(flag_profile_generate)
   Enable common options for generating profile info for profile feedback 
  directed optimizations
 
   fprofile-generate=
  diff --git a/gcc/opts.c b/gcc/opts.c
  index fdc903f9271a..b62a0d626d94 100644
  --- a/gcc/opts.c
  +++ b/gcc/opts.c
  @@ -833,6 +833,9 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
  error_at (loc, "only one -flto-partition value can be specified");
   }
 
  +  if (opts->x_flag_generate_lto && opts->x_flag_profile_generate)
  +warning_at (loc, 0, "Enabling both -fprofile-generate and -flto is a bad idea.");
  +
 /* We initialize opts->x_flag_split_stack to -1 so that targets can set a
default value if they choose based on other options.  */
 if (opts->x_flag_split_stack == -1)
  @@ -1728,6 +1731,7 @@ common_handle_option (struct gcc_options *opts,
 
   case OPT_fprofile_generate_:
 opts->x_profile_data_prefix = xstrdup (arg);
  +  opts->x_flag_profile_generate = true;
 value = true;
 /* No break here - do -fprofile-generate processing. */
   case OPT_fprofile_generate:
  --
  Markus


merged trunk into gimple-front-end

2014-04-02 Thread Trevor Saunders
Hi,

I just merged trunk r209020 into the gimple-front-end branch, please
tell me if you see anything busted ;)

I successfully bootstrapped the merge including building the gimple front
end and its few tests passed.

Trev





[Patch, moxie] Zero- and sign-extend values properly

2014-04-02 Thread Anthony Green

This patch does three related things for the moxie port...

1. Changes char to be unsigned by default
2. Changes WCHAR_TYPE from long int to unsigned int
3. Zero- and sign-extends values properly, sometimes using the new
sign-extension instructions.
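
For readers less familiar with the distinction in item 3, the two kinds of extension can be sketched in portable C. This is an illustration only: the helper names are hypothetical, and the code says nothing about how the moxie instructions themselves are implemented. Zero-extension clears the upper bits; sign-extension replicates the top bit of the narrow value.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low 8 bits of X to 32 bits.  */
uint32_t
zero_extend_8 (uint32_t x)
{
  return x & 0xffu;
}

/* Zero-extend the low 16 bits of X to 32 bits.  */
uint32_t
zero_extend_16 (uint32_t x)
{
  return x & 0xffffu;
}

/* Sign-extend the low 8 bits of X to 32 bits without relying on
   implementation-defined narrowing conversions.  */
int32_t
sign_extend_8 (uint32_t x)
{
  uint32_t low = x & 0xffu;

  assert (low <= 0xffu);
  /* Flip the sign bit, then subtract its weight: values with bit 7
     set become negative, all others are unchanged.  */
  return (int32_t) (low ^ 0x80u) - 0x80;
}

/* Sign-extend the low 16 bits of X to 32 bits in the same way.  */
int32_t
sign_extend_16 (uint32_t x)
{
  uint32_t low = x & 0xffffu;

  assert (low <= 0xffffu);
  return (int32_t) (low ^ 0x8000u) - 0x8000;
}
```

The same byte 0x80 therefore yields 128 when zero-extended and -128 when sign-extended, which is why the default signedness of char matters for the generated code.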

I am committing this change even at this late stage of the GCC release
process because it only touches the moxie target directory.

AG

2014-04-02  Anthony Green  gr...@moxielogic.com

* config/moxie/moxie.md (zero_extendqisi2, zero_extendhisi2)
(extendqisi2, extendhisi2): Define.
* config/moxie/moxie.h (DEFAULT_SIGNED_CHAR): Change to 0.
(WCHAR_TYPE): Change to unsigned int.


Index: gcc/config/moxie/moxie.h
===
--- gcc/config/moxie/moxie.h(revision 209042)
+++ gcc/config/moxie/moxie.h(working copy)
@@ -59,7 +59,7 @@
 #define DOUBLE_TYPE_SIZE 64
 #define LONG_DOUBLE_TYPE_SIZE 64
 
-#define DEFAULT_SIGNED_CHAR 1
+#define DEFAULT_SIGNED_CHAR 0
 
 #undef  SIZE_TYPE
 #define SIZE_TYPE unsigned int
@@ -68,7 +68,7 @@
 #define PTRDIFF_TYPE int
 
 #undef  WCHAR_TYPE
-#define WCHAR_TYPE long int
+#define WCHAR_TYPE unsigned int
 
 #undef  WCHAR_TYPE_SIZE
 #define WCHAR_TYPE_SIZE BITS_PER_WORD
Index: gcc/config/moxie/moxie.md
===
--- gcc/config/moxie/moxie.md   (revision 209042)
+++ gcc/config/moxie/moxie.md   (working copy)
@@ -239,6 +239,56 @@
   ldo.l  %0, %1"
   [(set_attr "length"	"2,2,6,2,6,2,6,6,6")])
 
+(define_insn_and_split "zero_extendqisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r")
+	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "0,W,A,B")))]
+  ""
+  "@
+   ;
+   ld.b   %0, %1
+   lda.b  %0, %1
+   ldo.b  %0, %1"
+  "reload_completed"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (zero_extend:SI (match_dup 2)))]
+{
+  operands[2] = gen_lowpart (QImode, operands[0]);
+}
+  [(set_attr "length" "0,2,6,6")])
+
+(define_insn_and_split "zero_extendhisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r")
+	(zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "0,W,A,B")))]
+  ""
+  "@
+   ;
+   ld.s   %0, %1
+   lda.s  %0, %1
+   ldo.s  %0, %1"
+  "reload_completed"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (zero_extend:SI (match_dup 2)))]
+{
+  operands[2] = gen_lowpart (HImode, operands[0]);
+}
+  [(set_attr "length" "0,2,6,6")])
+
+(define_insn "extendqisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r")))]
+  ""
+  "@
+   sex.b  %0, %1"
+  [(set_attr "length" "2")])
+
+(define_insn "extendhisi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r")))]
+  ""
+  "@
+   sex.s  %0, %1"
+  [(set_attr "length" "2")])
+
 (define_expand movqi
   [(set (match_operand:QI 0 general_operand )
(match_operand:QI 1 general_operand ))]


Fix ipa-devirt ICE

2014-04-02 Thread Jan Hubicka
Hi,
this patch fixes an ICE on type-inconsistent code.  The ICE happens because of a
gcc_unreachable I forgot in the code during development.  I added a way to mark
calls as inconsistent, which is useful for redirecting them to UNREACHABLE.
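
The marking scheme can be pictured with the following minimal C sketch.  It is
not the GCC code: struct target_list, analyze_call and possible_targets are
hypothetical stand-ins.  Only the idea of using INT_MAX as a token sentinel and
answering it with an empty but complete target list is taken from the patch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a list of possible call targets.  */
struct target_list
{
  size_t count;   /* Number of candidate targets.  */
  bool complete;  /* True if no other target can exist.  */
};

/* Hypothetical analysis step: on a type inconsistency, overwrite the
   token with the INT_MAX sentinel instead of aborting.  */
void
analyze_call (bool type_consistent, int *otr_token)
{
  if (!type_consistent)
    *otr_token = INT_MAX;
}

/* Hypothetical lookup: a sentinel token yields an empty list that is
   nevertheless complete, so a caller may treat the call as unreachable.
   For any other token, pretend one target was found and that the list
   may be incomplete.  */
struct target_list
possible_targets (int otr_token)
{
  struct target_list result = { 0, true };
  if (otr_token == INT_MAX)
    return result;
  result.count = 1;
  result.complete = false;
  return result;
}
```

The point of the design is that an empty list marked complete means "no valid
target exists", which is different from an empty incomplete list meaning
"nothing is known".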

Bootstrapped/regtested x86_64-linux, committed.

Honza

* testsuite/g++.dg/torture/pr60659.C: New testcase.
* ipa-devirt.c (get_polymorphic_call_info): Do not ICE on type
inconsistent code and instead mark the context inconsistent.
(possible_polymorphic_call_targets): For inconsistent contexts
return empty complete list.
Index: testsuite/g++.dg/torture/pr60659.C
===
--- testsuite/g++.dg/torture/pr60659.C  (revision 0)
+++ testsuite/g++.dg/torture/pr60659.C  (revision 0)
@@ -0,0 +1,58 @@
+// { dg-do compile }
+template <typename _InputIterator> void __distance (_InputIterator);
+template <typename _InputIterator>
+void distance (_InputIterator, _InputIterator p2)
+{
+  __distance (p2);
+}
+
+namespace boost
+{
+template <class Iterator> struct A
+{
+  typedef typename Iterator::difference_type type;
+};
+template <class T> typename T::const_iterator end (T &);
+template <class T> typename T::const_iterator begin (T &);
+template <class T> struct D : A<typename T::const_iterator>
+{
+};
+template <class T> typename D<T>::type distance (const T &p1)
+{
+  distance (boost::begin (p1), boost::end (p1));
+  return 0;
+}
+template <class IteratorT> class B
+{
+public:
+  typedef B type;
+  typedef IteratorT const_iterator;
+};
+}
+
+typedef int storage_t[];
+struct F;
+template <template <typename> class> struct G
+{
+  G (const G &p1) { p1.m_fn1 ().m_fn1 (0); }
+  const F &m_fn1 () const
+  {
+    const void *a;
+    a = &data_m;
+    return *static_cast<const F *>(a);
+  }
+  storage_t *data_m;
+};
+
+struct F
+{
+  virtual F *m_fn1 (void *) const;
+};
+template <typename> struct H;
+struct C : G<H>
+{
+  typedef int difference_type;
+};
+boost::B<C> AllTransVideos ();
+int b = boost::distance (AllTransVideos ());
+
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 208915)
+++ ipa-devirt.c(working copy)
@@ -1214,7 +1214,13 @@ get_polymorphic_call_info (tree fndecl,
 not part of outer type.  */
  if (!contains_type_p (TREE_TYPE (base),
 context->offset + offset2, *otr_type))
-   return base_pointer;
+   {
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent
+code sequences; we arrange the calls to be builtin_unreachable
+later.  */
+ *otr_token = INT_MAX;
+ return base_pointer;
+   }
  get_polymorphic_call_info_for_decl (context, base,
  context->offset + offset2);
  return NULL;
@@ -1288,8 +1294,10 @@ get_polymorphic_call_info (tree fndecl,
  if (!contains_type_p (context->outer_type, context->offset,
*otr_type))
{ 
- context->outer_type = NULL;
- gcc_unreachable ();
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent
+code sequences; we arrange the calls to be builtin_unreachable
+later.  */
+ *otr_token = INT_MAX;
  return base_pointer;
}
  context->maybe_derived_type = false;
@@ -1389,6 +1397,9 @@ devirt_variable_node_removal_hook (varpo
temporarily change to one of base types.  INCLUDE_DERIVER_TYPES make
us to walk the inheritance graph for all derivations.
 
+   OTR_TOKEN == INT_MAX is used to mark calls that are provably
+   undefined and should be redirected to unreachable.
+
If COMPLETEP is non-NULL, store true if the list is complete. 
CACHE_TOKEN (if non-NULL) will get stored to an unique ID of entry
in the target cache.  If user needs to visit every target list
@@ -1422,6 +1433,7 @@ possible_polymorphic_call_targets (tree
   bool complete;
   bool can_refer;
 
+  /* If ODR is not initialized, return empty incomplete list.  */
   if (!odr_hash.is_created ())
 {
   if (completep)
@@ -1431,11 +1443,28 @@ possible_polymorphic_call_targets (tree
   return nodes;
 }
 
+  /* If we hit type inconsistency, just return empty list of targets.  */
+  if (otr_token == INT_MAX)
+{
+  if (completep)
+   *completep = true;
+  if (nonconstruction_targetsp)
+   *nonconstruction_targetsp = 0;
+  return nodes;
+}
+
   type = get_odr_type (otr_type, true);
 
   /* Lookup the outer class type we want to walk.  */
-  if (context.outer_type)
-get_class_context (&context, otr_type);
+  if (context.outer_type
+  && !get_class_context (&context, otr_type))
+{
+  if (completep)
+  

RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 
 More like isn't enough to answer this - do you have a testcase?  (usually
 these end up in undefined-overflow and/or conversion-to-sizetype issues)

I do. See attachment. This testcase needs to be compiled with patch 2/3
applied. As you can see from the patch, data[a] and data[a+1] will be
converted to offsets by multiplying the index by the element size. Then
later, when analyzing the ORing, a subtraction of these two indices will be
done. So you have two fold_build calls and not one. I can't reproduce it with
a simple expression such as (a+1)*1 - a*1, so maybe it being done in two parts
is the reason; you know better.
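
To make the expected result concrete, here is a small C sketch of the
arithmetic involved. It only evaluates the identity at run time with unsigned
(wrapping) arithmetic and says nothing about what fold_build does on trees.
The names index_to_offset and offset_difference are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: byte offset of element INDEX for elements of
   ELT_SIZE bytes, as in data[index].  */
size_t
index_to_offset (size_t index, size_t elt_size)
{
  return index * elt_size;
}

/* Difference between the offsets of data[a + 1] and data[a], computed
   in two separate steps and then subtracted.  In wrapping unsigned
   arithmetic this is always ELT_SIZE, independent of A.  */
size_t
offset_difference (size_t a, size_t elt_size)
{
  size_t hi = index_to_offset (a + 1, elt_size);
  size_t lo = index_to_offset (a, elt_size);
  return hi - lo;
}
```

A folder that sees both multiplications would ideally reduce the subtraction
to the element size alone, which is the simplification reported as missing.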

Best regards,

Thomas

missed_folding.c
Description: Binary data


RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store

2014-04-02 Thread Thomas Preud'homme
 From: Joseph Myers [mailto:jos...@codesourcery.com]
 
  +   if { [is-effective-target bswap]
  + && ![istarget x86_64-*-*] } {
 
 That x86_64-*-* test is wrong.  x86_64-*-* and i?86-*-* should always be
 handled the same (if you then want to distinguish 32-bit and 64-bit
 multilibs, you check the appropriate effective-target there, depending on
 whether the condition is one on the ABI or which register size is being
 used, which affects how x32 should be counted).

Indeed, it's a mistake. i?86 should be in there too. Please find attached an
updated patch.

Best regards,

Thomas 


gcc32rm-84.3.1.part1.diff
Description: Binary data