Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-28 Thread Jakub Jelinek
On Mon, Mar 28, 2016 at 07:38:46PM -0500, Bill Schmidt wrote:
> For a long time we've had hundreds of failing guality tests.  These
> failures don't seem to have any correlation with gdb functionality for
> POWER, which is working fine.  At this point the value of these tests to
> us seems questionable.  Fixing these is such low priority that it is
> unlikely we will ever get around to it.  In the meanwhile, the failures
> simply clutter up our regression test reports.  Thus I'd like to disable
> them, and that's what this test does.
> 
> Verified to remove hundreds of failure messages on
> powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?

This is IMNSHO very wrong, you then lose tracking of regressions in the
debug info quality.  It is true that the debug info quality is already
pretty bad  on powerpc*, it would be really very much desirable if
anyone had time to analyze some of them and improve stuff,
but we at least shouldn't regress.  Guality testsuite has various FAILs
and/or XFAILs on lots of architectures, the problem is that the testing
matrix is simply too large to have them in the testcases
- it depends on the target, various ISA settings on the target, on the
optimization level (most of the guality tests are torture tested through
-O0 up to -O3 with extra flags), and in some cases also on the version of
the used GDB.

For guality, the most effective test for regressions is simply always
running contrib/test_summary after all your bootstraps and then just
diffing up that against the same from earlier bootstrap.

Jakub


Re: [PATCH, lower-subreg] Fix 70355

2016-03-28 Thread Jakub Jelinek
On Mon, Mar 28, 2016 at 07:57:54PM -0700, Richard Henderson wrote:
> The ICE comes fixing up a debug_insn, and the debug info incoming to the
> pass seems reasonable.  Just recognizing that this situation is possible and
> not asserting appears to work.
> 
> Ok?
> 
> 
> r~

>   * lower-subreg.c (simplify_subreg_concatn): Reject paradoxical subregs.

Ok, thanks.

Jakub


[PATCH, lower-subreg] Fix 70355

2016-03-28 Thread Richard Henderson
The ICE comes fixing up a debug_insn, and the debug info incoming to the pass 
seems reasonable.  Just recognizing that this situation is possible and not 
asserting appears to work.


Ok?


r~
* lower-subreg.c (simplify_subreg_concatn): Reject paradoxical subregs.


diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index 5432d05..f7b3ac4 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -614,7 +614,8 @@ simplify_subreg_concatn (machine_mode outermode, rtx op,
 
   innermode = GET_MODE (op);
   gcc_assert (byte < GET_MODE_SIZE (innermode));
-  gcc_assert (GET_MODE_SIZE (outermode) <= GET_MODE_SIZE (innermode));
+  if (GET_MODE_SIZE (outermode) > GET_MODE_SIZE (innermode))
+return NULL_RTX;
 
   inner_size = GET_MODE_SIZE (innermode) / XVECLEN (op, 0);
   part = XVECEXP (op, 0, byte / inner_size);
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr70355.c 
b/gcc/testsuite/gcc.c-torture/compile/pr70355.c
new file mode 100644
index 000..4749427
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr70355.c
@@ -0,0 +1,14 @@
+/* { dg-require-effective-target int128 } */
+/* { dg-additional-options "-g" } */
+
+typedef unsigned __int128 v2ti __attribute__ ((vector_size (32)));
+
+unsigned
+foo (unsigned i, v2ti v)
+{
+  do {
+i--;
+v %= ~v;
+  } while (i);
+  return v[0] + v[1];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr70355.c 
b/gcc/testsuite/gcc.target/i386/pr70355.c
new file mode 100644
index 000..b55f6fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70355.c
@@ -0,0 +1,14 @@
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2 -fno-tree-ter -funroll-loops -mavx512f -g" } */
+
+typedef unsigned __int128 v2ti __attribute__ ((vector_size (32)));
+
+unsigned
+foo (unsigned i, v2ti v)
+{
+  do {
+i--;
+v %= ~v;
+  } while (i);
+  return v[0] + v[1];
+}


Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-28 Thread Mike Stump
On Mar 28, 2016, at 5:38 PM, Bill Schmidt  wrote:
> For a long time we've had hundreds of failing guality tests.  These
> failures don't seem to have any correlation with gdb functionality for
> POWER, which is working fine.

> Verified to remove hundreds of failure messages on
> powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?

So, I think this should be up to the target maintainer and/or the quality 
people, but I think it is reasonable to do this.  If all or most target 
maintainers do this, then we flip to not run, unless the target triplet asks 
for them.

On my (random) target, I don’t fail any quality tests:

newlib/newlib/libc/posix/popen.c:138: undefined reference to `vfork’
newlib/newlib/libc/posix/popen.c:218: undefined reference to `waitpid’

so they all turn off.  I have a fork (it forks my simulator) and a wait, but no 
vfork or waitpid.  This might explain why some people don’t trip on the issue.

Re: [PATCH ARM v2] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

2016-03-28 Thread Kugan


On 25/03/16 08:02, Charles Baylis wrote:
> When compiling with -mlong-calls and -pg, calls to the __gnu_mcount_nc
> function are not generated as long calls.
> 
> This is the sequel to this patch
> https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00881.html
> 
> This patch fixes the following problems with the previous patch.
> . Nested functions now work (thanks to Richard E for spotting this)
> . Thumb-1 now works
> 
> This patch works by adding new patterns (one for ARM/Thumb-2 and one
> for Thumb-1) which are placed in the prologue as a placeholder for
> some RTL which describes the address. This is either a SYMBOL_REF for
> targets with MOVW/MOVT, or a literal pool reference for other targets.
> The implementation of ARM_FUNCTION_PROFILER is changed to search for
> this insn so that the the address of the __gnu_mcount_nc function can
> be loaded using an appropriate sequence for the target.
> 
> I also tried generating the profiling call sequence directly in the
> prologue, but this requires some unpleasant hacks to prevent spurious
> register pushes from ASM_OUTPUT_REG_PUSH.
> 
> Tested with no new regressions on arm-unknown-linux-gnueabihf on QEMU.
> The generated code sequences have been inspected for normal and nested
> functions on ARM v6, ARM v7, Thumb-1, and Thumb-2 targets.
> 
> This does not fix a regression, so I don't expect to apply it for
> GCC6, is it OK for when stage 1 re-opens.
> 

Hi Charles,

+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* if movt/movw are not available, use a constant pool */
+  if (!arm_arch_thumb2)

Should this be !TARGET_USE_MOVT?

Thanks,
Kugan


[patch commited FT32] Add -nodiv option

2016-03-28 Thread James Bowman
The attached patches add an "-mnodiv" option to FT32, preventing use of the 
divide and modulo instructions. This required additional software div/mod 
implemtations in FT32's libgcc.

2016-03-28  James Bowman  

* config/ft32/ft32.opt (mnodiv): New.
* config/ft32/ft32.md (*divsi3, *modsi3): Qualify with
TARGET_NODIV.
* doc/invoke.texi (FT32 Options -mnodiv): New.

2016-03-28  James Bowman  

* libgcc/config/ft32/lib1funcs.S (*divsi3, *modsi3): New.

Index: gcc/config/ft32/ft32.md
===
--- gcc/config/ft32/ft32.md (revision 234405)
+++ gcc/config/ft32/ft32.md (working copy)
@@ -101,7 +101,7 @@
   (div:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "ft32_rimm_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "div.l  %0,%1,%2")
 
 (define_insn "modsi3"
@@ -109,7 +109,7 @@
   (mod:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "ft32_rimm_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "mod.l  %0,%1,%2")
 
 (define_insn "udivsi3"
@@ -117,7 +117,7 @@
   (udiv:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "ft32_rimm_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "udiv.l %0,%1,%2")
 
 (define_insn "umodsi3"
@@ -125,7 +125,7 @@
   (umod:SI
(match_operand:SI 1 "register_operand" "r,r")
(match_operand:SI 2 "register_operand" "r,KA")))]
-  ""
+  "!TARGET_NODIV"
   "umod.l %0,%1,%2")
 
 (define_insn "extvsi"
Index: gcc/config/ft32/ft32.opt
===
--- gcc/config/ft32/ft32.opt(revision 234405)
+++ gcc/config/ft32/ft32.opt(working copy)
@@ -25,3 +25,7 @@ target the software simulator.
 mlra
 Target Report Var(ft32_lra_flag) Init(0) Save
 Use LRA instead of reload.
+
+mnodiv
+Target Report Mask(NODIV)
+Avoid use of the DIV and MOD instructions

Index: libgcc/config/ft32/lib1funcs.S
===
--- libgcc/config/ft32/lib1funcs.S  (revision 234405)
+++ libgcc/config/ft32/lib1funcs.S  (working copy)
@@ -25,8 +25,8 @@ see the files COPYING3 and COPYING.RUNTI
 # for implementation details of all except division which is detailed below
 #
 
+#ifdef L_fp_tools
 // .global __cmpsf2_
-
 nan:.long 0x7FFF# also abs mask
 inf:.long 0x7F80
 sign_mask:  .long 0x8000
@@ -37,6 +37,14 @@ smallest_norm:  .long 0x0080# im
 high_FF:.long 0xFF00
 high_uint:  .long 0x
 
+ntz_table:
+.byte   32,0,1,12,2,6,0,13,3,0,7,0,0,0,0,14
+.byte   10,4,0,0,8,0,0,25,0,0,0,0,0,21,27,15
+.byte   31,11,5,0,0,0,0,0,9,0,0,24,0,0,20,26
+.byte   30,0,0,0,0,23,0,19,29,0,22,18,28,17,16,0
+
+#endif
+
 # Supply a few 'missing' instructions
 
 # not
@@ -87,12 +95,6 @@ high_uint:  .long 0x
 lpmi.b  \x, \x, 0
 .endm
 
-ntz_table:
-.byte   32,0,1,12,2,6,0,13,3,0,7,0,0,0,0,14
-.byte   10,4,0,0,8,0,0,25,0,0,0,0,0,21,27,15
-.byte   31,11,5,0,0,0,0,0,9,0,0,24,0,0,20,26
-.byte   30,0,0,0,0,23,0,19,29,0,22,18,28,17,16,0
-
 # calculate leading zero count
 .macro  nlz x, scr
 flip\x, \x, 31
@@ -503,6 +505,9 @@ mul_z0:
 ## for implementation details
 
 
+
+
+#ifdef  L_divsf3
 dc_1: .long 0xe7d7
 dc_2: .long 0xffe8
 dc_3: .long 0xffbad86f
@@ -517,9 +522,6 @@ dc_11: .long0x0452b1bf
 dc_12: .long0xFFC0
 spec_val_test:  .long   0x7F7F
 
-
-
-#ifdef  L_divsf3
 .global __divsf3
 __divsf3:
 push$r13
@@ -869,6 +871,7 @@ float_not_zero2:
 return
 #endif
 
+#if 0
 ##
 ##
 ## float compare
@@ -913,7 +916,74 @@ cmp_is_gt:
 cmp_is_eq:
 ldk $r0, 0
 return
+#endif
 
+#ifdef  L_udivsi3
+.global __udivsi3
+__udivsi3:
+   # $r0 is dividend
+   # $r1 is divisor
+   ldk $r2,0
+   push$r28
+   ldk $r28,-32
+0:
+   lshr$r3,$r0,31  # Shift $r2:$r0 left one
+   ashl$r0,$r0,1
+   ashl$r2,$r2,1
+   or  $r2,$r2,$r3
+   cmp $r2,$r1
+   jmpcb,1f
+2:
+   sub $r2,$r2,$r1
+   add $r0,$r0,1
+1:
+   add $r28,$r28,1
+   jmpx31,$r28,1,0b
+   pop $r28
+   # $r0: quotient
+   # $r2: remainder
+   return
+#endif
 
+#ifdef L_umodsi3
+.global__umodsi3
+__umodsi3:
+   call__udivsi3
+   move$r0,$r2
+   return
+#endif
 
+#ifdef L_divsi3
+.global__divsi3
+__divsi3:
+   xor $r5,$r0,$r1 # $r5 is sign of result
+   ashr$r2,$r0,31  # $r0 = abs($r0)
+   xor $r0,$r0,$r2
+   sub $r0,$r0,$r2
+   ashr$r2,$r1,31  

[PATCH] Disable guality tests for powerpc*-linux*

2016-03-28 Thread Bill Schmidt
Hi,

For a long time we've had hundreds of failing guality tests.  These
failures don't seem to have any correlation with gdb functionality for
POWER, which is working fine.  At this point the value of these tests to
us seems questionable.  Fixing these is such low priority that it is
unlikely we will ever get around to it.  In the meanwhile, the failures
simply clutter up our regression test reports.  Thus I'd like to disable
them, and that's what this test does.

Verified to remove hundreds of failure messages on
powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?

Thanks,
Bill


2016-03-28  Bill Schmidt  

* g++.dg/guality/guality.exp: Disable for powerpc*-linux*.
* gcc.dg/guality/guality.exp: Likewise.


Index: gcc/testsuite/g++.dg/guality/guality.exp
===
--- gcc/testsuite/g++.dg/guality/guality.exp(revision 234476)
+++ gcc/testsuite/g++.dg/guality/guality.exp(working copy)
@@ -13,6 +13,11 @@ if { [istarget "powerpc-ibm-aix*"] } {
 return
 }
 
+if { [istarget "powerpc*-linux*"] } {
+set torture_execute_xfail "powerpc*-linux*"
+return
+}
+
 proc check_guality {args} {
 # Don't count check_guality as PASS, or FAIL etc., that would make
 # the total PASS count dependent on how many parallel runtest invocations
Index: gcc/testsuite/gcc.dg/guality/guality.exp
===
--- gcc/testsuite/gcc.dg/guality/guality.exp(revision 234476)
+++ gcc/testsuite/gcc.dg/guality/guality.exp(working copy)
@@ -13,6 +13,11 @@ if { [istarget "powerpc-ibm-aix*"] } {
 return
 }
 
+if { [istarget "powerpc*-linux*"] } {
+set torture_execute_xfail "powerpc*-linux*"
+return
+}
+
 proc check_guality {args} {
 # Don't count check_guality as PASS, or FAIL etc., that would make
 # the total PASS count dependent on how many parallel runtest invocations




Patches to fix optimizer bug & C++ exceptions for GCC VAX backend

2016-03-28 Thread Jake Hamby
Amazingly enough, my patch worked well enough that my NetBSD VAX kernel built 
with GCC 5.3 is no longer crashing. I feel pretty good about what I have so far 
so here's the complete diff for both the C++ exception fix and the bad 
condition codes optimizer bug. It should be good enough to check into NetBSD 
current, at least, and I believe it does fix most, if not all, of the bad code 
generation bugs with optimization on VAX.

-Jake

Index: gcc/except.c
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/except.c,v
retrieving revision 1.3
diff -u -r1.3 except.c
--- gcc/except.c23 Mar 2016 15:51:36 -  1.3
+++ gcc/except.c28 Mar 2016 23:24:40 -
@@ -2288,7 +2288,8 @@
 #endif
 {
 #ifdef EH_RETURN_HANDLER_RTX
-  emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+  rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+  RTX_FRAME_RELATED_P (insn) = 1;  // XXX FIXME in VAX backend?
 #else
   error ("__builtin_eh_return not supported on this target");
 #endif
Index: gcc/config/vax/builtins.md
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/builtins.md,v
retrieving revision 1.5
diff -u -r1.5 builtins.md
--- gcc/config/vax/builtins.md  24 Jan 2016 09:43:34 -  1.5
+++ gcc/config/vax/builtins.md  28 Mar 2016 23:24:40 -
@@ -50,7 +50,8 @@
(ctz:SI (match_operand:SI 1 "general_operand" "nrmT")))
(set (cc0) (match_dup 0))]
   ""
-  "ffs $0,$32,%1,%0")
+  "ffs $0,$32,%1,%0"
+  [(set_attr "cc" "clobber")])
 
 (define_expand "sync_lock_test_and_set"
   [(set (match_operand:VAXint 0 "nonimmediate_operand" "=&g")
@@ -88,7 +89,8 @@
   (match_dup 1))
  (const_int 1))])]
   ""
-  "jbssi %1,%0,%l2")
+  "jbssi %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbssihi"
   [(parallel
@@ -105,7 +107,8 @@
   (match_dup 1))
  (const_int 1))])]
   ""
-  "jbssi %1,%0,%l2")
+  "jbssi %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbssisi"
   [(parallel
@@ -122,8 +125,8 @@
   (match_dup 1))
  (const_int 1))])]
   ""
-  "jbssi %1,%0,%l2")
-
+  "jbssi %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_expand "sync_lock_release"
   [(set (match_operand:VAXint 0 "memory_operand" "+m")
@@ -160,7 +163,8 @@
   (match_dup 1))
  (const_int 0))])]
   ""
-  "jbcci %1,%0,%l2")
+  "jbcci %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbccihi"
   [(parallel
@@ -177,7 +181,8 @@
   (match_dup 1))
  (const_int 0))])]
   ""
-  "jbcci %1,%0,%l2")
+  "jbcci %1,%0,%l2"
+  [(set_attr "cc" "none")])
 
 (define_insn "jbbccisi"
   [(parallel
@@ -194,4 +199,5 @@
   (match_dup 1))
  (const_int 0))])]
   ""
-  "jbcci %1,%0,%l2")
+  "jbcci %1,%0,%l2"
+  [(set_attr "cc" "none")])
Index: gcc/config/vax/elf.h
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/elf.h,v
retrieving revision 1.6
diff -u -r1.6 elf.h
--- gcc/config/vax/elf.h23 Mar 2016 15:51:37 -  1.6
+++ gcc/config/vax/elf.h28 Mar 2016 23:24:40 -
@@ -26,7 +26,7 @@
 #define REGISTER_PREFIX "%"
 #define REGISTER_NAMES \
   { "%r0", "%r1",  "%r2",  "%r3", "%r4", "%r5", "%r6", "%r7", \
-"%r8", "%r9", "%r10", "%r11", "%ap", "%fp", "%sp", "%pc", }
+"%r8", "%r9", "%r10", "%r11", "%ap", "%fp", "%sp", "%pc", "%psw", }
 
 #undef SIZE_TYPE
 #define SIZE_TYPE "long unsigned int"
@@ -45,18 +45,8 @@
count pushed by the CALLS and before the start of the saved registers.  */
 #define INCOMING_FRAME_SP_OFFSET 0
 
-/* Offset from the frame pointer register value to the top of the stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
-
-/* We use R2-R5 (call-clobbered) registers for exceptions.  */
-#define EH_RETURN_DATA_REGNO(N) ((N) < 4 ? (N) + 2 : INVALID_REGNUM)
-
-/* Place the top of the stack for the DWARF2 EH stackadj value.  */
-#define EH_RETURN_STACKADJ_RTX \
-  gen_rtx_MEM (SImode, \
-  plus_constant (Pmode,\
- gen_rtx_REG (Pmode, FRAME_POINTER_REGNUM),\
- -4))
+/* We use R2-R3 (call-clobbered) registers for exceptions.  */
+#define EH_RETURN_DATA_REGNO(N) ((N) < 2 ? (N) + 2 : INVALID_REGNUM)
 
 /* Simple store the return handler into the call frame.  */
 #define EH_RETURN_HANDLER_RTX  \
@@ -66,10 +56,6 @@
  16))
 
 
-/* Reserve the top of the stack for exception handler stackadj value.  */
-#undef STARTING_FRAME_OFFSET
-#define STARTING_FRAME_OFFSET -4
-
 /* The VAX wants no space

Re: [ARM] Add support for overflow add, sub, and neg operations

2016-03-28 Thread Michael Collison
An updated patch that resolves the issues with thumb2 support and adds 
test cases as requested. Looking to check this into GCC 7 stage1 when it 
opens.


2016-02-24  Michael Collison  

* config/arm/arm-modes.def: Add new condition code mode CC_V
to represent the overflow bit.
* config/arm/arm.c (maybe_get_arm_condition_code):
Add support for CC_Vmode.
* config/arm/arm.md (addv4, add3_compareV,
addsi3_compareV_upper): New patterns to support signed
builtin overflow add operations.
(uaddv4, add3_compareC, addsi3_compareV_upper):
New patterns to support unsigned builtin add overflow operations.
(subv4, sub3_compare1): New patterns to support signed
builtin overflow subtract operations,
(usubv4): New patterns to support unsigned builtin subtract
overflow operations.
(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New 
patterns

to support builtin overflow negate operations.
* gcc.target/arm/builtin_saddl.c: New testcase.
* gcc.target/arm/builtin_saddll.c: New testcase.
* gcc.target/arm/builtin_uaddl.c: New testcase.
* gcc.target/arm/builtin_uaddll.c: New testcase.
* gcc.target/arm/builtin_ssubl.c: New testcase.
* gcc.target/arm/builtin_ssubll.c: New testcase.
* gcc.target/arm/builtin_usubl.c: New testcase.
* gcc.target/arm/builtin_usubll.c: New testcase.

On 02/29/2016 04:13 AM, Kyrill Tkachov wrote:


On 26/02/16 10:32, Michael Collison wrote:



On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:

Hi Michael,

On 24/02/16 23:02, Michael Collison wrote:
This patch adds support for builtin overflow of add, subtract and 
negate. This patch is targeted for gcc 7 stage 1. It was tested 
with no regressions in arm and thumb modes on the following targets:


arm-non-linux-gnueabi
arm-non-linux-gnuabihf
armeb-none-linux-gnuabihf
arm-non-eabi



I'll have a deeper look once we're closer to GCC 7 development.
I've got a few comments in the meantime.


2016-02-24 Michael Collison 

* config/arm/arm-modes.def: Add new condition code mode CC_V
to represent the overflow bit.
* config/arm/arm.c (maybe_get_arm_condition_code):
Add support for CC_Vmode.
* config/arm/arm.md (addv4, add3_compareV,
addsi3_compareV_upper): New patterns to support signed
builtin overflow add operations.
(uaddv4, add3_compareC, addsi3_compareV_upper):
New patterns to support unsigned builtin add overflow operations.
(subv4, sub3_compare1): New patterns to support signed
builtin overflow subtract operations,
(usubv4): New patterns to support unsigned builtin subtract
overflow operations.
(negvsi3, negvdi3, negdi2_compre, negsi2_carryin_compare): New 
patterns

to support builtin overflow negate operations.




Can you please summarise what sequences are generated for these 
operations, and how

they are better than the default fallback sequences.


Sure for a simple test case such as:

int
fn3 (int x, int y, int *ovf)
{
  int res;
  *ovf = __builtin_sadd_overflow (x, y, &res);
  return res;
}

Current trunk at -O2 generates

fn3:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r1, #0
mov r3, #0
add r1, r0, r1
blt .L4
cmp r1, r0
blt .L3
.L2:
str r3, [r2]
mov r0, r1
bx  lr
.L4:
cmp r1, r0
ble .L2
.L3:
mov r3, #1
b   .L2

With the overflow patch this now generates:

   addsr0, r0, r1
   movvs   r3, #1
   movvc   r3, #0
   str r3, [r2]
   bx  lr



Thanks! That looks much better.

Also, we'd need tests for each of these overflow operations, since 
these are pretty complex

patterns that are being added.


The patterns are tested now most notably by tests in:

c-c++-common/torture/builtin-arith-overflow*.c

I had a few failures I resolved so the builtin overflow arithmetic 
functions are definitely being exercised.


Great, that gives me more confidence on the correctness aspects but...



Also, you may want to consider splitting this into a patch series, 
each adding a single
overflow operation, together with its tests. That way it will be 
easier to keep track of
which pattern applies to which use case and they can go in 
independently of each other.


Let me know if you still fell the same way given the existing test 
cases.




... I'd like us to still have scan-assembler tests. The torture tests 
exercise the correctness,
but we'd want tests to catch regressions where we stop generating the 
new patterns due to other

optimisation changes, which would lead to code quality regressions.
So I'd like us to have scan-assembler tests for these sequences to 
make sure we generate the right

instructions.

Thanks,
Kyrill



+(define_expand "uaddv4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "regi

Bad CC0 optimizer in VAX backend (was Re: Patches to fix GCC’s C++ exception handling on NetBSD/VAX)

2016-03-28 Thread Jake Hamby
I have some bad news and some good news. The bad news is that there has been a 
nasty optimizer bug lurking in the VAX backend for GCC for many years, which 
has to do with over-optimistic removal of necessary tst/cmp instructions under 
certain circumstances. This manifests at -O or higher and the symptoms are 
strange behavior like stack overflows because of branches going the wrong way.

The good news is that my suspicions about the CC0 handler appear to be correct, 
and better yet, I'm currently testing a patch that isn't too big and I believe 
will fix this bug. The bad behavior is in vax_notice_update_cc (), which is 
supposed to call CC_STATUS_INIT if the CC flags have been reset, or set 
cc_status.value1 and possibly cc_status.value2 with the destination of the 
current (set...) instruction, whose signed comparison to 0 will be stored in 
the N and Z flags as a side effect (most VAX instructions do this).

The first bug is that most, but not all of the define_insn patterns in vax.md 
actually do this. The paths that emit movz* instructions will not set N and Z, 
and there are some code paths that can't be trusted because they handle a 
64-bit integer in two 32-bit pieces, for example, so N and Z won't reflect the 
desired result (comparison of operand 0 to 0).

The current version of vax_notice_update_cc has a specific check for this: it 
won't set the cc_status cache if GET_CODE (SET_DEST (exp)) != ZERO_EXTRACT. The 
problem is that this is checking the RTL expression for (zero_extract...) and 
not whether what was actually emitted is a movz* or not. That's why all of the 
define_insn()'s have to be annotated with their actual behavior vis-a-vis 
compare to 0, and then the change to vax.c to read the CC attribute makes it 
really much easier to get the correct behavior.

I need to do a lot of testing today to see if this really does help fix GCC's 
VAX backend, but I'm hopeful that this is at least a real bug that needs to be 
fixed, and that I have a fix for it that should work. The other good news is 
that you only need three cc enums: "none" (doesn't alter Z or N flags), 
"compare" (Z and N reflect comparison of operand 0 to literal 0), and "clobber" 
(call CC_STATUS_INIT to reset cc_status cache).

One thing to note is that the current version of vax.c only sets CC_NO_OVERFLOW 
on certain paths, while it should actually always set CC_NO_OVERFLOW so as to 
avoid emitting any unsigned branch instructions that would use an incorrect 
value of the C flag (which most VAX instructions do NOT set). The code that 
handles CC_NO_OVERFLOW is in final.c, and what it does is change signed 
comparisons with 0 into unsigned ones that do the same thing. (a < 0) is always 
false (for unsigned ints), (a >= 0) is always true, (a <= 0) turns into (a == 
0), and (a > 0) turns into (a != 0). Here's the patch to vax.c. I'll send the 
rest after I've tested that it does what I think it should do. The diff to 
vax.md will be larger, but it's mostly adding the "cc" attributes and not code.

Index: gcc/config/vax/vax.c
===
RCS file: /cvsroot/src/external/gpl3/gcc/dist/gcc/config/vax/vax.c,v
retrieving revision 1.15
diff -u -r1.15 vax.c
--- gcc/config/vax/vax.c24 Mar 2016 04:27:29 -  1.15
+++ gcc/config/vax/vax.c28 Mar 2016 22:28:10 -
@@ -1131,61 +1136,51 @@
 /* Worker function for NOTICE_UPDATE_CC.  */
 
 void
-vax_notice_update_cc (rtx exp, rtx insn ATTRIBUTE_UNUSED)
+vax_notice_update_cc (rtx exp, rtx_insn *insn)
 {
+  attr_cc cc = get_attr_cc (insn);
+
   if (GET_CODE (exp) == SET)
 {
   if (GET_CODE (SET_SRC (exp)) == CALL)
CC_STATUS_INIT;
-  else if (GET_CODE (SET_DEST (exp)) != ZERO_EXTRACT
-  && GET_CODE (SET_DEST (exp)) != PC)
+  else if (GET_CODE (SET_DEST (exp)) != PC
+  && cc == CC_COMPARE)
{
- cc_status.flags = 0;
- /* The integer operations below don't set carry or
+ /* Some of the integer operations don't set carry or
 set it in an incompatible way.  That's ok though
 as the Z bit is all we need when doing unsigned
 comparisons on the result of these insns (since
 they're always with 0).  Set CC_NO_OVERFLOW to
 generate the correct unsigned branches.  */
- switch (GET_CODE (SET_SRC (exp)))
-   {
-   case NEG:
- if (GET_MODE_CLASS (GET_MODE (exp)) == MODE_FLOAT)
-   break;
-   case AND:
-   case IOR:
-   case XOR:
-   case NOT:
-   case CTZ:
-   case MEM:
-   case REG:
- cc_status.flags = CC_NO_OVERFLOW;
- break;
-   default:
- break;
-   }
+ cc_status.flags = CC_NO_OVERFLOW;
  cc_status.value1 = SET_DEST (exp);
  cc_status.value2 = SET_SRC (exp);
}
+  else if (cc != CC_NONE)
+   CC_STA

Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-28 Thread Martin Sebor

I think let's defer the fix for c++/60760 (i.e. the nullptr_p bits)
until stage 1, when it can be combined with the POINTER_PLUS_EXPR fix,
and put the rest of this patch in now.


I can split up the patch into two and post the subset without
the fix for c++/60760, though I don't expect to be done with
it after I get back (next week).

I'd like to understand your concern with the fix for c++/60760.
Is it that it's incomplete (doesn't reject taking the address
of the first member of a struct, as in &null->first_member),
or are you worried that the changes may not be stable enough?


More the latter; it seems like significant new code and doesn't fix a
regression.


Attached is an updated patch without the fix for c++/60760, retested
on x86_64.

Martin
PR c++/67376 - [5/6 regression] Comparison with pointer to past-the-end
	of array fails inside constant expression
PR c++/70170 - [6 regression] bogus not a constant expression error comparing
	pointer to array to null
PR c++/70172 - incorrect reinterpret_cast from integer to pointer error
	on invalid constexpr initialization
PR c++/70228 - insufficient detail in diagnostics for a constexpr out of bounds
	array subscript

gcc/testsuite/ChangeLog:
2016-03-18  Martin Sebor  

	PR c++/67376
	PR c++/70170
	PR c++/70172
	PR c++/70228
	* g++.dg/cpp0x/constexpr-array-ptr10.C: New test.
	* g++.dg/cpp0x/constexpr-array-ptr9.C: New test.
	* g++.dg/cpp0x/constexpr-array5.C: Adjust text of expected diagnostic.
	* g++.dg/cpp0x/constexpr-string.C: Same.
	* g++.dg/cpp0x/constexpr-wstring2.C: Same.
	* g++.dg/cpp0x/pr65398.C: Same.
	* g++.dg/ext/constexpr-vla1.C: Same.
	* g++.dg/ext/constexpr-vla2.C: Same.
	* g++.dg/ext/constexpr-vla3.C: Same.
	* g++.dg/ubsan/pr63956.C: Same.

gcc/cp/ChangeLog:
2016-03-18  Martin Sebor  

	PR c++/67376
	PR c++/70170
	PR c++/70172
	PR c++/70228
	* constexpr.c (diag_array_subscript): New function.
	(cxx_eval_array_reference): Detect out of bounds array indices.

gcc/ChangeLog:
2016-03-18  Martin Sebor  

	PR c++/67376
	* fold-const.c (maybe_nonzero_address): New function.
	(fold_comparison): Call it.  Fold equality and relational
	expressions involving null pointers.
	(tree_single_nonzero_warnv_p): Call maybe_nonzero_address.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 7776cac..c900080 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1837,6 +1837,30 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert = false)
   return -1;
 }
 
+/* Under the control of CTX, issue a detailed diagnostic for
+   an out-of-bounds subscript INDEX into the expression ARRAY.  */
+
+static void
+diag_array_subscript (const constexpr_ctx *ctx, tree array, tree index)
+{
+  if (!ctx->quiet)
+{
+  tree arraytype = TREE_TYPE (array);
+
+  /* Convert the unsigned array subscript to a signed integer to avoid
+	 printing huge numbers for small negative values.  */
+  tree sidx = fold_convert (ssizetype, index);
+  if (DECL_P (array))
+	{
+	  error ("array subscript value %qE is outside the bounds "
+		 "of array %qD of type %qT", sidx, array, arraytype);
+	  inform (DECL_SOURCE_LOCATION (array), "declared here");
+	}
+  else
+	error ("array subscript value %qE is outside the bounds "
+	   "of array type %qT", sidx, arraytype);
+}
+}
 
 /* Subroutine of cxx_eval_constant_expression.
Attempt to reduce a reference to an array slot.  */
@@ -1861,6 +1885,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
 	false,
 	non_constant_p, overflow_p);
   VERIFY_CONSTANT (index);
+
   if (lval && ary == oldary && index == oldidx)
 return t;
   else if (lval)
@@ -1885,8 +1910,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   if (!tree_fits_shwi_p (index)
   || (i = tree_to_shwi (index)) < 0)
 {
-  if (!ctx->quiet)
-	error ("negative array subscript");
+  diag_array_subscript (ctx, ary, index);
   *non_constant_p = true;
   return t;
 }
@@ -1898,8 +1922,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   VERIFY_CONSTANT (nelts);
   if (!tree_int_cst_lt (index, nelts))
 {
-  if (!ctx->quiet)
-	error ("array subscript out of bound");
+  diag_array_subscript (ctx, ary, index);
   *non_constant_p = true;
   return t;
 }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 44fe2a2..3f65243 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -8336,6 +8336,20 @@ pointer_may_wrap_p (tree base, tree offset, HOST_WIDE_INT bitpos)
   return total.to_uhwi () > (unsigned HOST_WIDE_INT) size;
 }
 
+/* Return a positive integer when the symbol DECL is known to have
+   a nonzero address, zero when it's known not to (e.g., it's a weak
+   symbol), and a negative integer when the symbol is not yet in the
+   symbol table and so whether or not its address is zero is unknown.  */
+static int
+maybe_nonzero_address (tree decl)
+{
+  if (DECL_P (decl) && decl_in_symtab_p (decl))
+if (struct symtab_node *symbol = symtab_node

RFA: PATCH to tree-inline.c:remap_decls for c++/70353 (ICE with __func__ and constexpr)

2016-03-28 Thread Jason Merrill
The constexpr evaluation code uses the inlining code to remap the 
constexpr function body for evaluation so that recursion works properly. 
 In this testcase __func__ is declared as a static local variable, so 
rather than remap it, remap_decls tries to add it to the local_decls 
list for the function we're inlining into.  But there is no such 
function in this case, so we crash.


Avoid the add_local_decl call when cfun is null avoids the ICE (thanks 
Jakub), but results in an undefined symbol.  Calling 
varpool_node::finalize_decl instead allows cgraph to handle the 
reference from 'c' properly.


OK if testing passes?
commit d875b432434be92ab345c3851fa3ba4bef738399
Author: Jason Merrill 
Date:   Mon Mar 28 17:14:43 2016 -0400

	PR c++/70353
	* tree-inline.c (remap_decls): Use varpool_node::finalize_decl if
	cfun is null.

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
new file mode 100644
index 000..e678290
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
@@ -0,0 +1,13 @@
+// PR c++/70353
+// { dg-do link { target c++11 } }
+
+constexpr const char* ce ()
+{
+  return __func__;
+}
+
+const char *c = ce();
+
+int main()
+{
+}
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 9d4f8f7..74b0ca9 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -614,10 +614,16 @@ remap_decls (tree decls, vec **nonlocalized_list,
   if (can_be_nonlocal (old_var, id))
 	{
 	  /* We need to add this variable to the local decls as otherwise
-	 nothing else will do so.  */
+	 nothing else will do so.  If we're not in a function,
+	 tell cgraph about it.  */
 	  if (TREE_CODE (old_var) == VAR_DECL
 	  && ! DECL_EXTERNAL (old_var))
-	add_local_decl (cfun, old_var);
+	{
+	  if (cfun)
+		add_local_decl (cfun, old_var);
+	  else
+		varpool_node::finalize_decl (old_var);
+	}
 	  if ((!optimize || debug_info_level > DINFO_LEVEL_TERSE)
 	  && !DECL_IGNORED_P (old_var)
 	  && nonlocalized_list)


Fwd: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible

2016-03-28 Thread Bin.Cheng
Sorry, Should have replied to gcc-patches list.

Thanks,
bin

-- Forwarded message --
From: "Bin.Cheng" 
Date: Tue, 29 Mar 2016 03:55:04 +0800
Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
DR against its innermost loop bahavior if possible
To: Richard Biener 

On 3/17/16, Richard Biener  wrote:
> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng  wrote:
>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>  wrote:
>>>
>>> Hmm.
>> Hi,
>> Thanks for reviewing.
>>>
>>> +  equal_p = true;
>>> +  if (e1->base_address && e2->base_address)
>>> +equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>> +  if (e1->offset && e2->offset)
>>> +equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>
>>> surely better to return false early.
>>>
>>> I think we don't want this in tree-data-refs.h also because of ...
>>>
>>> @@ -615,15 +619,29 @@
>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>> (data_reference_p a)
>>>data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>> creating the DR (or adjust the equality function
>> and hashing
>>>tree ref = DR_REF (a);
>>>tree base_ref = DR_BASE_OBJECT (a);
>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>bool exist1, exist2;
>>>
>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>> -|| TREE_CODE (ref) == IMAGPART_EXPR
>>> -|| TREE_CODE (ref) == REALPART_EXPR)
>>> -ref = TREE_OPERAND (ref, 0);
>>> +  /* If reference in DR has innermost loop behavior and it is not
>>> + a compound memory reference, we store it to innermost_DR_map,
>>> + otherwise to ref_DR_map.  */
>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>> +  || TREE_CODE (ref) == IMAGPART_EXPR
>>> +  || TREE_CODE (ref) == REALPART_EXPR
>>> +  || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>> +  || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>> +{
>>> +  while (TREE_CODE (ref) == COMPONENT_REF
>>> +|| TREE_CODE (ref) == IMAGPART_EXPR
>>> +|| TREE_CODE (ref) == REALPART_EXPR)
>>> +   ref = TREE_OPERAND (ref, 0);
>>> +
>>> +  master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>> +}
>>> +  else
>>> +master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>
>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>> need to
>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>> and REALPART) before creating the DR (or adjust the equality function
>>> and hashing
>>> to disregard them which means subtracting their offset from DR_INIT.
>> I am not sure if I understand correctly.  But for component reference,
>> it is the base object that we want to record/track.  For example,
>>
>>   for (i = 0; i < N; i++) {
>> m = *data++;
>>
>> m1 = p1->x - m;
>> m2 = p2->x + m;
>>
>> p3->y = (m1 >= m2) ? p1->y : p2->y;
>>
>> p1++;
>> p2++;
>> p3++;
>>   }
>> We want to infer that reads of p1/p2 in condition statement won't trap
>> because there are unconditional reads of the structures, though the
>> unconditional reads are actual of other sub-objects.  Here it is the
>> invariant part of address that we want to track.
>
> Well, the variant parts - we want to strip invariant parts as far as we can
> (offsetof (x) and offsetof (y))
>
>> Also illustrated by this example, we can't rely on data-ref analyzer
>> here.  Because in gathering/scattering cases, the address could be not
>> affine at all.
>
> Sure, but that's a different issue.
>
>>>
>>> To adjust the references we collect you'd maybe could use a callback
>>> to get_references_in_stmt
>>> to adjust them.
>>>
>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>> as
>> Is this a part of the method you suggested above, or is it an
>> alternative one?  If it's the latter, then I have below questions
>> embedded.
>
> It is an alternative to adding a hook to get_references_in_stmt and
> probably "easier".
>
>>>
>>> Index: tree-if-conv.c
>>> ===
>>> --- tree-if-conv.c  (revision 234215)
>>> +++ tree-if-conv.c  (working copy)
>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>
>>>for (i = 0; refs->iterate (i, &dr); i++)
>>>  {
>>> +  tree *refp = &DR_REF (dr);
>>> +  while ((TREE_CODE (*refp) == COMPONENT_REF
>>> + && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>> +|| TREE_CODE (*refp) == IMAGPART_EXPR
>>> +|| TREE_CODE (*refp) == REALPART_EXPR)
>>> +   refp = &TREE_OPERAND (*refp, 0);
>>> +  if (refp != &DR_REF (dr))
>>> +   {
>>> + tree saved_base = *refp;
>>> + *refp = integer_zero_node;
>>> +
>>> + if (DR_INIT (dr))
>>> +   {
>>> + tree poffset;
>>> + int punsignedp, preversep, pvolatilep;
>>> +   

Re: [PATCH] Add security_sensitive attribute to clean function stack and regs.

2016-03-28 Thread Marcos Díaz
(Ping and changelog fix)

On Tue, Mar 22, 2016 at 11:15 AM, Marcos Díaz
 wrote:
> Hi,
>the attached patch adds a new attribute 'security_sensitive' for functions.
> The idea was discussed in PR middle-end/69976.
> This attribute makes gcc to emit clean up code at the function's epilogue.
> This clean-up code cleans the stack used by this function and that isn't
> needed anymore. It also cleans used registers. It only works in x86_64.
> Please, review the discussion here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69976
> since we had some doubts with the implementation.
>
> We also added some test-cases and ran all tests in x86_64.
> We think this isn't a bug-fix but a new feature.
>
> Changelog
> 2016-03-21  Marcos Diaz  
>Andres Tiraboschi  
>
> PR tree-optimization/69820
This line should've been  PR middle-end/69976
> * config/i386/i386-protos.h: Add ix86_clear_regs_emit and
> ix86_sec_sensitive_attr_p
> * config/i386/i386.c: (ix86_sec_sensitive_attr_p): New function
> (ix86_using_red_zone): now take into account if the function has the new
> attribute.
> (is_preserved_reg): New function.
> (is_integer_reg): New function.
> (is_used_as_ret): New function.
> (reg_to_string): New function.
> (clear_reg_emit): New function.
> (ix86_clear_regs_emit): New function.
> (ix86_expand_epilogue): Added code to emit clean up code only when
> security_sensitive attribute is set.
> (ix86_handle_security_sensitive_attribute): New function.
> (ix86_attribute_table): Added new attribute.
> * config/i386/i386.md: (UNSPECV_CLRSTACK): New unspecv.
> (UNSPECV_CLRREGS): New unspecv.
> (return): Conditionally emit cleaning regs code.
> (simple_return): Likewise
> (clear_regs): New insn.
> (clear_stack): New insn.
> * doc/extend.texi: Added description for new security_sensitive attribute.

Changelog
2016-03-21  Marcos Diaz  
   Andres Tiraboschi  

PR middle-end/69976
* config/i386/i386-protos.h: Add ix86_clear_regs_emit and
ix86_sec_sensitive_attr_p
* config/i386/i386.c: (ix86_sec_sensitive_attr_p): New function
(ix86_using_red_zone): now take into account if the function has the new
attribute.
(is_preserved_reg): New function.
(is_integer_reg): New function.
(is_used_as_ret): New function.
(reg_to_string): New function.
(clear_reg_emit): New function.
(ix86_clear_regs_emit): New function.
(ix86_expand_epilogue): Added code to emit clean up code only when
security_sensitive attribute is set.
(ix86_handle_security_sensitive_attribute): New function.
(ix86_attribute_table): Added new attribute.
* config/i386/i386.md: (UNSPECV_CLRSTACK): New unspecv.
(UNSPECV_CLRREGS): New unspecv.
(return): Conditionally emit cleaning regs code.
(simple_return): Likewise
(clear_regs): New insn.
(clear_stack): New insn.
* doc/extend.texi: Added description for new security_sensitive attribute.


Re: rs6000 stack_tie mishap again

2016-03-28 Thread Richard Henderson

On 03/23/2016 09:10 PM, Alan Modra wrote:

On Wed, Mar 23, 2016 at 05:04:39PM +0100, Olivier Hainque wrote:

The reason why 894 is not accounted in the base ref computation is because it
is part of the epilogue sequence, and init_alias_analysis has:

   /* Walk the insns adding values to the new_reg_base_value array.  */
   for (i = 0; i < rpo_cnt; i++)
{ ...
  if (could_be_prologue_epilogue
  && prologue_epilogue_contains (insn))
continue;

The motivation for this is unclear to me.


https://gcc.gnu.org/ml/gcc-patches/1999-08n/msg00048.html


My rough understanding is that we probably really care about frame_related
insns only here, at least on targets where the flag is supposed to be set
accurately.


Also, possibly just prologue insns.  So you might be able to modify
init_alias_analysis just to ignore the prologue and skip any need for
yet another hook.

Let's see what rth thinks.  He did say the patch might need to be
redone.  :)
https://gcc.gnu.org/ml/gcc-patches/1999-08n/msg00072.html


I be surprised if this is works as expected without side-effects.  You've now 
exposed the restore of the frame pointer to alias analysis, and it's probably 
not seen as constant anymore.  As you reference, I expect that any patch that 
opens the epilogue to such scrutiny is going to have to special-case the frame 
pointer as well.


That said, as Segher points out later in the thread, one can arrange for hard 
regs within the body to bleed into temporaries used within the epilogue, which 
is bad.  So perhaps this is exactly what's needed longer-term.  More 
investigation is required.


But I expect for stage4, the best solution is to strengthen the stack_tie 
pattern to block all memory.  Early scheduling of the stack frame deallocation 
(a simple logic insn) can't really be that important to performance.



r~


Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C

2016-03-28 Thread Ilya Verbin
Hi Thomas!

Do you plan to commit this patch? :)

On Mon, Sep 29, 2014 at 09:24:40 -0600, Jeff Law wrote:
> On 09/29/14 08:26, Thomas Schwinge wrote:
> >On Mon, 29 Sep 2014 13:58:31 +, "Tannenbaum, Barry M" 
> > wrote:
> >>In a nutshell, add the following code to main() before the call to f3():
> >>
> >> int status = __cilkrts_set_param("nworkers", "2");
> >> if (0 != status) {
> >> // Failed to set the number of Cilk workers
> >> return status;
> >> }
> >
> >Yeah, that's what I had proposed with the patch at the end of my previous
> >email,
> >.
> >I'm sorry if I didn't make it obvious that more text and the patch were
> >following after the full-quote of the original issue description.
> >
> >>Here's the details: [...]
> >
> >Thanks again for your helpful comments; that's appreciated.
> >
> >Here's again my proposed patch.  Note, that the include paths in GCC
> >compiler testing (gcc/testsuite/) are not set up to pick up the
> > include file, so I've manually added a propotype for
> >the __cilkrts_set_param function to the three files.  I can change that,
> >if requested.
> >
> >commit ee7138e451d1f3284d6fa0f61fe517c82db94060
> >Author: Thomas Schwinge 
> >Date:   Mon Sep 29 12:47:34 2014 +0200
> >
> > Audit Cilk Plus tests for CILK_NWORKERS=1.
> >
> > gcc/testsuite/
> > * c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call
> > __cilkrts_set_param to set two workers.
> > * c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
> > * g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.
> OK.
> Jeff

  -- Ilya


Re: [Patch, Fortran, pr70397, gcc-5, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-28 Thread Andre Vehreschild
Hi Paul,

thanks for the quick review. Committed to gcc-5-branch as r234507. The
patch for trunk needs more polishing than expected. I hope to present
it soon.

Regards,
Andre

On Sun, 27 Mar 2016 19:19:11 +0200
Paul Richard Thomas  wrote:

> Hi Andre,
> 
> The patch looks to be fine to me for both trunk and 5-branch.
> 
> Thanks for the patch.
> 
> Paul
> 
> On 27 March 2016 at 18:53, Andre Vehreschild  wrote:
> > Hi all,
> >
> > and here is already the follow-up. In the initial patch a safe wasn't 
> > commenced
> > before pulling the patch, which lead to a refactoring of the new functions 
> > node
> > to be partial only. Sorry for the noise.
> >
> > - Andre
> >
> > Am Sun, 27 Mar 2016 18:49:18 +0200
> > schrieb Andre Vehreschild :
> >  
> >> Hi all,
> >>
> >> attached is a patch to fix an ICE on allocating an unlimited polymorphic
> >> entity from a non-poly class or type without an length component. The 
> >> routine
> >> gfc_copy_class_to_class() assumed that both the source and destination
> >> object's type is unlimited polymorphic, but in this case it is true for the
> >> destination only, which made gfortran look for a non-existent _len 
> >> component
> >> in the source object and therefore ICE. This is fixed by the patch by 
> >> adding
> >> a function to return either the _len component, when it exists, or a 
> >> constant
> >> zero node to init the destination object's _len component with.
> >>
> >> Bootstrapped and regtested ok on x86_64-linux-gnu/F23. (Might have some
> >> line deltas, because my git is a bit older. Sorry, only have limited/slow
> >> net-access currently.)
> >>
> >> The same patch should be adaptable to trunk. To come...
> >>
> >> Ok for 5-trunk?
> >>
> >> Regards,
> >>   Andre  
> >
> >
> >



-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 234506)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,12 @@
+2016-03-28  Andre Vehreschild  
+
+	PR fortran/70397
+	* trans-expr.c (gfc_class_len_or_zero_get): Add function to return a
+	constant zero tree, when the class to get the _len component from is
+	not unlimited polymorphic.
+	(gfc_copy_class_to_class): Use the new function.
+	* trans.h: Added interface of new function gfc_class_len_or_zero_get.
+
 2016-03-28  Alessandro Fanfarillo  
 
 	Backport from trunk.
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(Revision 234506)
+++ gcc/fortran/trans-expr.c	(Arbeitskopie)
@@ -173,6 +173,24 @@
 }
 
 
+/* Try to get the _len component of a class.  When the class is not unlimited
+   poly, i.e. no _len field exists, then return a zero node.  */
+
+tree
+gfc_class_len_or_zero_get (tree decl)
+{
+  tree len;
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref_loc (input_location, decl);
+  len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)),
+			   CLASS_LEN_FIELD);
+  return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
+	 TREE_TYPE (len), decl, len,
+	 NULL_TREE)
+			  : integer_zero_node;
+}
+
+
 /* Get the specified FIELD from the VPTR.  */
 
 static tree
@@ -250,6 +268,7 @@
 
 #undef CLASS_DATA_FIELD
 #undef CLASS_VPTR_FIELD
+#undef CLASS_LEN_FIELD
 #undef VTABLE_HASH_FIELD
 #undef VTABLE_SIZE_FIELD
 #undef VTABLE_EXTENDS_FIELD
@@ -1070,7 +1089,7 @@
   if (unlimited)
 {
   if (from_class_base != NULL_TREE)
-	from_len = gfc_class_len_get (from_class_base);
+	from_len = gfc_class_len_or_zero_get (from_class_base);
   else
 	from_len = integer_zero_node;
 }
Index: gcc/fortran/trans.h
===
--- gcc/fortran/trans.h	(Revision 234506)
+++ gcc/fortran/trans.h	(Arbeitskopie)
@@ -356,6 +356,7 @@
 tree gfc_class_data_get (tree);
 tree gfc_class_vptr_get (tree);
 tree gfc_class_len_get (tree);
+tree gfc_class_len_or_zero_get (tree);
 gfc_expr * gfc_find_and_cut_at_last_class_ref (gfc_expr *);
 /* Get an accessor to the class' vtab's * field, when a class handle is
available.  */
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 234506)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2016-03-28  Andre Vehreschild  
+
+	PR fortran/70397
+	* gfortran.dg/unlimited_polymorphic_25.f90: New test.
+	* gfortran.dg/unlimited_polymorphic_26.f90: New test.
+
 2016-03-28  Kirill Yukhin  
 
 	PR target/70406
Index: gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
===
--- gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90	(nicht existent)
+++ gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90	(Arbeitskopie)
@@ -0,0 +1,40 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUB

Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Dominique d'Humières

> Le 28 mars 2016 à 15:11, Jason Merrill  a écrit :
> 
> On 03/28/2016 09:02 AM, Dominique d'Humières wrote:
>> 
>>> Le 28 mars 2016 à 14:55, Jason Merrill  a écrit :
>>> 
>>> OK, thanks.
>>> 
>>> Jason
>> 
>> Does it mean that I should commit the patch?
> 
> Please.

Revision 234504.

Dominique

> Jason
> 



Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Jason Merrill

On 03/28/2016 09:02 AM, Dominique d'Humières wrote:



Le 28 mars 2016 à 14:55, Jason Merrill  a écrit :

OK, thanks.

Jason


Does it mean that I should commit the patch?


Please.

Jason




Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Dominique d'Humières

> Le 28 mars 2016 à 14:55, Jason Merrill  a écrit :
> 
> OK, thanks.
> 
> Jason

Does it mean that I should commit the patch?

Dominique



Re: C++ PATCH for c++/70353 (core issue 1962)

2016-03-28 Thread Jason Merrill

OK, thanks.

Jason


[PATCH] Fix in-tree gmp/mpfr/mpc generation (PR 67728)

2016-03-28 Thread Bernd Edlinger

Hi,

as described in the tracker we have bootstrap problems with in-tree gmp-6.1.0
on certain targets, and also a linker issue with check-mpc due to the changed
mpfr library path.

These are triggered by overriding CFLAGS and LDFLAGS in in-tree builds.
It did not happen with the gmp/mpfr/mpc versions that download_prerequisites
installs, but the currently latest version of these libraries use CFLAGS to pass
-DNO_ASM which is overridden by gcc and causes the gmp-6.1.0 to be
mis-compiled.  And the mpc issue is triggered by overriding LDFLAGS
and the changed mpfr library path.  So this started with mpfr v3.1.0 which
moved the sources into a src sub-directory.

The proposed patch fixes these problems by passing -DNO_ASM in AM_CFLAGS,
and adding both possible mpfr library paths to HOST_LIB_PATH_mpfr.
I've also adjusted HOST_LIB_PATH_mpc although it did not yet create problems.

Boot-strapped and regression tested on x86_64-pc-linux-gnu, with different
gmp versions including the latest snapshot.
I have additionally built arm cross compilers, which was not working before.

Is this OK for trunk?


Thanks
Bernd.2016-03-28  Bernd Edlinger  

PR bootstrap/67728
* Makefile.def (gmp): Explicitly disable assembler.
(mpfr): Add src/.libs to lib_path2.
(mpc): Adjust lib_path.
* Makefile.tpi (HOST_LIB_PATH_): Use lib_path2 if defined.
* Makefile.in: Regenerated.
Index: Makefile.def
===
--- Makefile.def	(revision 234490)
+++ Makefile.def	(working copy)
@@ -50,6 +50,7 @@ host_modules= { module= gcc; bootstrap=true;
 host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
 		// Work around in-tree gmp configure bug with missing flex.
 		extra_configure_flags='--disable-shared LEX="touch lex.yy.c"';
+		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true;
 		// none-*-* disables asm optimizations, bootstrap-testing
 		// the compiler more thoroughly.
@@ -57,11 +58,11 @@ host_modules= { module= gmp; lib_path=.libs; boots
 		// gmp's configure will complain if given anything
 		// different from host for target.
 	target="none-${host_vendor}-${host_os}"; };
-host_modules= { module= mpfr; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpfr; lib_path=.libs; lib_path2=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpfr_configure_flags@';
 		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true; };
-host_modules= { module= mpc; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpc; lib_path=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpc_gmp_configure_flags@ @extra_mpc_mpfr_configure_flags@';
 		no_install= true; };
 host_modules= { module= isl; lib_path=.libs; bootstrap=true;
Index: Makefile.in
===
--- Makefile.in	(revision 234490)
+++ Makefile.in	(working copy)
@@ -639,12 +639,12 @@ HOST_LIB_PATH_gmp = \
 
 @if mpfr
 HOST_LIB_PATH_mpfr = \
-  $$r/$(HOST_SUBDIR)/mpfr/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/.libs:
+  $$r/$(HOST_SUBDIR)/mpfr/src/.libs:$$r/$(HOST_SUBDIR)/mpfr/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/.libs:
 @endif mpfr
 
 @if mpc
 HOST_LIB_PATH_mpc = \
-  $$r/$(HOST_SUBDIR)/mpc/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/.libs:
+  $$r/$(HOST_SUBDIR)/mpc/src/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/src/.libs:
 @endif mpc
 
 @if isl
@@ -11299,7 +11299,7 @@ all-gmp: configure-gmp
 	s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
 	$(HOST_EXPORTS)  \
 	(cd $(HOST_SUBDIR)/gmp && \
-	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS)  \
+	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		$(TARGET-gmp))
 @endif gmp
 
@@ -11328,7 +11328,7 @@ all-stage1-gmp: configure-stage1-gmp
 		CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
 		LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
 		$(EXTRA_HOST_FLAGS)  \
-		$(STAGE1_FLAGS_TO_PASS)  \
+		$(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		TFLAGS="$(STAGE1_TFLAGS)" \
 		$(TARGET-stage1-gmp)
 
@@ -11343,7 +11343,7 @@ clean-stage1-gmp:
 	fi; \
 	cd $(HOST_SUBDIR)/gmp && \
 	$(MAKE) $(EXTRA_HOST_FLAGS)  \
-	$(STAGE1_FLAGS_TO_PASS)  clean
+	$(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" clean
 @endif gmp-bootstrap
 
 
@@ -11370,7 +11370,7 @@ all-stage2-gmp: configure-stage2-gmp
 		CFLAGS_FOR_TARGET="$(CFLAGS_FOR_TARGET)" \
 		CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
 		LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
-		$(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  \
+		$(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		TFLAGS="$(STAGE2_TFLAGS)" \
 		$(TARGET-stage2-gmp)
 
@@ -11384,7 +11384,7 @@ clean-stage2-gmp:
 	  $(MAKE) stage2-start; \
 	fi; \
 	cd $(HOST_SUBDIR)/gmp && \
-	$(MAKE) $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  clean
+	$(MAKE) $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" clean
 @en

Re: [Patch, Fortran] STOP issue with coarrays

2016-03-28 Thread Alessandro Fanfarillo
Thanks for the quick feedback.

The patch has been committed on trunk as rev. 234502 and on
gcc-5-branch as rev. 234503 .

In attachment three simple test cases; coarray_stop_str.f90 fails
because of a syntax error in the check string.
I guess the problem is due to the presence of the "STOPPED" string
inside the check string; how can I manage this situation?

Thanks,

Alessandro

PS: Paul, I see this as a good exercise before working on Failed Images :-).


2016-03-27 19:14 GMT+02:00 Paul Richard Thomas :
> Hi Alessandro,
>
> The patch is fine for trunk and 5-branch - going on trivial, I would
> say! Are you going to add the testcase?
>
> Thanks a lot! I am impressed that you are doing these between
> celebrating your doctorate and preparing for your move :-)
>
> Paul
>
> On 27 March 2016 at 17:10, Alessandro Fanfarillo
>  wrote:
>> Dear all,
>>
>> the attached patch fixes the issue reported by Anton Shterenlikht
>> (https://gcc.gnu.org/ml/fortran/2016-03/msg00037.html). The compiler
>> delegates the external library to manage the STOP statement in case
>> -fcoarray=lib is used.
>>
>> Built and regtested on x86_64-pc-linux-gnu.
>>
>> Ok for trunk and gcc-5-branch?
>
>
>
> --
> The difference between genius and stupidity is; genius has its limits.
>
> Albert Einstein
! { dg-do compile }
! { dg-options "-fdump-tree-original -fcoarray=lib" }
!
! STOP managed by the external library
!
implicit none
  STOP
end

! { dg-final { scan-tree-dump-times "_gfortran_caf_stop_str \\(0B, 0\\);" 1 "original" } }
! { dg-do compile }
! { dg-options "-fdump-tree-original -fcoarray=lib" }
!
! STOP managed by the external library
!
implicit none
  STOP 100
end

! { dg-final { scan-tree-dump-times "_gfortran_caf_stop_numeric \\(100\\);" 1 "original" } }
! { dg-do compile }
! { dg-options "-fdump-tree-original -fcoarray=lib" }
!
! STOP managed by the external library
!
implicit none
  STOP "STOPPED"
end

! { dg-final { scan-tree-dump-times "_gfortran_caf_stop_str \\(&\\"STOPPED\\"[1]\\{lb: 1 sz: 1\\}, 7\\);" 1 "original" } }


Re: rs6000 stack_tie mishap again

2016-03-28 Thread Segher Boessenkool
On Mon, Mar 28, 2016 at 12:45:03PM +0200, Olivier Hainque wrote:
> > The normal -m32 compiler here generates code like
> > 
> > lwz 11,0(1)
> > 
> > and try as I might I cannot get it to fail.  Maybe because the GPR11
> > setup here involves a load?
> 
> You need to have had r11 last used to designate a global
> symbol as part of the function body (in order to have base_term
> designate a symbol_ref etc), and then have the scheduler
> decide that moving across is a good idea. It's certainly not
> an easy combination to trigger.

Yes, I did that (with some asm's).  Like this:

===
void g(int, char *);

int dum;

void f(int x)
{
char big[20];
g(x, big);
g(x, big);
register void *p asm("r11") = &dum;
asm("" : : "r"(p));
}
===

and the end of that becomes

===
bl g # 11   *call_nonlocal_sysvsi/1 [length = 4]
lis 11,dum@ha# 12   elf_high[length = 4]
la 11,dum@l(11)  # 13   elf_low [length = 4]
lwz 11,0(1)  # 33   *movsi_internal1/3  [length = 4]
lwz 0,4(11)  # 34   *movsi_internal1/3  [length = 4]
lwz 31,-4(11)# 36   *movsi_internal1/3  [length = 4]
mr 1,11  # 38   *movsi_internal1/1  [length = 4]
mtlr 0   # 35   *movsi_internal1/9  [length = 4]
blr  # 39   *return_internal_si [length = 4]
===

> > It seems clear that just considering the
> > prologue is enough to fix the original problem (frame pointer setup),
> > and problems like yours cannot happen in the prologue.
> 
> Right. What is unclear is if it's correct to consider only
> prologues here. ISTM we arrange to produce CFI for epilogues
> as well today, at least in some circumstances, and maybe the
> issue Richard had with prologues at the time would need to
> be addressed for epilogues as well today.

Correct is to do alias analysis in both the prologue and epilogue.
If we don't, ports have to do all kinds of stack ties etc. to fake it.

Currently we do neither, so doing just one is a step in the right
direction.

> > Better would be not to have this hack at all.
> 
> Sure.
> 
> >> My rough understanding is that we probably really care about frame_related
> >> insns only here, at least on targets where the flag is supposed to be set
> >> accurately.
> > 
> > On targets with DWARF2 unwind info the flag should be set on those insns
> > that need unwind info.  This does not include all insns in the epilogue
> > that access the frame, so I don't think this will help you?
> 
> My idea was that, maybe, the insns we need to see for alias analysis
> are precisely those for which the bit should not be set, which happened
> to be exactly the case in the problematic situation we hit. But as I said,
> I'm not 100% convinced the reasoning is globally correct.

All the register restores do not have the flag set, in most cases.
But they can in others (say, a target that does not optimise the CFI
stuff very well).

> >> This is the basis of the proposed patch, which essentially disconnects the
> >> shortcut above for !frame_related insns on targets which need precise alias
> >> analysis within epilogues, as indicated by a new target hook.
> > 
> > Eww.  Isn't that really all targets that schedule the epilogue at all?
> 
> Yes. Most of them use stronger barriers which the dependency analyser
> knows about, so aren't affected by this. 
> 
> That's a possible alternative approach for rs6000.

A clobber of scratch should work yes.  It's really heavy handed though,
we can move the GPR1 restore quite freely otherwise (in shrink-wrap),
but perhaps allowing scheduling to move it doesn't buy us much at all.

This doesn't solve the problem however that other dependencies in the
epilogue can be messed up in similar ways.


Segher


Re: rs6000 stack_tie mishap again

2016-03-28 Thread Olivier Hainque

> On Mar 28, 2016, at 05:01 , Segher Boessenkool  
> wrote:
> 
> The normal -m32 compiler here generates code like
> 
>   lwz 11,0(1)
> 
> and try as I might I cannot get it to fail.  Maybe because the GPR11
> setup here involves a load?

You need to have had r11 last used to designate a global
symbol as part of the function body (in order to have base_term
designate a symbol_ref etc), and then have the scheduler
decide that moving across is a good idea. It's certainly not
an easy combination to trigger.

>> We have observed this with a gcc 4.7 back-end and weren't able to reproduce
>> with a more recent version.
> 
> This makes it not a regression and thus out of scope for GCC 6.  We're
> supposed to stabilise things now ;-)

Sure, no problem if this is only for gcc 7. I posted the message
while the details were still fresh in my mind.

>>if (could_be_prologue_epilogue
>>&& prologue_epilogue_contains (insn))
>>  continue;
>> 
>> The motivation for this is unclear to me.
> 
> Alan linked to the history.

Right


>  It seems clear that just considering the
> prologue is enough to fix the original problem (frame pointer setup),
> and problems like yours cannot happen in the prologue.

Right. What is unclear is if it's correct to consider only
prologues here. ISTM we arrange to produce CFI for epilogues
as well today, at least in some circumstances, and maybe the
issue Richard had with prologues at the time would need to
be addressed for epilogues as well today.

> Better would be not to have this hack at all.

Sure.

>> My rough understanding is that we probably really care about frame_related
>> insns only here, at least on targets where the flag is supposed to be set
>> accurately.
> 
> On targets with DWARF2 unwind info the flag should be set on those insns
> that need unwind info.  This does not include all insns in the epilogue
> that access the frame, so I don't think this will help you?

My idea was that, maybe, the insns we need to see for alias analysis
are precisely those for which the bit should not be set, which happened
to be exactly the case in the problematic situation we hit. But as I said,
I'm not 100% convinced the reasoning is globally correct.

>> This is the basis of the proposed patch, which essentially disconnects the
>> shortcut above for !frame_related insns on targets which need precise alias
>> analysis within epilogues, as indicated by a new target hook.
> 
> Eww.  Isn't that really all targets that schedule the epilogue at all?

Yes. Most of them use stronger barriers which the dependency analyser
knows about, so aren't affected by this. 

That's a possible alternative approach for rs6000.

Thanks for your feedback,

Olivier



[PATCH, libgomp] Fix deadlock in acc_set_device_type

2016-03-28 Thread Chung-Lin Tang
Hi Jakub, there's a path for deadlock on acc_device_lock when going
through the acc_set_device_type() OpenACC library function.
Basically, the gomp_init_targets_once() function should not be
called with that held. The attached patch moves it appropriately.

Also in this patch, there are several cases in acc_* functions
where gomp_init_targets_once() is guarded by a test of
!cached_base_dev. Since that function already uses pthread_once() to
call gomp_target_init(), and technically cached_base_dev
is protected by acc_device_lock, the cleanest way should be to
simply drop those "if(!cached_base_dev)" tests.

Tested libgomp without regressions on an nvptx offloaded system,
is this okay for trunk?

Thanks,
Chung-Lin

2016-03-28  Chung-Lin Tang  

* oacc-init.c (acc_init): Remove !cached_base_dev condition on call to
gomp_init_targets_once().
(acc_set_device_type): Remove !cached_base_dev condition on call to
gomp_init_targets_once(), move call to before acc_device_lock acquire,
to avoid deadlock.
(acc_get_device_num): Remove !cached_base_dev condition on call to
gomp_init_targets_once().
(acc_set_device_num): Likewise.
Index: oacc-init.c
===
--- oacc-init.c (revision 234502)
+++ oacc-init.c (working copy)
@@ -433,8 +433,7 @@ goacc_attach_host_thread_to_device (int ord)
 void
 acc_init (acc_device_t d)
 {
-  if (!cached_base_dev)
-gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   gomp_mutex_lock (&acc_device_lock);
 
@@ -498,11 +497,10 @@ acc_set_device_type (acc_device_t d)
   struct gomp_device_descr *base_dev, *acc_dev;
   struct goacc_thread *thr = goacc_thread ();
 
+  gomp_init_targets_once ();
+
   gomp_mutex_lock (&acc_device_lock);
 
-  if (!cached_base_dev)
-gomp_init_targets_once ();
-
   cached_base_dev = base_dev = resolve_device (d, true);
   acc_dev = &base_dev[goacc_device_num];
 
@@ -563,8 +561,7 @@ acc_get_device_num (acc_device_t d)
   if (d >= _ACC_device_hwm)
 gomp_fatal ("unknown device type %u", (unsigned) d);
 
-  if (!cached_base_dev)
-gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   gomp_mutex_lock (&acc_device_lock);
   dev = resolve_device (d, true);
@@ -584,8 +581,7 @@ acc_set_device_num (int ord, acc_device_t d)
   struct gomp_device_descr *base_dev, *acc_dev;
   int num_devices;
 
-  if (!cached_base_dev)
-gomp_init_targets_once ();
+  gomp_init_targets_once ();
 
   if (ord < 0)
 ord = goacc_device_num;


Re: Proposed Patch for Bug 69687

2016-03-28 Thread Marcel Böhme
Hi Bernd: I submitted the filled disclaimer form on March 4th. Now, a 
representative of FSF mentioned that the copyright assignment is ready on their 
end.

Can someone please go ahead and review the patch?

Best regards,
- Marcel


> On 4 Mar 2016, at 1:43 AM, Bernd Schmidt  wrote:
> 
> On 03/03/2016 04:18 PM, Mike Stump wrote:
>> On Mar 3, 2016, at 6:55 AM, Marcel Böhme  wrote:
>>> I have revised the patch and removed the limits.
>> 
>> I looked at the patch, I can find no more unreasonable limits!  Wonderful.  
>> Hope someone will finish off the review and approve.
> 
> Marcel, if you haven't contributed before, we'll need a copyright assignment 
> from you before we can accept the patch. Do you already have one on file?
> 
> 
> Bernd
>