date:20140409

Re: RFA: Testsuite PATCH to add support for dlopen tests

2014-04-09 Thread Andreas Schwab

Jason Merrill ja...@redhat.com writes:

 Hmm, the PCH tests already use nested calls to dg-test,

Do they?  I don't think so.  There are calls to dg-test in dg-flags-pch,
which is called by dg-pch, and then pch.exp runs dg-pch on each test,
but I see no other dg-test in the call chain.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

Re: GCC's -fsplit-stack disturbing Mach's vm_allocate

2014-04-09 Thread Svante Signell

On Fri, 2014-04-04 at 21:14 +0200, Samuel Thibault wrote:
 Hello,
 
 Thomas Schwinge, le Wed 26 Jun 2013 23:30:03 +0200, a écrit :
  On Sat, 22 Jun 2013 08:15:46 -0700, Ian Lance Taylor i...@google.com 
  wrote:
   Go can work without split stack.  In that case libgo will use much
   larger stacks for goroutines, to reduce the chance of running out of
   stack space (see StackMin in libgo/runtime/proc.c).  So the number of
   simultaneous goroutines that can be run will be limited.  This is
   usually OK on x86_64 but it does hamper Go programs running on 32-bit
   x86.
  
  OK, but that's not the most pressing issue we're having right now.
  Anyway, as it stands, the split-stack code doesn't work on Hurd, so I
  disabled it in r200434 as follows:
 
 Maybe you'd want to re-enable it, now that we have got rid of threadvars :)

I don't think it is a good idea. I've patched gcc-4.9-20140406 to make
gccgo build and tested with -fsplit-stack enabled (with and without the
gold linker). Without split stack, around 70 libgo tests pass and 50
fail. With it enabled, all tests fail. Simple examples are the following
C code (from Thomas) and Go code:

1)
$ cat test_split_stack.c
#include <mach.h>
#include <stdio.h>

int main(void)
{
  int err;
  vm_address_t addr = 0;

  int i;
  for (i = 0; i < 3; ++i)
    {
      err = vm_allocate(mach_task_self(), &addr, 4096, 1);
      printf("%u %p\n", err, (void *) addr);
    }
  return 0;
}
$ gcc-4.9 test_split_stack.c -fsplit-stack
$ ./a.out
0 (nil)
0 0x102c000
0 0x1027800

$ gcc-4.9 test_split_stack.c
$ ./a.out
0 0x102c000
0 0x125b000
0 0x125c000

2)
cat hello.go:
package main

import "fmt"

func main() {
	fmt.Printf("Hello, world.\n")
}

gccgo-4.9 -g hello.go
 ./a.out
Hello, world.

LD_PRELOAD=../gcc-4.9-4.9-20140406/build/i486-gnu/libgo/.libs/libgo.so.5.0.0 
./a.out
mmap errno 1073741846
fatal error: mmap

runtime stack:
^C

Something is still not OK with the threads library?

Patches for gccgo on GNU/Hurd will be submitted to the Debian BTS.

Re: GCC's -fsplit-stack disturbing Mach's vm_allocate

2014-04-09 Thread Thomas Schwinge

Hi!

On Wed, 9 Apr 2014 09:05:46 +0200, Svante Signell svante.sign...@gmail.com 
wrote:
 On Fri, 2014-04-04 at 21:14 +0200, Samuel Thibault wrote:
  Thomas Schwinge, le Wed 26 Jun 2013 23:30:03 +0200, a écrit :
   On Sat, 22 Jun 2013 08:15:46 -0700, Ian Lance Taylor i...@google.com 
   wrote:
Go can work without split stack.  In that case libgo will use much
larger stacks for goroutines, to reduce the chance of running out of
stack space (see StackMin in libgo/runtime/proc.c).  So the number of
simultaneous goroutines that can be run will be limited.  This is
usually OK on x86_64 but it does hamper Go programs running on 32-bit
x86.
   
   OK, but that's not the most pressing issue we're having right now.
   Anyway, as it stands, the split-stack code doesn't work on Hurd, so I
   disabled it in r200434 as follows:
  
  Maybe you'd want to re-enable it, now that we have got rid of threadvars :)
 
 I don't think it is a good idea. I've patched gcc-4.9-20140406 to make
 gccgo build and tested with -fsplit-stack enabled (with and without the
 gold linker). Without split stack around 70 libgo tests pass and 50
 fails. With it enabled all tests fail. [...]

 LD_PRELOAD=../gcc-4.9-4.9-20140406/build/i486-gnu/libgo/.libs/libgo.so.5.0.0 
 ./a.out
 mmap errno 1073741846
 fatal error: mmap

Well, the first step is to verify that TARGET_THREAD_SPLIT_STACK_OFFSET
and similar configury is correct for the Hurd, and figure out what's
going on with the zero-page unmapping (discussed earlier in this thread),
and then mmap failing with 1073741846 (EINVAL).
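
The number 1073741846 is easier to read once it is split into its parts. The sketch below assumes that a Hurd errno value consists of a fixed system code (0x10 << 26, that is 0x40000000) plus the conventional small error number in the low 14 bits; the macro and function names are made up for illustration and are not taken from any header.

```c
#include <assert.h>

/* Assumed layout of a Hurd errno value: system code 0x10 in the top
   bits, conventional error number in the low 14 bits.  */
#define HURD_ERRNO_SYSTEM (0x10u << 26)   /* 0x40000000 */
#define HURD_ERRNO_MASK   0x3fffu

/* Hypothetical helper: strip the system code from a Hurd errno value
   and return the small error number.  */
unsigned hurd_errno_code(unsigned err)
{
  /* Everything above the low 14 bits must be exactly the system code.  */
  assert((err & ~HURD_ERRNO_MASK) == HURD_ERRNO_SYSTEM);
  return err & HURD_ERRNO_MASK;
}
```

Under that assumption, hurd_errno_code(1073741846) gives 22, which is the number usually assigned to EINVAL.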


 Patches for gccgo on GNU/Hurd will be submitted to the Debian BTS.

Just a suggestion, but in my opinion, it'd make more sense to first get
such patches integrated upstream.  (Same also for the GCC Ada patches.)
GCC Go support (as well as Ada) clearly is not a critical thing to first
get into Debian GNU/Hurd, and the total maintenance/integration overhead
would be lower if these patches would just percolate into Debian GCC from
upstream.


Grüße,
 Thomas



Re: [PATCH] Fix for PR libstdc++/60758

2014-04-09 Thread Alexey Merzlyakov


On 04.04.2014 14:44, Alexey Merzlyakov wrote:

Hi all,

Here is a patch that fixes infinite backtraces in __cxa_end_cleanup().
The Bugzilla entry for this: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60758


The __cxa_end_cleanup() does not save/restore LR in function 
header/footer and does not provide any unwind info,
which causes problems with GDB and other tools (e.g. unwind code in 
libgcc, libbacktrace, etc.).


Best regards,
Merzlyakov Alexey

2014-04-03  Alexey Merzlyakov alexey.merzlya...@samsung.com

PR libstdc++/60758
* libsupc++/eh_arm.cc (__cxa_end_cleanup): Add LR save/restore.

diff --git a/libstdc++-v3/libsupc++/eh_arm.cc b/libstdc++-v3/libsupc++/eh_arm.cc
index aa453dd..ead1e61 100644
--- a/libstdc++-v3/libsupc++/eh_arm.cc
+++ b/libstdc++-v3/libsupc++/eh_arm.cc
@@ -206,9 +206,9 @@ asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	.type __cxa_end_cleanup, \"function\"\n"
 "	.thumb_func\n"
 "__cxa_end_cleanup:\n"
-"	push\t{r1, r2, r3, r4}\n"
+"	push\t{r1, r2, r3, r4, lr}\n"
 "	bl\t__gnu_end_cleanup\n"
-"	pop\t{r1, r2, r3, r4}\n"
+"	pop\t{r1, r2, r3, r4, lr}\n"
 "	bl\t_Unwind_Resume @ Never returns\n"
 "	.popsection\n");
 #else
@@ -216,9 +216,9 @@ asm ("  .pushsection .text.__cxa_end_cleanup\n"
 "	.global __cxa_end_cleanup\n"
 "	.type __cxa_end_cleanup, \"function\"\n"
 "__cxa_end_cleanup:\n"
-"	stmfd\tsp!, {r1, r2, r3, r4}\n"
+"	stmfd\tsp!, {r1, r2, r3, r4, lr}\n"
 "	bl\t__gnu_end_cleanup\n"
-"	ldmfd\tsp!, {r1, r2, r3, r4}\n"
+"	ldmfd\tsp!, {r1, r2, r3, r4, lr}\n"
 "	bl\t_Unwind_Resume @ Never returns\n"
 "	.popsection\n");
 #endif



Forgot to mention:
the patch has been tested on ARM - no regressions.

Best regards,
Merzlyakov Alexey

Re: [PATCH][C++] Fix PR60761, diagnostics in clones

2014-04-09 Thread Richard Biener

On April 8, 2014 8:03:08 PM CEST, Jason Merrill ja...@redhat.com wrote:
On 04/08/2014 07:58 AM, Richard Biener wrote:
 Jason, is <clone> good or shall I use sth else (do we annotate
in-charge vs. not in-charge
 constructors specially for example?).

The names of the in-charge and not-in-charge constructor clones are 
complete_ctor_identifier and base_ctor_identifier (and dtor for 
destructors); you could check for those.

I was asking more about how we present those to the user in diagnostics. I
wanted to use a consistent 'quoting' style. If using <clone> is fine then
I'll just stick to that.

OK for trunk?
Thanks,
Richard.

Jason

Re: [PATCH][C++] Fix PR60761, diagnostics in clones

2014-04-09 Thread Martin Jambor

Hi,

On Tue, Apr 08, 2014 at 01:58:06PM +0200, Richard Biener wrote:
 
 This fixes PR60761 by dumping decl context of function clones
 as origin with <clone> appended instead of <built-in> that now
 appears after we (compared to 4.8) clear DECL_LANG_SPECIFIC.
 
 Thus for the testcase in PR60761 we now print
 
 t.ii: In function 'void foo(int) <clone>':
 t.ii:14:13: warning: iteration 3u invokes undefined behavior 
 [-Waggressive-loop-optimizations]
  z[i] = i;
  ^
 t.ii:13:3: note: containing loop
    for (int i = 0; i < s; i++)
    ^
 t.ii:14:8: warning: array subscript is above array bounds [-Warray-bounds]
  z[i] = i;
 ^
 
 instead of
 
 t.ii: In function ‘<built-in>’:
 t.ii:14:13: warning: iteration 3u invokes undefined behavior 
 [-Waggressive-loop-optimizations]
  z[i] = i;
  ^
 t.ii:13:3: note: containing loop
    for (int i = 0; i < s; i++)
    ^
 t.ii:14:8: warning: array subscript is above array bounds [-Warray-bounds]
  z[i] = i;
 ^
 
 or with 4.8
 
 t.ii: In function ‘void _Z3fooi.constprop.0()’:
 t.ii:14:8: warning: array subscript is above array bounds [-Warray-bounds]
  z[i] = i;
 ^
 
 IMHO an improvement over both variants.
 
 Bootstrap and regtest running on x86_64-unknown-linux-gnu.
 
 Honza - does ->former_clone_of apply recursively or do I have to
 loop to find the ultimate clone-of?  Jason, is <clone> good
 or shall I use sth else (do we annotate in-charge vs. not in-charge
 constructors specially for example?).
 
 Ok?
 
 Thanks,
 Richard.
 
 2014-04-08  Richard Biener  rguent...@suse.de
 
   cp/
   * error.c: Include cgraph.h
   (dump_decl): Print function clones as their origin plus <clone>
   appended instead of just <built-in>.
 
 Index: gcc/cp/error.c
 ===
 *** gcc/cp/error.c(revision 209210)
 --- gcc/cp/error.c(working copy)
 *** along with GCC; see the file COPYING3.
 *** 34,39 
 --- 34,40 
   #include "pointer-set.h"
   #include "c-family/c-objc.h"
   #include "ubsan.h"
 + #include "cgraph.h"
   
   #include <new>    // For placement-new.
   
 *** dump_decl (cxx_pretty_printer *pp, tree
 *** 1145,1151 
   
   case FUNCTION_DECL:
 if (! DECL_LANG_SPECIFIC (t))
 ! pp_string (pp, M_("<built-in>"));
 else if (DECL_GLOBAL_CTOR_P (t) || DECL_GLOBAL_DTOR_P (t))
   dump_global_iord (pp, t);
 else
 --- 1146,1162 
   
   case FUNCTION_DECL:
 if (! DECL_LANG_SPECIFIC (t))
 ! {
 !   cgraph_node *node;
 !   if ((node = cgraph_get_node (t))
 !       && node->former_clone_of)
 !     {
 !       dump_decl (pp, node->former_clone_of, flags);
 !       pp_string (pp, M_(" <clone>"));
 !     }
 !   else
 !     pp_string (pp, M_("<built-in>"));
 ! }
 else if (DECL_GLOBAL_CTOR_P (t) || DECL_GLOBAL_DTOR_P (t))
   dump_global_iord (pp, t);
 else

I think you should use DECL_ABSTRACT_ORIGIN instead of
former_clone_of.  Not only do you avoid using cgraph stuff here, but
unlike this patch, it also works for IPA-CP clones of IPA-SRA clones
(yeah, I know, but I bet I can cause the same havoc by ipa-split
instead of ipa-sra, just not as easily).

The testcase is similar:

extern int sum;

void do_sum (char *p)
{
  for (int i = 0; i < 2; i++)
    sum += p[i];
}

static void
foo (int s, int unused)
{
  char z[3];
  for (int i = 0; i < s; i++)
    z[i] = i;
  do_sum (z);
}

int
bar (int i)
{
  foo (4, 3);
  return 0;
}
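
The "clone of a clone" case raised above, and the earlier question of whether one has to loop to reach the ultimate origin, can be pictured with a toy model. The sketch below is not GCC code: struct toy_decl, its origin field and ultimate_origin are hypothetical stand-ins, and it makes no claim about which GCC field already points at the ultimate origin. It only shows the loop that is needed when each clone records just its immediate predecessor.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a function declaration; not GCC's tree or
   cgraph_node.  */
struct toy_decl {
  const char *name;
  const struct toy_decl *origin;  /* NULL for the function as written */
};

/* Follow the chain of origins until the function the user wrote.  */
const struct toy_decl *ultimate_origin(const struct toy_decl *d)
{
  assert(d != NULL);
  while (d->origin != NULL)
    d = d->origin;
  return d;
}
```

With a chain such as foo, a clone of foo, and a clone of that clone, ultimate_origin returns foo for all three, which is the name a diagnostic would want to print.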


Thanks,

Martin

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Jakub Jelinek

On Fri, Apr 04, 2014 at 10:38:49AM -0500, Bill Schmidt wrote:
 Thanks to everyone who helped with development, testing, and review of
 the patch set!  I've committed the changes to 4.8 this morning.  Note
 that patch 15/26 was rejected as not really germane to this series and
 has been submitted separately by Peter Bergner.

While trying to merge this to redhat/gcc-4_8-branch, I've so far noticed
that you have merged in the r199972 change (apparently without a ChangeLog
entry), but not the r202642 change that reverted it later on.
Can you please revert that one-line change?

Jakub

RE: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

2014-04-09 Thread Robert Suchanek

 FYI, all other targets that have LRA optionally selectable or deselectable
 use -mno-lra for this (even when -mlra is the default), it would be better
 for consistency not to invent new switch names for that.

Agreed.

 -return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) 
 == 8;
 +return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
  
return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
  }
 Not sure about this one.  We would need to update the comment that
 explains why !strict_p is there, but AFAIK reason (1) would still apply.

 Was this needed for correctness or because it gave better code?

!strict_p has been removed because of a correctness issue. When LRA
validates memory addresses, pseudos are temporarily eliminated to hard
registers (if possible), but the target hook is always called as
non-strict. This only affects MIPS16 instructions, where $sp is not
directly accessible. The strict variant, as I understand it, was used in
the reload pass to indicate whether a pseudo-register had been allocated a
hard register. Unless LRA should be setting strict/non-strict depending on
whether a temporary elimination to a hard reg was successful, or there is
something else that I missed?

 Easier as:

   if (TARGET_MIPS16
       && TEST_HARD_REG_BIT (reg_class_contents[M16_REGS], hard_regno))
     return 1;
   return 0;

Indeed. 

 +  M16F_REGS,/* mips16 + frame */

 Constraints are supposed to be operating on real registers, after
 elimination, so it seems odd to include a fake register.  What went
 wrong with just M16_REGS?

Only the stack pointer has been added to M16_REGS. A number of patterns
need to accept it; otherwise LRA inserts a lot of reloads and the code size
goes up by about 10%. The change also has a positive, though marginal,
effect on reload. "frame" was meant to indicate inclusion of both the stack
and hard frame pointers in the class, but perhaps I should name it
differently to avoid confusion.

 +  SPILL_REGS,   /* All but $sp and call preserved regs 
 are in here */
...
 +  { 0x0003fffc, 0x, 0x, 0x, 0x, 0x 
 },   /* SPILL_REGS */\

 These two don't seem to match.  I think literally it would be 0x0300fffc,
 but maybe you had to make SPILL_REGS a superset of M16_REGs?

I initially used 0x0300fffc but did some experiments, and it turned out
that 0x0003fffc (with the $16 and $17 registers) gives slightly better
code. I haven't updated the comment, though. There is more to do, and I
need to return to another thread about MIPS16 at some point, as I found
some limitations in IRA/LRA that prevent better code generation. $8-$15 are
currently inaccessible as temporary storage because these registers are
marked as fixed (when optimizing for size), but leaving them fixed is
better for code size. I don't expect a big gain from using hard registers
for spilling, but it is more likely to improve performance.
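
To see what the two candidate masks actually contain, a small bit-decoding sketch helps. It assumes the usual convention that bit n of the first mask word stands for hard register n (so $n for the general-purpose registers); the helper names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if register number REGNO (0..31) is in the 32-bit MASK.  */
int mask_has_reg(uint32_t mask, unsigned regno)
{
  assert(regno < 32);
  return (int) ((mask >> regno) & 1u);
}

/* Count the registers in MASK by clearing the lowest set bit until
   nothing is left.  */
unsigned mask_count(uint32_t mask)
{
  unsigned n = 0;
  for (; mask != 0; mask &= mask - 1)
    n++;
  return n;
}
```

Under that assumption, 0x0003fffc covers $2-$17 and 0x0300fffc covers $2-$15 plus $24 and $25, so both masks hold sixteen registers and differ only in trading $24/$25 for $16/$17.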

 +/* Add costs to hard registers based on frequency. This helps to negate
 +   some of the reduced cost associated with argument registers which 
 +   unfairly promotes their use and increases register pressure */
 +#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(REGNO)   \
 +  (TARGET_MIPS16 && optimize_size                   \
 +   ? ((REGNO) >= 4 && (REGNO) <= 7 ? 2 : 0) : 0)

 So we would be trying to use, say, $4 for the first incoming argument
 even after it had been spilled?  Hmm.

 Since this change is aimed specifically at one heuristic, I wonder
 whether we should parameterise that heuristic somehow rather than
 try to use a general hook to undo it.  But I don't think there's
 anything particularly special about MIPS16 here, so maybe it's too
 eager for all targets.

In a number of cases argument registers appeared to be unfairly promoted
increasing the register pressure and increasing the number of reloads.
Bumping up the cost of using those registers encourages IRA to spill
into memory, but this appears to help LRA do a better allocation. Of course,
it is not always a win, but generally the gain outweighs the loss.

I've seen code size improvements at other optimization levels
but I'm unsure whether we should make this change generic.
This part of the patch is not crucial, though, and can be sent separately.

  (define_insn "*and<mode>3_mips16"
 ...

 I think we want to keep the LWU case at the very least, since I assume
 otherwise we'll load the full 64-bit spill slot and use a pair of shifts
 to zero-extend it.  Reloading the stack address into a base register
 should always be better than that.

 I agree it's less clear for the byte and halfword cases.  All other
 things -- including size -- being equal, reloading a stack address into
 a base register and using an extending load should be better than
 reloading the full spill slot and extending it, since the reloaded stack
 address is more likely to be reused in other instructions.

 The

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Jakub Jelinek

On Wed, Apr 09, 2014 at 11:51:54AM +0200, Jakub Jelinek wrote:
 On Fri, Apr 04, 2014 at 10:38:49AM -0500, Bill Schmidt wrote:
  Thanks to everyone who helped with development, testing, and review of
  the patch set!  I've committed the changes to 4.8 this morning.  Note
  that patch 15/26 was rejected as not really germane to this series and
  has been submitted separately by Peter Bergner.
 
 While trying to merge this to redhat/gcc-4_8-branch, I've so far noticed
 that you have merged in the r199972 change (apparently without ChangeLog 
 entry),
 without r202642 change that reverted it later on.
 Can you please revert that one liner change?

Another issue is bad toplevel ChangeLog entries.
2014-04-04  Bill Schmidt  wschm...@linux.vnet.ibm.com

Backport from mainline
2013-11-22  Ulrich Weigand  ulrich.weig...@de.ibm.com

* libgo/config/libtool.m4: Update to mainline version.
* libgo/configure: Regenerate.

2013-11-17  Ulrich Weigand  ulrich.weig...@de.ibm.com

* libgo/config/libtool.m4: Update to mainline version.
* libgo/configure: Regenerate.

2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com

* libtool.m4: Update to mainline version.
* libjava/libltdl/acinclude.m4: Likewise.

* gcc/configure: Regenerate.
* boehm-gc/configure: Regenerate.
* libatomic/configure: Regenerate.
* libbacktrace/configure: Regenerate.
* libffi/configure: Regenerate.
* libgfortran/configure: Regenerate.
* libgomp/configure: Regenerate.
* libitm/configure: Regenerate.
* libjava/configure: Regenerate.
* libjava/libltdl/configure: Regenerate.
* libjava/classpath/configure: Regenerate.
* libmudflap/configure: Regenerate.
* libobjc/configure: Regenerate.
* libquadmath/configure: Regenerate.
* libsanitizer/configure: Regenerate.
* libssp/configure: Regenerate.
* libstdc++-v3/configure: Regenerate.
* lto-plugin/configure: Regenerate.
* zlib/configure: Regenerate.

Except for the libtool.m4 change, which is a toplevel change, all
those changes are to files in subdirectories which have their own ChangeLog
file (or, in the case of libjava/classpath, ChangeLog.gcj); the ChangeLog
entries should go into those directories rather than the toplevel ChangeLog.

Jakub

RE: [PATCH v7?] PR middle-end/60281

2014-04-09 Thread Bernd Edlinger

Hi Lin,

thanks for clarifying this.

If you say you can't sign the FSF copyright assignment,
we can't use your patch, I'm afraid.

Well, I was curious how to proceed, because these unaligned
stm instructions are also a problem under linux.

The test cases don't fail, because the exception handler emulates
these instructions, but that is quite annoying, because
each time the test suite runs, there are many syslog
entries complaining about unaligned stm instructions.

Actually I had begun to work on a patch for this issue at
about the same time when you posted your patch.

As it looked like your patch was likely to be approved soon,
I decided to wait for your patch.

But now I think, maybe I should propose my alternative
patch instead, if you don't mind.


Regards
Bernd.

Re: [PATCH v7?] PR middle-end/60281

2014-04-09 Thread lin zuojian

Hi Bernd,
It seems we are not talking about the same problem. You should first make
sure what has been going wrong.
And I will sign it.
--
Regards
lin zuojian

RE: [PATCH v7?] PR middle-end/60281

2014-04-09 Thread Bernd Edlinger

Hi Lin,


 It seems we are not talking about the same problem. You should first make
 sure what has been going wrong.

Maybe I misunderstood your point.

 And I will sign it.
 --
 Regards
 lin zuojian


Ok, then please do it.

Once you have signed it, and got the approval by a global GCC reviewer,
I would be happy to assist you in committing that patch, if you like.

Regards
Bernd.

Re: [PATCH v7?] PR middle-end/60281

2014-04-09 Thread lin zuojian

Hi Bernd,
I am asking them whether they would accept a scanned image version. The
post office is so 90's.
--
Regards
lin zuojian

Re: [PATCH] Fix for PR libstdc++/60758

2014-04-09 Thread Ramana Radhakrishnan


On 04/09/14 09:07, Alexey Merzlyakov wrote:

On 04.04.2014 14:44, Alexey Merzlyakov wrote:

Hi all,

Here is a patch, that fixes infinite backtraces in __cxa_end_cleanup().
The Bugzilla entry for this: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60758

The __cxa_end_cleanup() does not save/restore LR in function
header/footer and does not provide any unwind info,


So, your patch saves / restores LR to allow the prologue parser in GDB 
to get this right and still doesn't provide unwind info. It would be 
better to make that work correctly as well while you are here by 
providing the appropriate cfi directives.



which causes problems with GDB and other tools (e.g. unwind code in
libgcc, libbacktrace, etc.).

Best regards,
Merzlyakov Alexey

2014-04-03  Alexey Merzlyakov alexey.merzlya...@samsung.com

 PR libstdc++/60758
 * libsupc++/eh_arm.cc (__cxa_end_cleanup): Add LR save/restore.

diff --git a/libstdc++-v3/libsupc++/eh_arm.cc b/libstdc++-v3/libsupc++/eh_arm.cc
index aa453dd..ead1e61 100644
--- a/libstdc++-v3/libsupc++/eh_arm.cc
+++ b/libstdc++-v3/libsupc++/eh_arm.cc
@@ -206,9 +206,9 @@ asm ("  .pushsection .text.__cxa_end_cleanup\n"
  "	.type __cxa_end_cleanup, \"function\"\n"
  "	.thumb_func\n"
  "__cxa_end_cleanup:\n"
-"	push\t{r1, r2, r3, r4}\n"
+"	push\t{r1, r2, r3, r4, lr}\n"


So if you are doing that, please replace r4 by lr? r4 is a callee-saved
register and is used here purely to keep the stack aligned to 64 bits. Not
doing that isn't ideal here, even though things will work because
__cxa_end_cleanup is part of this.
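
The alignment point comes down to simple arithmetic: each 32-bit core register in a push list takes 4 bytes, so a list of four registers moves sp by 16 bytes while a list of five moves it by 20. A minimal sketch of that calculation, with function names made up for illustration:

```c
#include <assert.h>

/* Bytes pushed by a register list of NREGS 32-bit core registers.  */
unsigned push_bytes(unsigned nregs)
{
  assert(nregs <= 16);  /* there are only 16 core registers */
  return nregs * 4u;
}

/* 1 if pushing NREGS registers leaves an 8-byte-aligned sp still
   8-byte aligned, 0 otherwise.  */
int keeps_8byte_alignment(unsigned nregs)
{
  return push_bytes(nregs) % 8u == 0;
}
```

So {r1, r2, r3, r4} and {r1, r2, r3, lr} both keep 64-bit alignment, whereas {r1, r2, r3, r4, lr} does not.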



  "	bl\t__gnu_end_cleanup\n"
-"	pop\t{r1, r2, r3, r4}\n"
+"	pop\t{r1, r2, r3, r4, lr}\n"
  "	bl\t_Unwind_Resume @ Never returns\n"
  "	.popsection\n");
  #else
@@ -216,9 +216,9 @@ asm ("  .pushsection .text.__cxa_end_cleanup\n"
  "	.global __cxa_end_cleanup\n"
  "	.type __cxa_end_cleanup, \"function\"\n"
  "__cxa_end_cleanup:\n"
-"	stmfd\tsp!, {r1, r2, r3, r4}\n"
+"	stmfd\tsp!, {r1, r2, r3, r4, lr}\n"


and likewise.


  "	bl\t__gnu_end_cleanup\n"
-"	ldmfd\tsp!, {r1, r2, r3, r4}\n"
+"	ldmfd\tsp!, {r1, r2, r3, r4, lr}\n"
  "	bl\t_Unwind_Resume @ Never returns\n"
  "	.popsection\n");
  #endif



Forgot to mention:
the patch has been tested on ARM - no regressions.


And what do you mean by that?

arm-eabi, arm-linux-gnueabi(hf), with/without Neon, ARM state/Thumb state?




regards
Ramana



Best regards,
Merzlyakov Alexey




--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 The only changes I've found are: (in the previously attached patch)
 (the other commits refer to
 2014-01-26: lynxos
 2014-01-24: android
 2014-01-20: linux
 2013-01-29 : vms
 and they are not related to the patches needing a revert.

OK, thanks for the clarification.  Let's try to find a middle ground though so 
that Linux/x32 is not totally broken.

I think that we definitely want to revert the s-osinte-posix.adb change, which 
is a blatant violation of POSIX.  Which means that Linux/x32 cannot use that 
file and we therefore need s-osinte-x32.adb, but the file is relatively small.

In order to avoid creating more x32-specific files, I think that we need to 
move the definition of 'struct timespec' and 'struct timeval' (both specified 
by POSIX) to s-linux.ads.  This requires with'ing Interfaces.C, but I think 
that's OK since s-linux.ads is a spin-off of s-osinte-linux.ads which also 
with'es Interfaces.C.

What do you think, Arno?  I think that the POSIX breakage (and its fallout for 
the other Unices) is ugly and worth the additional complication.
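
For reference, POSIX declares tv_nsec in struct timespec as a long, which is what the revert restores in the Ada bindings. A quick C sketch for checking how a given C library lays the field out; the function names are hypothetical, and my understanding (not stated in this thread) is that x32 is the case where the field is wider than long:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Size in bytes of the tv_nsec member as the C library declares it.  */
size_t tv_nsec_size(void)
{
  return sizeof(((struct timespec *) 0)->tv_nsec);
}

/* 1 if tv_nsec has the same size as long, as the POSIX declaration
   suggests, 0 otherwise.  */
int tv_nsec_matches_long(void)
{
  size_t n = tv_nsec_size();
  assert(n == 4 || n == 8);  /* sanity check for common ABIs */
  return n == sizeof(long);
}
```

A target where tv_nsec_matches_long() returns 0 is one that needs its own binding rather than the shared POSIX one.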


PR ada/54040
PR ada/59346
* s-osinte-x32.adb: New file.
* s-linux.ads (Time): New section.
* s-linux-alpha.ads (Time): Likewise.
* s-linux-android.ads (Time): Likewise.
* s-linux-hppa.ads (Time): Likewise.
* s-linux-mipsel.ads (Time): Likewise.
* s-linux-sparc.ads (Time): Likewise.
* s-linux-x32.ads (Time): Likewise.
* s-osinte-linux.ads (Time): Define local subtypes for those defined
in System.Linux.
* s-taprop-linux.adb (Monotonic_Clock): Do not define timeval.
* s-osinte-hpux.ads (timespec): Revert POSIX breakage.
* s-osinte-kfreebsd-gnu.ads (timespec): Likewise.
* s-osinte-solaris-posix.ads (timespec): Likewise.
* s-osinte-posix.adb (To_Timespec): Likewise.
* gcc-interface/Makefile.in (x32/Linux): Use s-osinte-x32.adb.


-- 
Eric Botcazou

Index: s-osinte-linux.ads
===
--- s-osinte-linux.ads	(revision 209236)
+++ s-osinte-linux.ads	(working copy)
@@ -7,7 +7,7 @@
 --  S p e c --
 --  --
 -- Copyright (C) 1991-1994, Florida State University--
---  Copyright (C) 1995-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 1995-2014, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -217,8 +217,9 @@ package System.OS_Interface is
-- Time --
--
 
-   type timespec is private;
-   type time_t is private;
+   subtype time_t   is System.Linux.time_t;
+   subtype timespec is System.Linux.timespec;
+   subtype timeval  is System.Linux.timeval;
 
function To_Duration (TS : timespec) return Duration;
pragma Inline (To_Duration);
@@ -598,14 +599,6 @@ private
 
type pid_t is new int;
 
-   type time_t is new System.Linux.time_t;
-
-   type timespec is record
-  tv_sec  : time_t;
-  tv_nsec : time_t;
-   end record;
-   pragma Convention (C, timespec);
-
type unsigned_long_long_t is mod 2 ** 64;
--  Local type only used to get the alignment of this type below
 
Index: s-osinte-hpux.ads
===
--- s-osinte-hpux.ads	(revision 209236)
+++ s-osinte-hpux.ads	(working copy)
@@ -7,7 +7,7 @@
 --  S p e c --
 --  --
 --   Copyright (C) 1991-1994, Florida State University  --
---Copyright (C) 1995-2013, Free Software Foundation, Inc.   --
+--Copyright (C) 1995-2014, Free Software Foundation, Inc.   --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -522,7 +522,7 @@ private
 
type timespec is record
   tv_sec  : time_t;
-  tv_nsec : time_t;
+  tv_nsec : long;
end record;
pragma Convention (C, timespec);
 
Index: s-linux-android.ads
===
--- s-linux-android.ads	(revision 209236)
+++ s-linux-android.ads	(working copy)
@@ -35,14 +35,30 @@
 --  PLEASE DO NOT add any with-clauses to this package or remove the pragma
 --  Preelaborate. This package is designed to be a bottom-level (leaf) package
 
+with Interfaces.C;
+
 package System.Linux is
pragma Preelaborate;
 
-   
-   --

Re: Fix PR60644

2014-04-09 Thread Alexander Ivchenko

ping..

2014-04-04 15:28 GMT+04:00 Alexander Ivchenko aivch...@gmail.com:
 2014-04-04 14:19 GMT+04:00 Richard Biener richard.guent...@gmail.com:
 On Fri, Apr 4, 2014 at 12:03 PM, Alexander Ivchenko aivch...@gmail.com 
 wrote:
 *ping*

 I wonder whether this is consistent between compilers (note GCC is not
 upstream here?).  So eventually all places should be ANDROID || __ANDROID__?

 I checked that gcc-4.[678], llvm (trunk) and icc (14)  all have
 __ANDROID__. If I understood your question correctly..
 I don't see any reasons to check ANDROID macros during the build of 
 libcilkrts.

 2014-03-27 13:43 GMT+04:00 Alexander Ivchenko aivch...@gmail.com:
 Adding Balaji.

 --Alexander

 2014-03-26 18:56 GMT+04:00 Alexander Ivchenko aivch...@gmail.com:
 Hi,

 In gcc/config/linux-android.h we have builtin_define ("__ANDROID__");
 So ANDROID as in libcilkrts now is not the correct macro to check.

 Bootstrapped and passed cilk testsuite on x86_64-unknown-linux-gnu.

 diff --git a/libcilkrts/ChangeLog b/libcilkrts/ChangeLog
 index eb0d6ec..65efef0 100644
 --- a/libcilkrts/ChangeLog
 +++ b/libcilkrts/ChangeLog
 @@ -1,3 +1,12 @@
 +2014-03-26  Alexander Ivchenko  alexander.ivche...@intel.com
 +
 + PR bootstrap/60644
 +
 + * include/cilk/metaprogramming.h: Change ANDROID to __ANDROID__.
 + * include/cilk/reducer_min_max.h: Ditto.
 + * runtime/bug.h: Ditto.
 + * runtime/os-unix.c: Ditto.
 +
  2014-03-20  Tobias Burnus  bur...@net-b.de

   PR other/60589
 diff --git a/libcilkrts/include/cilk/metaprogramming.h
 b/libcilkrts/include/cilk/metaprogramming.h
 index 5f6f29d..29b0839 100644
 --- a/libcilkrts/include/cilk/metaprogramming.h
 +++ b/libcilkrts/include/cilk/metaprogramming.h
 @@ -468,7 +468,7 @@ inline void* allocate_aligned(std::size_t size,
 std::size_t alignment)
  #ifdef _WIN32
  return _aligned_malloc(size, alignment);
  #else
 -#if defined(ANDROID) || defined(__ANDROID__)
 +#if defined(__ANDROID__)
  return memalign(std::max(alignment, sizeof(void*)), size);
  #else
  void* ptr;
 diff --git a/libcilkrts/include/cilk/reducer_min_max.h
 b/libcilkrts/include/cilk/reducer_min_max.h
 index 55f068c..7fe09e8 100644
 --- a/libcilkrts/include/cilk/reducer_min_max.h
 +++ b/libcilkrts/include/cilk/reducer_min_max.h
 @@ -3025,7 +3025,7 @@ struct legacy_reducer_downcast reducer
 op_min_indexIndex, Type, Compare, Alig
  #include <limits.h>

  /* Wchar_t min/max constants */
 -#if defined(_MSC_VER) || defined(ANDROID)
 +#if defined(_MSC_VER) || defined(__ANDROID__)
  #   include <wchar.h>
  #else
  #   include <stdint.h>
 diff --git a/libcilkrts/runtime/bug.h b/libcilkrts/runtime/bug.h
 index bb18913..1a64bea 100644
 --- a/libcilkrts/runtime/bug.h
 +++ b/libcilkrts/runtime/bug.h
 @@ -90,7 +90,7 @@ COMMON_PORTABLE extern const char *const
 __cilkrts_assertion_failed;
   * GPL V3 licensed.
   */
  COMMON_PORTABLE void cilkbug_assert_no_uncaught_exception(void);
 -#if defined(_WIN32) || defined(ANDROID)
 +#if defined(_WIN32) || defined(__ANDROID__)
  #  define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION()
  #else
  #  define CILKBUG_ASSERT_NO_UNCAUGHT_EXCEPTION() \
 diff --git a/libcilkrts/runtime/os-unix.c b/libcilkrts/runtime/os-unix.c
 index fafb91d..85bc08d 100644
 --- a/libcilkrts/runtime/os-unix.c
 +++ b/libcilkrts/runtime/os-unix.c
 @@ -282,7 +282,7 @@ void __cilkrts_init_tls_variables(void)
  }
  #endif

 -#if defined (__linux__) && ! defined(ANDROID)
 +#if defined (__linux__) && ! defined(__ANDROID__)
  /*
   * Get the thread id, rather than the pid. In the case of MIC offload, 
 it's
   * possible that we have multiple threads entering Cilk, and each has a
 @@ -354,7 +354,7 @@ static int linux_get_affinity_count (int tid)

  COMMON_SYSDEP int __cilkrts_hardware_cpu_count(void)
  {
 -#if defined ANDROID || (defined(__sun__) && defined(__svr4__))
 +#if defined __ANDROID__ || (defined(__sun__) && defined(__svr4__))
  return sysconf (_SC_NPROCESSORS_ONLN);
  #elif defined __MIC__
  /// HACK: Usually, the 3rd and 4th hyperthreads are not beneficial
 @@ -409,7 +409,7 @@ COMMON_SYSDEP void __cilkrts_yield(void)
  // giving up the processor and latency starting up when work becomes
  // available
  _mm_delay_32(1024);
 -#elif defined(ANDROID) || (defined(__sun__) && defined(__svr4__))
 +#elif defined(__ANDROID__) || (defined(__sun__) && defined(__svr4__))
  // On Android and Solaris, call sched_yield to yield quantum.  I'm 
 not
  // sure why we don't do this on Linux also.
  sched_yield();




 Is it OK?

 --Alexander

Re: Fix PR60644

2014-04-09 Thread Jakub Jelinek

On Wed, Apr 09, 2014 at 03:46:13PM +0400, Alexander Ivchenko wrote:
 ping..

I guess it really depends on whether the libcilkrts sources are going to be
(semi?)regularly imported from some upstream repository or not, and if the
upstream is willing to accept these changes.

The alternative is to modify libcilkrts/configure.ac and/or
libcilkrts/Makefile.am, so that on Android target -DANDROID is passed to
libcilkrts compilation, if libcilkrts upstream wouldn't be willing to accept
the change.
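
The difference between the two macros can be made visible with a few lines of C. In this sketch, which is an illustration and not part of libcilkrts, __ANDROID__ is the macro the compiler defines by itself for Android targets, while ANDROID is only set if the build system passes -DANDROID; all three function names are invented here.

```c
#include <assert.h>

/* 1 when the compiler itself says the target is Android.  */
int target_is_android(void)
{
#if defined(__ANDROID__)
  return 1;
#else
  return 0;
#endif
}

/* 1 when the build passed -DANDROID (or otherwise defined ANDROID).  */
int build_defines_android(void)
{
#if defined(ANDROID)
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical start-up check: a build that defines ANDROID for a
   non-Android compiler target is most likely misconfigured.  */
void check_android_macros(void)
{
  assert(!build_defines_android() || target_is_android());
}
```

Sources that test only ANDROID depend on the build system getting the flag right; sources that test __ANDROID__ work with any compiler that defines it.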

So, we really need feedback from Balaji on this.

Jakub

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Arnaud Charlet

 What do you think, Arno?  I think that the POSIX breakage (and its fallout for
 the other Unices) is ugly and worth the additional complication.

Yes, your patch looks good to me.

Arno

Patch ping

2014-04-09 Thread Jakub Jelinek

Hi!

I'd like to ping:

- http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01370.html
  PR sanitizer/56781
  fix --with-build-config=bootstrap-ubsan bootstrap of lto-plugin

- http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01433.html
  PR sanitizer/56781
  fix --with-build-config=bootstrap-asan bootstrap of lto-plugin

Thanks

Jakub

RE: Fix PR60644

2014-04-09 Thread Iyer, Balaji V

 -Original Message-
 From: Jakub Jelinek [mailto:ja...@redhat.com]
 Sent: Wednesday, April 9, 2014 8:06 AM
 To: Alexander Ivchenko
 Cc: Richard Biener; GCC Patches; Iyer, Balaji V
 Subject: Re: Fix PR60644

 On Wed, Apr 09, 2014 at 03:46:13PM +0400, Alexander Ivchenko wrote:
  ping..

 I guess it really depends on whether the libcilkrts sources are going to be
 (semi?)regularly imported from some upstream repository or not, and if the
 upstream is willing to accept these changes.

Yes.

Re: Fix PR60644

2014-04-09 Thread Jakub Jelinek

On Wed, Apr 09, 2014 at 01:23:59PM +, Iyer, Balaji V wrote:

  -Original Message-
  From: Jakub Jelinek [mailto:ja...@redhat.com]
  Sent: Wednesday, April 9, 2014 8:06 AM
  To: Alexander Ivchenko
  Cc: Richard Biener; GCC Patches; Iyer, Balaji V
  Subject: Re: Fix PR60644

  On Wed, Apr 09, 2014 at 03:46:13PM +0400, Alexander Ivchenko wrote:
   ping..

  I guess it really depends on whether the libcilkrts sources are going to be
  (semi?)regularly imported from some upstream repository or not, and if the
  upstream is willing to accept these changes.

 Yes. 

So, are you ok with the changes and will you handle propagating them
upstream once they are committed to GCC?
If yes, the patch is preapproved.

Jakub

Re: Fix PR60644

2014-04-09 Thread Alexander Ivchenko

The changes are consistent with what is currently upstream, so no
additional work is required.

2014-04-09 17:31 GMT+04:00 Iyer, Balaji V balaji.v.i...@intel.com:

 -Original Message-
 From: Jakub Jelinek [mailto:ja...@redhat.com]
 Sent: Wednesday, April 9, 2014 9:29 AM
 To: Iyer, Balaji V
 Cc: Alexander Ivchenko; Richard Biener; GCC Patches
 Subject: Re: Fix PR60644

 On Wed, Apr 09, 2014 at 01:23:59PM +, Iyer, Balaji V wrote:

   -Original Message-
   From: Jakub Jelinek [mailto:ja...@redhat.com]
   Sent: Wednesday, April 9, 2014 8:06 AM
   To: Alexander Ivchenko
   Cc: Richard Biener; GCC Patches; Iyer, Balaji V
   Subject: Re: Fix PR60644

   On Wed, Apr 09, 2014 at 03:46:13PM +0400, Alexander Ivchenko wrote:
ping..

   I guess it really depends on whether the libcilkrts sources are
   going to be (semi?)regularly imported from some upstream repository
   or not, and if the upstream is willing to accept these changes.

  Yes.

 So, are you ok with the changes and will you handle propagating them
 upstream once they are committed to GCC?
 If yes, the patch is preapproved.

 Yes, I am Ok with the changes. We will check the patch in soon that will fix 
 this error.

   Jakub

[PATCH] Prevent out of bound access for multilib_options

2014-04-09 Thread Kito Cheng

`q` is read out of bounds if it has already reached the end of
multilib_options, so check `*q` before incrementing to keep the loop
condition from reading past the end of the string.

By the way, this bug was detected by AddressSanitizer.


2014-04-09  Kito Cheng  k...@0xlab.org
* gcc.c (used_arg): Prevent out of bound access for multilib_options.

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 5cb485a..c8ab7d6 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -7490,7 +7490,7 @@ used_arg (const char *p, int len)
{
  const char *r;

- for (q = multilib_options; *q != '\0'; q++)
+ for (q = multilib_options; *q != '\0'; *q && q++)
{
  while (*q == ' ')
q++;

Re: [PATCH] Prevent out of bound access for multilib_options

2014-04-09 Thread Jakub Jelinek

On Wed, Apr 09, 2014 at 10:00:38PM +0800, Kito Cheng wrote:
 `q` will out of bound access if `*q` already reach the end of
 multilib_options, so check it before increment to prevent condition
 check part out of bound access.
 
 btw, this bug is detected by address sanitizer.

Can you please expand on which target it is and what multilib_options
contains?  Perhaps some target just has invalid string in there.

 2014-04-09  Kito Cheng  k...@0xlab.org
 * gcc.c (used_arg): Prevent out of bound access for multilib_options.
 
 diff --git a/gcc/gcc.c b/gcc/gcc.c
 index 5cb485a..c8ab7d6 100644
 --- a/gcc/gcc.c
 +++ b/gcc/gcc.c
 @@ -7490,7 +7490,7 @@ used_arg (const char *p, int len)
 {
   const char *r;
 
 - for (q = multilib_options; *q != '\0'; q++)
 + for (q = multilib_options; *q != '\0'; *q && q++)
 {
   while (*q == ' ')
 q++;

Jakub

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Svante Signell

On Wed, 2014-04-09 at 14:41 +0200, Arnaud Charlet wrote:
  What do you think, Arno?  I think that the POSIX breakage (and its fallout 
  for
  the other Unices) is ugly and worth the additional complication.
 
 Yes, your patch looks good to me.

Would it be possible to have s-osinte-posix.adb also for x32 and in
s-osinte-x32.ads use the following construct:
...
   type timespec is private;
...
   type timespec is record
      tv_sec  : time_t;
      tv_nsec : long long;
   end record;
   pragma Convention (C, timespec);

and similar for timeval if needed?

That's the construct other unices use now when s-osinte-posix.adb
defines tv_nsec as time_t?

Re: [PATCH] Prevent out of bound access for multilib_options

2014-04-09 Thread Kito Cheng

For example, for arm-elf-eabi on trunk, multilib_options is
"marm/mthumb mfloat-abi=hard".

These are my configure options:
/home/kito/gcc/gcc-src/configure
--prefix=/home/kito/gcc-workspace/arm-eabi --target=arm-elf-eabi
CFLAGS="-fsanitize=address -g" CXXFLAGS="-fsanitize=address -g"
LDFLAGS="-fsanitize=address -g"

$  bin/arm-elf-eabi-gcc -v
Using built-in specs.
=
==26436== ERROR: AddressSanitizer: global-buffer-overflow on address
0x0051f7dc at pc 0x425b42 bp 0x7fffbb84f890 sp 0x7fffbb84f888
READ of size 1 at 0x0051f7dc thread T0
#0 0x425b41
(/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x425b41)
#1 0x426d28
(/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x426d28)
#2 0x420b5e
(/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x420b5e)
#3 0x31b3421b44 (/usr/lib64/libc-2.17.so+0x21b44)
#4 0x4032b8
(/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x4032b8)
0x0051f7dc is located 36 bytes to the left of global variable
'*.LC2 (/home/kito/gcc/gcc-src/gcc/gcc.c)' (0x51f800) of size 13
  '*.LC2 (/home/kito/gcc/gcc-src/gcc/gcc.c)' is ascii string 'arm-elf-eabi'
0x0051f7dc is located 0 bytes to the right of global variable
'*.LC1 (/home/kito/gcc/gcc-src/gcc/gcc.c)' (0x51f7c0) of size 28
  '*.LC1 (/home/kito/gcc/gcc-src/gcc/gcc.c)' is ascii string
'marm/mthumb mfloat-abi=hard'
Shadow bytes around the buggy address:
  0x8009bea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x8009beb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x8009bec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x8009bed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x8009bee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x8009bef0: 01 f9 f9 f9 f9 f9 f9 f9 00 00 00[04]f9 f9 f9 f9
  0x8009bf00: 00 05 f9 f9 f9 f9 f9 f9 02 f9 f9 f9 f9 f9 f9 f9
  0x8009bf10: 00 00 00 00 00 00 00 00 00 00 00 00 03 f9 f9 f9
  0x8009bf20: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x8009bf30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x8009bf40: 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Heap righ redzone: fb
  Freed Heap region: fd
  Stack left redzone:f1
  Stack mid redzone: f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:f5
  Stack use after scope: f8
  Global redzone:f9
  Global init order: f6
  Poisoned by user:  f7
  ASan internal: fe
==26436== ABORTING

On Wed, Apr 9, 2014 at 10:03 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Apr 09, 2014 at 10:00:38PM +0800, Kito Cheng wrote:
 `q` will out of bound access if `*q` already reach the end of
 multilib_options, so check it before increment to prevent condition
 check part out of bound access.

 btw, this bug is detected by address sanitizer.

 Can you please expand on which target it is and what multilib_options
 contains?  Perhaps some target just has invalid string in there.

 2014-04-09  Kito Cheng  k...@0xlab.org
 * gcc.c (used_arg): Prevent out of bound access for multilib_options.

 diff --git a/gcc/gcc.c b/gcc/gcc.c
 index 5cb485a..c8ab7d6 100644
 --- a/gcc/gcc.c
 +++ b/gcc/gcc.c
 @@ -7490,7 +7490,7 @@ used_arg (const char *p, int len)
 {
   const char *r;

 - for (q = multilib_options; *q != '\0'; q++)
 + for (q = multilib_options; *q != '\0'; *q && q++)
 {
   while (*q == ' ')
 q++;

 Jakub

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 Would it be possible to have s-osinte-posix.adb also for x32 and in
 s-osinte-x32.ads use the following construct:
 ...
type timespec is private;
 ...
type timespec is record
   tv_sec  : time_t;
   tv_nsec : long long;
end record;
pragma Convention (C, timespec);
 
 and similar for timeval if needed?
 
 That's the construct other unices use now when s-osinte-posix.adb
 defines tv_nsec as time_t?

Not sure what the now is referring to, but if you want to revert the 
original POSIX breakage in s-osinte-posix.adb, you need to define timespec 
according to the POSIX spec, there is no other way.

-- 
Eric Botcazou

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 Yes, your patch looks good to me.

Thanks, now applied.  I'll make sure everything is resynced with it.

-- 
Eric Botcazou

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Svante Signell

On Wed, 2014-04-09 at 16:24 +0200, Eric Botcazou wrote:
  Would it be possible to have s-osinte-posix.adb also for x32 and in
  s-osinte-x32.ads use the following construct:
  ...
 type timespec is private;
  ...
 type timespec is record
tv_sec  : time_t;
tv_nsec : long long;
 end record;
 pragma Convention (C, timespec);
  
  and similar for timeval if needed?
  
  That's the construct other unices use now when s-osinte-posix.adb
  defines tv_nsec as time_t?
 
 Not sure what the now is referring to, but if you want to revert the 
 original POSIX breakage in s-osinte-posix.adb, you need to define timespec 
 according to the POSIX spec, there is no other way.

I thought of creating a new type for x32: 

type nanosec_t is private; 
type nanosec_t is new long long;

  type timespec is record
  tv_sec  : time_t;
  tv_nsec : nanosec_t;
   end record;

Having the correct definition of tv_nsec in s-osinte-posix.adb
   type timespec is record
  tv_sec  : time_t;
  tv_nsec : long;
   end record;
   pragma Convention (C, timespec);

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Bill Schmidt

On Wed, 2014-04-09 at 11:51 +0200, Jakub Jelinek wrote:
 On Fri, Apr 04, 2014 at 10:38:49AM -0500, Bill Schmidt wrote:
  Thanks to everyone who helped with development, testing, and review of
  the patch set!  I've committed the changes to 4.8 this morning.  Note
  that patch 15/26 was rejected as not really germane to this series and
  has been submitted separately by Peter Bergner.
 
 While trying to merge this to redhat/gcc-4_8-branch, I've so far noticed
 that you have merged in the r199972 change (apparently without ChangeLog 
 entry),
 without r202642 change that reverted it later on.
 Can you please revert that one liner change?

Hm, yes.  Sorry for the oversight!  Testing the revert now and will
check it in shortly.

Thanks,
Bill
 
   Jakub

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Bill Schmidt

On Wed, 2014-04-09 at 12:03 +0200, Jakub Jelinek wrote:
 Another issue is bad toplevel ChangeLog entries.
 2014-04-04  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
   Backport from mainline
   2013-11-22  Ulrich Weigand  ulrich.weig...@de.ibm.com
 
   * libgo/config/libtool.m4: Update to mainline version.
   * libgo/configure: Regenerate.
 
   2013-11-17  Ulrich Weigand  ulrich.weig...@de.ibm.com
 
   * libgo/config/libtool.m4: Update to mainline version.
   * libgo/configure: Regenerate.
 
   2013-11-15  Ulrich Weigand  ulrich.weig...@de.ibm.com
 
   * libtool.m4: Update to mainline version.
   * libjava/libltdl/acinclude.m4: Likewise.
 
   * gcc/configure: Regenerate.
   * boehm-gc/configure: Regenerate.
   * libatomic/configure: Regenerate.
   * libbacktrace/configure: Regenerate.
   * libffi/configure: Regenerate.
   * libgfortran/configure: Regenerate.
   * libgomp/configure: Regenerate.
   * libitm/configure: Regenerate.
   * libjava/configure: Regenerate.
   * libjava/libltdl/configure: Regenerate.
   * libjava/classpath/configure: Regenerate.
   * libmudflap/configure: Regenerate.
   * libobjc/configure: Regenerate.
   * libquadmath/configure: Regenerate.
   * libsanitizer/configure: Regenerate.
   * libssp/configure: Regenerate.
   * libstdc++-v3/configure: Regenerate.
   * lto-plugin/configure: Regenerate.
   * zlib/configure: Regenerate.
 
 Except for the libtool.m4 change, which is a toplevel change, all
 those changes are to files in subdirectories which have their own ChangeLog
 file (or in case of libjava/classpath ChangeLog.gcj), the ChangeLog entries
 should go into those directories rather than the toplevel ChangeLog.

Ah, right.  I had meant to fix this before committing the patch set and
dropped the ball.

One question:  Where are ChangeLog entries supposed to go for libgo?
There doesn't seem to be any kind of ChangeLog file for that component.

Thanks,
Bill

 
   Jakub

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Jakub Jelinek

On Wed, Apr 09, 2014 at 10:27:33AM -0500, Bill Schmidt wrote:
 Ah, right.  I had meant to fix this before committing the patch set and
 dropped the ball.

Thanks.

 One question:  Where are ChangeLog entries supposed to go for libgo?
 There doesn't seem to be any kind of ChangeLog file for that component.

Probably nowhere.

Jakub

Re: [PATCH][C++] Fix PR60761, diagnostics in clones

2014-04-09 Thread Jakub Jelinek

On Wed, Apr 09, 2014 at 11:37:35AM +0200, Martin Jambor wrote:
 I think you should use DECL_ABSTRACT_ORIGIN instead of
 former_clone_of.  Not only you avoid using cgraph stuff here but
 unlike this patch, it also works for IPA-CP clones of IPA-SRA clones
 (yeah, I know, but I bet I can cause the same havoc by ipa-split
 instead of ipa-sra, just not as easily).

But with DECL_ABSTRACT_ORIGIN only (I guess you still mean only if
!DECL_LANG_SPECIFIC), is it always desirable to print clone after
the name?  I mean, DECL_ABSTRACT_ORIGIN is also set for inlines,
constructors/destructors, fnsplit etc.

Jakub

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread H.J. Lu

On Wed, Apr 9, 2014 at 7:55 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Yes, your patch looks good to me.

 Thanks, now applied.  I'll make sure everything is resynced with it.


I got

/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/xgcc
-B/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/
-B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/bin/
-B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/include -isystem
/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/sys-include    -c -g -O2
-mx32 -fpic  -W -Wall -gnatpg -nostdinc -mx32  s-osprim.adb -o
s-osprim.o
s-osprim.adb:121:30: expected type Standard.Long_Long_Integer
s-osprim.adb:121:30: found type System.Os_Primitives.time_t
make[11]: *** [s-osprim.o] Error 1

-- 
H.J.

Re: [PATCH] Prevent out of bound access for multilib_options

2014-04-09 Thread Kito Cheng

More detail for arm-elf-eabi :)

After the first iteration of the loop at gcc.c:7493-7534, r = q =
"mfloat-abi=hard" at gcc.c:7498.
The scan of multilib_options then continues at gcc.c:7499-7507, after
which `q` has already reached the end of `multilib_options`, i.e.
`q` == multilib_options + strlen(multilib_options),
so the `if` at gcc.c:7509 is not taken.
Next, `q++` at gcc.c:7493 makes `q` == multilib_options +
strlen(multilib_options) + 1!!!
Finally, the loop condition `*q != '\0'` reads `*q` out of bounds.
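
The walk-through above can be sketched in isolation. This is only a
simplified model of the loop shape, not the code of used_arg: the function
name `scan_overshoots` and its `guarded` flag are made up for the
illustration, and the body merely skips blanks and one space-separated
group. It tracks the index instead of dereferencing it, so the sketch
itself never reads out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a scan shaped like the loop described above would evaluate
   its condition at an index past the terminating NUL of S, 0 otherwise.
   If GUARDED is nonzero, the increment behaves like `*q && q++`.  */
static int scan_overshoots (const char *s, int guarded)
{
  size_t len = strlen (s);
  size_t i = 0;

  for (;;)
    {
      /* Loop condition: this is the read that would be flagged.  */
      if (i > len)
        return 1;
      if (s[i] == '\0')
        return 0;

      /* Body: skip blanks, then skip one space-separated option group.  */
      while (s[i] == ' ')
        i++;
      while (s[i] != ' ' && s[i] != '\0')
        i++;
      assert (i <= len);

      /* Increment step of the for loop.  */
      if (!guarded || s[i] != '\0')
        i++;
    }
}
```

With the string from the report, scan_overshoots ("marm/mthumb
mfloat-abi=hard", 0) reports an overshoot: the body stops on the NUL at
index 27 and the unguarded increment moves to index 28, which matches the
"0 bytes to the right of ... size 28" line in the sanitizer log. With the
guarded increment the scan stops on the NUL instead.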

On Wed, Apr 9, 2014 at 10:21 PM, Kito Cheng kito.ch...@gmail.com wrote:
 for example: arm-elf-eabi in trunk, multilib_options = marm/mthumb
 mfloat-abi=hard

 and it's my configure options:
 /home/kito/gcc/gcc-src/configure
 --prefix=/home/kito/gcc-workspace/arm-eabi --target=arm-elf-eabi
 CFLAGS=-fsanitize=address -g CXXFLAGS=-fsanitize=address -g
 LDFLAGS=-fsanitize=address -g

 $  bin/arm-elf-eabi-gcc -v
 Using built-in specs.
 =
 ==26436== ERROR: AddressSanitizer: global-buffer-overflow on address
 0x0051f7dc at pc 0x425b42 bp 0x7fffbb84f890 sp 0x7fffbb84f888
 READ of size 1 at 0x0051f7dc thread T0
 #0 0x425b41
 (/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x425b41)
 #1 0x426d28
 (/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x426d28)
 #2 0x420b5e
 (/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x420b5e)
 #3 0x31b3421b44 (/usr/lib64/libc-2.17.so+0x21b44)
 #4 0x4032b8
 (/home/kito/gcc-workspace/arm-eabi/bin/arm-elf-eabi-gcc+0x4032b8)
 0x0051f7dc is located 36 bytes to the left of global variable
 '*.LC2 (/home/kito/gcc/gcc-src/gcc/gcc.c)' (0x51f800) of size 13
   '*.LC2 (/home/kito/gcc/gcc-src/gcc/gcc.c)' is ascii string 'arm-elf-eabi'
 0x0051f7dc is located 0 bytes to the right of global variable
 '*.LC1 (/home/kito/gcc/gcc-src/gcc/gcc.c)' (0x51f7c0) of size 28
   '*.LC1 (/home/kito/gcc/gcc-src/gcc/gcc.c)' is ascii string
 'marm/mthumb mfloat-abi=hard'
 Shadow bytes around the buggy address:
   0x8009bea0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   0x8009beb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   0x8009bec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   0x8009bed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   0x8009bee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 =0x8009bef0: 01 f9 f9 f9 f9 f9 f9 f9 00 00 00[04]f9 f9 f9 f9
   0x8009bf00: 00 05 f9 f9 f9 f9 f9 f9 02 f9 f9 f9 f9 f9 f9 f9
   0x8009bf10: 00 00 00 00 00 00 00 00 00 00 00 00 03 f9 f9 f9
   0x8009bf20: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
   0x8009bf30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   0x8009bf40: 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9
 Shadow byte legend (one shadow byte represents 8 application bytes):
   Addressable:   00
   Partially addressable: 01 02 03 04 05 06 07
   Heap left redzone: fa
   Heap righ redzone: fb
   Freed Heap region: fd
   Stack left redzone:f1
   Stack mid redzone: f2
   Stack right redzone:   f3
   Stack partial redzone: f4
   Stack after return:f5
   Stack use after scope: f8
   Global redzone:f9
   Global init order: f6
   Poisoned by user:  f7
   ASan internal: fe
 ==26436== ABORTING

 On Wed, Apr 9, 2014 at 10:03 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Apr 09, 2014 at 10:00:38PM +0800, Kito Cheng wrote:
 `q` will out of bound access if `*q` already reach the end of
 multilib_options, so check it before increment to prevent condition
 check part out of bound access.

 btw, this bug is detected by address sanitizer.

 Can you please expand on which target it is and what multilib_options
 contains?  Perhaps some target just has invalid string in there.

 2014-04-09  Kito Cheng  k...@0xlab.org
 * gcc.c (used_arg): Prevent out of bound access for multilib_options.

 diff --git a/gcc/gcc.c b/gcc/gcc.c
 index 5cb485a..c8ab7d6 100644
 --- a/gcc/gcc.c
 +++ b/gcc/gcc.c
 @@ -7490,7 +7490,7 @@ used_arg (const char *p, int len)
 {
   const char *r;

 - for (q = multilib_options; *q != '\0'; q++)
 + for (q = multilib_options; *q != '\0'; *q && q++)
 {
   while (*q == ' ')
 q++;

 Jakub

Re: [PATCH] Prevent out of bound access for multilib_options

2014-04-09 Thread Graham Stott

All,

It happens with all multilib configurations, not just arm-elf.

If we have reached the end of the multilib_options string there are
no more options to process, so break out of the loop.

This patch is an alternative fix.
=
Index: gcc.c

--- gcc.c    (revision 209248)
+++ gcc.c    (working copy)
@@ -7531,6 +7531,9 @@
         }
       break;
     }
+
+      if (*q == '\0')
+        break;
     }
 }
 }




Graham
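
The same "break once the NUL has been reached" idea, shown on a small
standalone scanner rather than on used_arg itself. The function
`count_option_groups` is hypothetical; it only counts space-separated
groups in an options string, but it has the same hazard: the inner scan
can stop on the terminator, after which the `q++` of the for loop must not
run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: counts the space-separated groups in OPTS without
   ever stepping past the terminating NUL.  */
static int count_option_groups (const char *opts)
{
  const char *q;
  int groups = 0;

  assert (opts != NULL);
  for (q = opts; *q != '\0'; q++)
    {
      while (*q == ' ')
        q++;
      if (*q == '\0')
        break;                  /* only trailing blanks were left */

      groups++;
      while (*q != ' ' && *q != '\0')
        q++;

      /* End of string reached inside the body: leave the loop before the
         for-loop increment can move q past the terminator.  */
      if (*q == '\0')
        break;
    }
  return groups;
}
```

Compared with guarding the increment expression, the explicit break keeps
the for header unchanged and states the exit condition where the pointer
was last advanced.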

Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-09 Thread Bill Schmidt

On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
 On 03/25/2014 11:20 AM, Pat Haugen wrote:
  Power8 can use lq/stq instructions for TI mode atomic_load/store. 
  Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once 
  bootstrap/regtest finishes)?
 
  -Pat
 
 
  2014-03-25  Pat Haugen pthau...@us.ibm.com
 
  * config/rs6000/sync.md (AINT mode_iterator): Move definition.
  (loadsync_mode): Change mode.
  (atomic_loadmode, atomic_storemode): Add support for TI mode.
  (load_quadpti, store_quadpti): New.
  * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
 
  gcc/testsuite:
  * gcc.target/powerpc/atomic_load_store-p8.c: New.
 
 Updated patch which was approved off list and I have committed.
 

Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:

checking for suffix of executables... /home/wschmidt/gcc/gcc-4_8-base/libatomic\
/load_n.c: In function 'libat_load_16':
/home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid failur\
e memory model for '__atomic_compare_exchange'
 atomic_compare_exchange_n (mptr, t, 0, true,
   ^
make[4]: *** [load_16_.lo] Error 1
make[4]: *** Waiting for unfinished jobs

Thanks,
Bill
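
For background on the diagnostic: GCC's __atomic_compare_exchange builtins
take two memory models, one for success and one for failure, and GCC
documents that the failure model may not be __ATOMIC_RELEASE or
__ATOMIC_ACQ_REL. The sketch below is a general illustration of a call
with an accepted pair of models. It is not the libatomic code from
load_n.c and makes no claim about what the reverted patch changed; the
wrapper name `cas_int` is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: atomically replaces *P with DESIRED if *P equals
   EXPECTED.  Returns nonzero on success.  The failure memory model is
   relaxed, which is always an accepted choice.  */
static int cas_int (int *p, int expected, int desired)
{
  assert (p != NULL);
  return __atomic_compare_exchange_n (p, &expected, desired,
                                      0 /* strong variant */,
                                      __ATOMIC_SEQ_CST,   /* on success */
                                      __ATOMIC_RELAXED);  /* on failure */
}
```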

Re: [PATCH][C++] Fix PR60761, diagnostics in clones

2014-04-09 Thread Jason Merrill


On 04/09/2014 04:21 AM, Richard Biener wrote:

The names of the in-charge and not-in-charge constructor clones are
complete_ctor_identifier and base_ctor_identifier (and dtor for
destructors); you could check for those.


I was more asking for how we present those to the user in diagnostics. I wanted to 
use a consistent 'quoting' style. If using clone is fine then I'll just stick 
to that.


I think saying complete and base would be helpful for distinguishing 
them.


Jason

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Bill Schmidt

Cleaned up as r209249.

On Wed, 2014-04-09 at 17:28 +0200, Jakub Jelinek wrote:
 On Wed, Apr 09, 2014 at 10:27:33AM -0500, Bill Schmidt wrote:
  Ah, right.  I had meant to fix this before committing the patch set and
  dropped the ball.
 
 Thanks.
 
  One question:  Where are ChangeLog entries supposed to go for libgo?
  There doesn't seem to be any kind of ChangeLog file for that component.
 
 Probably nowhere.
 
   Jakub

Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-04-09 Thread Bill Schmidt

Cleaned up as r209250.

On Wed, 2014-04-09 at 11:51 +0200, Jakub Jelinek wrote:
 On Fri, Apr 04, 2014 at 10:38:49AM -0500, Bill Schmidt wrote:
  Thanks to everyone who helped with development, testing, and review of
  the patch set!  I've committed the changes to 4.8 this morning.  Note
  that patch 15/26 was rejected as not really germane to this series and
  has been submitted separately by Peter Bergner.
 
 While trying to merge this to redhat/gcc-4_8-branch, I've so far noticed
 that you have merged in the r199972 change (apparently without ChangeLog 
 entry),
 without r202642 change that reverted it later on.
 Can you please revert that one liner change?
 
   Jakub

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Ludovic Brenta

Eric Botcazou ebotca...@adacore.com writes:
 In order to avoid creating more x32-specific files, I think that we
 need to move the definition of 'struct timespec' and 'struct timeval'
 (both specified by POSIX) to s-linux.ads.  This requires with'ing
 Interfaces.C, but I think that's OK since s-linux.ads is a spin-off of
 s-osinte-linux.ads which also with'es Interfaces.C.

In my worthless opinion, it is a mistake to declare POSIX data types in
s-linux.ads, they should be in s-posix.ads or similar (don't worry if
that's a new file; and it should not be a leaf package).  Think of
GNU/kFreeBSD and GNU/Hurd, which have nothing to do with Linux.
Furthermore there should be only one declaration of type timespec
(i.e. do not repeat yourself); that declaration should be in
s-posix.ads and that declaration should violate POSIX like so:

with System.OS_Interface;
package System.POSIX is

type timespec is record
   tv_sec  : time_t;
   tv_nsec : System.OS_Interface.Nanoseconds_T; -- instead of long
end record;
pragma Convention (C, timespec);

end System.POSIX;

Each platform-specific version of System.OS_Interface should then
declare their own type Nanoseconds_T.  The version for x32 would declare

   type Nanoseconds_T is new Long_Long_Integer;
   -- or perhaps range -2**63 .. 2**63-1 to be more explicit?

thereby really violating POSIX but all others would declare

   type Nanoseconds_T is new Interfaces.C.long;

thereby restoring compliance with POSIX.
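
The mismatch under discussion can be observed from C. POSIX describes
tv_nsec as a long, while this thread indicates that on x32 the field needs
a 64-bit type even though long is 32 bits there. The following sketch only
reports what the C library headers of the build target declare; the helper
names `tv_nsec_size` and `tv_nsec_is_long_sized` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Size in bytes of the tv_nsec field as the C library declares it.  */
static size_t tv_nsec_size (void)
{
  return sizeof (((struct timespec *) 0)->tv_nsec);
}

/* Nonzero when tv_nsec has the same width as long, as POSIX describes.
   On a target like x32 this is expected to be zero.  */
static int tv_nsec_is_long_sized (void)
{
  /* Sanity check under the assumption that the field is never narrower
     than long on the targets discussed here.  */
  assert (tv_nsec_size () >= sizeof (long));
  return tv_nsec_size () == sizeof (long);
}
```

A per-platform Nanoseconds_T as proposed above would correspond to
whichever width this check reports for the target.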

I'm really sorry that I don't have the time to propose a proper patch
but if someone does, I'd be happy to review it.

-- 
Ludovic Brenta.

Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-09 Thread David Edelsohn

I have reverted this on trunk and asked Bill to revert this on the 4.8
branch. This patch is too risky to apply this close to a freeze for
4.9.

Sorry for the problems.

- David


On Wed, Apr 9, 2014 at 2:56 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
 On 03/25/2014 11:20 AM, Pat Haugen wrote:
  Power8 can use lq/stq instructions for TI mode atomic_load/store.
  Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
  bootstrap/regtest finishes)?
 
  -Pat
 
 
  2014-03-25  Pat Haugen pthau...@us.ibm.com
 
  * config/rs6000/sync.md (AINT mode_iterator): Move definition.
  (loadsync_mode): Change mode.
  (atomic_loadmode, atomic_storemode): Add support for TI mode.
  (load_quadpti, store_quadpti): New.
  * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
 
  gcc/testsuite:
  * gcc.target/powerpc/atomic_load_store-p8.c: New.

 Updated patch which was approved off list and I have committed.


 Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:

 checking for suffix of executables... 
 /home/wschmidt/gcc/gcc-4_8-base/libatomic\
 /load_n.c: In function 'libat_load_16':
 /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid 
 failur\
 e memory model for '__atomic_compare_exchange'
  atomic_compare_exchange_n (mptr, t, 0, true,
^
 make[4]: *** [load_16_.lo] Error 1
 make[4]: *** Waiting for unfinished jobs

 Thanks,
 Bill

[VRP][PATCH] Improve value range for loop index

2014-04-09 Thread Kugan

Value range propagation simplifies convergence in vrp_visit_phi_node by
setting minimum to TYPE_MIN when the computed minimum is smaller than
the previous minimum. This can however result in pessimistic value
ranges in some cases.

For example:

unsigned int i;
for (i = 0; i < 8; i++)
{
  
}

# ivtmp_19 = PHI <ivtmp_17(5), 8(2)>
...
<bb 5>:
ivtmp_17 = ivtmp_19 - 1;
if (ivtmp_17 != 0)

goto <bb 5>;

The minimum value of ivtmp_19 is simplified to 0 (in tree-vrp.c:8465), whereas
it should have been 1. This prevents correct value ranges from being
calculated for ivtmp_17 in the example.

We should be able to look at the step (the difference between the previous
minimum and the computed minimum) and, if there is scope for more iterations
(the computed minimum is greater than the step), set the minimum so that
one more iteration converges to the right minimum value.
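
As a sanity check of the ranges claimed here, the countdown form of the
loop can simply be executed and the observed values recorded. This is a
plain simulation of the two SSA names from the dump, not anything VRP does
internally; `struct range` and `observe` are names introduced for the
sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct range { unsigned int min, max; };

/* Widens *R so that it contains V.  */
static void note (struct range *r, unsigned int v)
{
  if (v < r->min)
    r->min = v;
  if (v > r->max)
    r->max = v;
}

/* Runs the countdown loop from the dump and records every value taken by
   ivtmp_19 (the PHI result) and ivtmp_17 (the decremented value).  */
static void observe (struct range *r19, struct range *r17)
{
  unsigned int ivtmp_19 = 8;    /* PHI argument coming from outside the loop */
  unsigned int ivtmp_17;

  assert (r19 != NULL && r17 != NULL);
  r19->min = r17->min = UINT_MAX;
  r19->max = r17->max = 0;

  do
    {
      note (r19, ivtmp_19);
      ivtmp_17 = ivtmp_19 - 1;
      note (r17, ivtmp_17);
      ivtmp_19 = ivtmp_17;      /* PHI argument coming from the back edge */
    }
  while (ivtmp_17 != 0);
}
```

Running observe yields [1, 8] for ivtmp_19 and [0, 7] for ivtmp_17, so a
minimum of 0 for ivtmp_19 is a conservative over-approximation rather than
the tightest range.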

Attached patch fixes this. Is this OK for stage-1?

Bootstrapped and regression tested on x86_64-unknown-linux-gnu with no
new regressions.

Thanks,
Kugan

gcc/

+2014-04-09  Kugan Vivekanandarajah  kug...@linaro.org
+
+   * tree-vrp.c (vrp_visit_phi_node) : Improve value ranges of loop
+   index when simplifying convergence towards minimum.
+

gcc/testsuite

+2014-04-09  Kugan Vivekanandarajah  kug...@linaro.org
+
+   * gcc.dg/tree-ssa/vrp91.c: New test
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp91.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp91.c
index e69de29..26e857c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp91.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp91.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-S -O2 -fdump-tree-vrp2" } */
+
+unsigned short data;
+void foo ()
+{
+  unsigned char  x16;
+  unsigned int i;
+  for (i = 0; i < 8; i++)
+{
+  x16 = data  1;
+  data = 1;
+  if (x16 == 1)
+   {
+ data ^= 0x4;
+   }
+  data = 1;
+}
+}
+
+/* { dg-final { scan-tree-dump "\\\[0, 7\\\]" "vrp2" } } */
+/* { dg-final { cleanup-tree-dump "vrp2" } } */
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 14f1526..c63f794 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -8461,7 +8461,30 @@ vrp_visit_phi_node (gimple phi)
{
  if (!needs_overflow_infinity (TREE_TYPE (vr_result.min))
  || !vrp_var_may_overflow (lhs, phi))
-   vr_result.min = TYPE_MIN_VALUE (TREE_TYPE (vr_result.min));
+   {
+ tree step = ((cmp_min > 0) && TYPE_UNSIGNED (TREE_TYPE (lhs))) ?
+   int_const_binop (MINUS_EXPR, lhs_vr->min, vr_result.min) : NULL_TREE;
+
+ /* If the type minimum is zero, while avoiding repeated
+iterations, let us stop at step and let the iterations take
+it to zero (if necessary) from there.  This will improve
+value ranges for cases like below, when the value range
+for ivtemp_17 is [0, 7] and range for ivtmp_19 is [1, 8].
+
+   # ivtmp_19 = PHI <ivtmp_17(5), 8(2)>
+   ...
+   <bb 5>:
+   ivtmp_17 = ivtmp_19 - 1;
+   if (ivtmp_17 != 0)
+   
+   goto <bb 5>;
+ */
+ if ((cmp_min > 0) && (TYPE_UNSIGNED (TREE_TYPE (lhs)))
+     && (tree_int_cst_compare (vr_result.min, step) != -1))
+   vr_result.min = step;
+ else
+   vr_result.min = TYPE_MIN_VALUE (TREE_TYPE (vr_result.min));
+   }
  else if (supports_overflow_infinity (TREE_TYPE (vr_result.min)))
vr_result.min =
negative_overflow_infinity (TREE_TYPE (vr_result.min));

Re: [PATCH, rs6000] Improve atomic_load/store code gen for Power8 TI mode

2014-04-09 Thread Bill Schmidt

On Wed, 2014-04-09 at 15:56 -0400, David Edelsohn wrote:
 I have reverted this on trunk and asked Bill to revert this on the 4.8
 branch. This patch is too risky to apply this close to a freeze for
 4.9.

I've reverted this on 4.8 as r209254.

Thanks,
Bill

 
 Sorry for the problems.
 
 - David
 
 
 On Wed, Apr 9, 2014 at 2:56 PM, Bill Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  On Tue, 2014-04-08 at 13:39 -0500, Pat Haugen wrote:
  On 03/25/2014 11:20 AM, Pat Haugen wrote:
   Power8 can use lq/stq instructions for TI mode atomic_load/store.
   Bootstrap/regtest with no new failures. Ok for trunk and 4.8 (once
   bootstrap/regtest finishes)?
  
   -Pat
  
  
   2014-03-25  Pat Haugen pthau...@us.ibm.com
  
   * config/rs6000/sync.md (AINT mode_iterator): Move definition.
   (loadsync_mode): Change mode.
   (atomic_loadmode, atomic_storemode): Add support for TI mode.
   (load_quadpti, store_quadpti): New.
   * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_LSQ.
  
   gcc/testsuite:
   * gcc.target/powerpc/atomic_load_store-p8.c: New.
 
  Updated patch which was approved off list and I have committed.
 
 
  Unfortunately this broke bootstrap on powerpc64le-linux-gnu on 4.8:
 
  checking for suffix of executables... 
  /home/wschmidt/gcc/gcc-4_8-base/libatomic\
  /load_n.c: In function 'libat_load_16':
  /home/wschmidt/gcc/gcc-4_8-base/libatomic/load_n.c:58:31: error: invalid 
  failur\
  e memory model for '__atomic_compare_exchange'
   atomic_compare_exchange_n (mptr, t, 0, true,
 ^
  make[4]: *** [load_16_.lo] Error 1
  make[4]: *** Waiting for unfinished jobs
 
  Thanks,
  Bill

Re: [PATCH] Fix PR c++/60765

2014-04-09 Thread Jason Merrill


OK.

Jason

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 I got
 
 /export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/xgcc
 -B/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/bin/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/include -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/sys-include -c -g -O2
 -mx32 -fpic  -W -Wall -gnatpg -nostdinc -mx32  s-osprim.adb -o
 s-osprim.o
 s-osprim.adb:121:30: expected type Standard.Long_Long_Integer
 s-osprim.adb:121:30: found type System.Os_Primitives.time_t
 make[11]: *** [s-osprim.o] Error 1

Sorry, last minute change, try:

Index: s-osprim-x32.adb
===
--- s-osprim-x32.adb(revision 209244)
+++ s-osprim-x32.adb(working copy)
@@ -118,7 +118,7 @@ package body System.OS_Primitives is
 
   return
 timespec'(tv_sec  => S,
-  tv_nsec => time_t (Long_Long_Integer (F * 10#1#E9)));
+  tv_nsec => Long_Long_Integer (F * 10#1#E9));
end To_Timespec;
 
-


-- 
Eric Botcazou

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 In my worthless opinion, it is a mistake to declare POSIX data types in
 s-linux.ads, they should be in s-posix.ads or similar (don't worry if
 that's a new file; and it should not be a leaf package).  Think of
 GNU/kFreeBSD and GNU/Hurd, which have nothing to do with Linux.
 Furthermore there should be only one declaration of type timespec
 (i.e. do not repeat yourself); that declaration should be in
 s-posix.ads and that declaration should violate POSIX like so:

Right, but you should have posted this message a couple of decades ago when 
this stuff was designed.  We cannot turn everything upside down now, sorry.

-- 
Eric Botcazou

Re: [PATCH] Fix PR c++/60764

2014-04-09 Thread Jason Merrill

Hmm, I would expect the parameter numbering for attribute nonnull and 
such to ignore the 'this' parameter.


Jason

Re: [RFC][PATCH][MIPS] Patch to enable LRA for MIPS backend

2014-04-09 Thread Richard Sandiford

Robert Suchanek robert.sucha...@imgtec.com writes:
 FYI, all other targets that have LRA optionally selectable or deselectable
 use -mno-lra for this (even when -mlra is the default), it would be better
 for consistency not to invent new switch names for that.

 Agreed.

 -return !strict_p || GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) 
 == 8;
 +return GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8;
  
return TARGET_MIPS16 ? M16_REG_P (regno) : GP_REG_P (regno);
  }
 Not sure about this one.  We would need to update the comment that
 explains why !strict_p is there, but AFAIK reason (1) would still apply.

 Was this needed for correctness or because it gave better code?

 !strict_p has been removed because of correctness issue. When LRA
 validates memory addresses pseudos are temporarily eliminated to hard
 registers (if possible) but the target hook is always called as
 non-strict. This only affects MIPS16 instructions with not directly
 accessible $sp. The strict variant, as I understand, was used in the
 reload pass to indicate if a pseudo-register has been allocated a hard
 register. Unless LRA should be setting the strict/non-strict depending
 on whether a temporal elimination to hard reg was successful or there
 is something else that I missed?

Hmm, OK, in that case I agree reason (2) doesn't apply.  That part was
always more of a consistency thing anyway, so I agree it's not worth
keeping around for reload.  I also had a look to see why
instantiate_virtual_regs_in_insn didn't complain about cases like:

  struct s { unsigned char c; };
  void foo (int, int, int, int, struct s);
  void bar (struct s *ptr) { foo (1, 2, 3, 4, *ptr); }

and I think it's because of the later:

2008-02-14  Michael Matz  m...@suse.de

PR target/34930
* function.c (instantiate_virtual_regs_in_insn): Reload address
before falling back to reloading the whole operand.

which correctly reloads the address if necessary.

So yeah, I agree this is right after all, sorry.  Let's delete the
comment starting at "There are two problems here:" at the same time.

 +  M16F_REGS,   /* mips16 + frame */

 Constraints are supposed to be operating on real registers, after
 elimination, so it seems odd to include a fake register.  What went
 wrong with just M16_REGS?

 Only the stack pointer has been added to M16_REGS.

Sorry, I'd read "frame" as meaning $frame, the soft frame pointer.
I agree M16_REGS + $sp is OK.

mips_regno_to_class should then map $sp to the new class, since it's now
the smallest containing class.  (We really should set that up automatically
one day...)

 A number of patterns need to accept it otherwise LRA inserts a lot of
 reloads and the code size goes up by about 10%.  The change does have
 also a positive effect on reload but marginally.  frame meant to
 indicate inclusion of both the stack and hard frame pointers in the
 class but perhaps I should name it differently to avoid confusion.

How about M16_SP_REGS, to match M16_T_REGS?

Also, the BASE_REG_CLASS/ADDR_REG_CLASS distinction isn't all that
obvious from the names.  ADDR_REG_CLASS is only needed for the "d"
constraint so maybe we could just use TARGET_MIPS16 ? M16_REGS : GR_REGS
directly for now.

 + SPILL_REGS, /* All but $sp and call preserved regs are in here */
...
 + { 0x0003fffc, 0x, 0x, 0x, 0x,
 0x }, /* SPILL_REGS */ \

 These two don't seem to match.  I think literally it would be 0x0300fffc,
 but maybe you had to make SPILL_REGS a superset of M16_REGs?

 I initially used 0x0300fffc but did some experiments and it turned out
 that 0x0003fffc (with $16, $17 regs) gives slightly better code. I
 haven't updated the comment though.

I can imagine including all M16_REGS makes sense, but it seems odd to
drop the 2 temporaries.  Does 0x0303fffc have the same problem?
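
To make the three candidate SPILL_REGS masks easier to compare, here is a
minimal C++ sketch that decodes the first mask word into GPR numbers.  It
assumes the usual REG_CLASS_CONTENTS convention that bit N of the first word
stands for hard register N, which for MIPS is GPR $N.  The helper name
gprs_in_mask is made up for illustration and is not part of GCC.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: list the GPRs selected by the first word of a
// register-class mask, collapsing consecutive registers into ranges.
// Assumption: bit N set means hard register N ($N) is in the class.
std::string gprs_in_mask(std::uint32_t mask) {
  std::string out;
  for (int regno = 0; regno < 32; ++regno) {
    if (((mask >> regno) & 1u) == 0)
      continue;
    // Extend the run while the next bit is also set.
    int last = regno;
    while (last + 1 < 32 && ((mask >> (last + 1)) & 1u) != 0)
      ++last;
    if (!out.empty())
      out += ' ';
    out += '$' + std::to_string(regno);
    if (last > regno)
      out += "-$" + std::to_string(last);
    regno = last;  // Skip past the run just emitted.
  }
  return out;
}
```

Under that assumption, 0x0003fffc covers $2-$17, 0x0300fffc covers $2-$15
plus $24-$25, and 0x0303fffc covers $2-$17 plus $24-$25.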

 There is yet more to do and need to return to another thread with
 MIPS16 at some point as I found some limitations of IRA/LRA to
 generate better code. $8-$15 are currently inaccessible as temporary
 storage because these registers are marked as fixed (when optimizing
 for size) but leaving them as fixed are better for the code size. I
 don't expect a big gain by using hard registers for spilling but it
 more likely to improve the performance.

Hmm, marking them fixed was supposed to be a temporary reload-only thing,
until the move to LRA.  It should never be worse to spill to these GPRs
over spilling to the stack, if the value isn't live across a call.

But that certainly doesn't need to be part of the initial patch.

 +/* Add costs to hard registers based on frequency. This helps to negate
 +   some of the reduced cost associated with argument registers which 
 +   unfairly promotes their use and increases register pressure */
 +#define IRA_HARD_REGNO_ADD_COST_MULTIPLIER(REGNO)   \
 +  (TARGET_MIPS16 && optimize_size   \
 +   ?

Re: [PATCH] Fix PR c++/60764

2014-04-09 Thread Marc Glisse


On Wed, 9 Apr 2014, Jason Merrill wrote:

Hmm, I would expect the parameter numbering for attribute nonnull and such to 
ignore the 'this' parameter.


The doc for the format attribute says clearly:

Since non-static C++ methods have an implicit this argument, the 
arguments of such methods should be counted from two, not one, when giving 
values for string-index and first-to-check.


It would be strange to count arguments differently for different 
attributes.
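
As an illustration of the numbering described in the format documentation,
here is a minimal sketch for GCC-compatible compilers.  The class Logger and
its member message are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical class, used only to illustrate the argument numbering.
struct Logger {
  // For a non-static member function the implicit 'this' is argument 1,
  // so 'fmt' is argument 2 and the variadic arguments start at 3.
  std::string message(const char *fmt, ...)
      __attribute__((format(printf, 2, 3)));
};

std::string Logger::message(const char *fmt, ...) {
  assert(fmt != nullptr);
  char buf[128];
  va_list ap;
  va_start(ap, fmt);
  std::vsnprintf(buf, sizeof buf, fmt, ap);
  va_end(ap);
  return std::string(buf);
}
```

With this declaration the compiler can check calls such as
log.message("%d-%s", 7, "x") against the format string when format
warnings are enabled.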


--
Marc Glisse

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread H.J. Lu

On Wed, Apr 9, 2014 at 2:07 PM, Eric Botcazou ebotca...@adacore.com wrote:
 I got

 /export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/xgcc
 -B/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/bin/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/include -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/sys-include -c -g -O2
 -mx32 -fpic  -W -Wall -gnatpg -nostdinc -mx32  s-osprim.adb -o
 s-osprim.o
 s-osprim.adb:121:30: expected type Standard.Long_Long_Integer
 s-osprim.adb:121:30: found type System.Os_Primitives.time_t
 make[11]: *** [s-osprim.o] Error 1

 Sorry, last minute change, try:

 Index: s-osprim-x32.adb
 ===
 --- s-osprim-x32.adb(revision 209244)
 +++ s-osprim-x32.adb(working copy)
 @@ -118,7 +118,7 @@ package body System.OS_Primitives is

return
   timespec'(tv_sec  => S,
  -  tv_nsec => time_t (Long_Long_Integer (F * 10#1#E9)));
  +  tv_nsec => Long_Long_Integer (F * 10#1#E9));
 end To_Timespec;

 -


Now I got

/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/xgcc
-B/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/
-B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/bin/
-B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/include -isystem
/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/sys-include -c -g -O2
-mx32 -fpic  -W -Wall -gnatpg -nostdinc -mx32  s-osinte.adb -o
s-osinte.o

s-osinte.adb:101:17: operator for type System.Linux.time_t is not
directly visible
s-osinte.adb:101:17: use clause would make operation legal
make[11]: *** [s-osinte.o] Error 1


-- 
H.J.

ping for maintainer - [PATCH] pedantic warning behavior when casting void* to ptr-to-func

2014-04-09 Thread Daniel Gutson

Hi,

   please, if at all possible, consider this patch for 4.8.3:

http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00026.html

Thanks,

   Daniel.

-- 

Daniel F. Gutson
Chief Engineering Officer, SPD


San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina


Phone: +54 351 4217888 / +54 351 4218211

Skype: dgutson

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 Now I got
 
 /export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/xgcc
 -B/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/bin/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/include -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/sys-include -c -g -O2
 -mx32 -fpic  -W -Wall -gnatpg -nostdinc -mx32  s-osinte.adb -o
 s-osinte.o
 
 s-osinte.adb:101:17: operator for type System.Linux.time_t is not
 directly visible
 s-osinte.adb:101:17: use clause would make operation legal
 make[11]: *** [s-osinte.o] Error 1

Probably:

Index: s-osinte-x32.adb
===
--- s-osinte-x32.adb(revision 209244)
+++ s-osinte-x32.adb(working copy)
@@ -90,6 +90,7 @@ package body System.OS_Interface is
   S : time_t;
   F : Duration;
 
+  use type System.Linux.time_t;
begin
   S := time_t (Long_Long_Integer (D));
   F := D - Duration (S);

-- 
Eric Botcazou

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread H.J. Lu

On Wed, Apr 9, 2014 at 2:59 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Now I got

 /export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/xgcc
 -B/export/build/gnu/gcc-x32/build-x86_64-linux/./gcc/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/bin/
 -B/usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/include -isystem
 /usr/gcc-4.9.0-x32/x86_64-unknown-linux-gnu/sys-include -c -g -O2
 -mx32 -fpic  -W -Wall -gnatpg -nostdinc -mx32  s-osinte.adb -o
 s-osinte.o

 s-osinte.adb:101:17: operator for type System.Linux.time_t is not
 directly visible
 s-osinte.adb:101:17: use clause would make operation legal
 make[11]: *** [s-osinte.o] Error 1

 Probably:

 Index: s-osinte-x32.adb
 ===
 --- s-osinte-x32.adb(revision 209244)
 +++ s-osinte-x32.adb(working copy)
 @@ -90,6 +90,7 @@ package body System.OS_Interface is
S : time_t;
F : Duration;

 +  use type System.Linux.time_t;
 begin
S := time_t (Long_Long_Integer (D));
F := D - Duration (S);


It compiles.  I will run GCC test.

Thanks.

-- 
H.J.

Re: Patch ping

2014-04-09 Thread DJ Delorie


 - http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01370.html
   PR sanitizer/56781
   fix --with-build-config=bootstrap-ubsan bootstrap of lto-plugin

I have no particular problem with this patch, although the build has
gotten beyond my full understanding these days...

However, does this fix a regression?  If not, it should wait for
stage1.

 - http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01433.html
   PR sanitizer/56781
   fix --with-build-config=bootstrap-asan bootstrap of lto-plugin

Are we really going to multilib libiberty for every useful option we
think of?  For the build/host, we have a generic way of providing
CFLAGS, and for the target we already have a multilib structure.

Re: Please revert the patches in bug #54040 and #59346 and special case x32

2014-04-09 Thread Eric Botcazou

 It compiles.  I will run GCC test.

Thanks.  I installed the fixlets in the meantime.

-- 
Eric Botcazou

[wwwdocs] Consolidate GCC web pages documentation (4/3)

2014-04-09 Thread Gerald Pfeifer

Merge the remainder of projects/web.html into about.html and
shorten the latter on the way.  Set up and adjust redirects
accordingly.

Applied.


And that's it as far as this mini project goes.  4 of 3 is already
a bit much. ;-)

Gerald


Index: about.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/about.html,v
retrieving revision 1.21
diff -u -r1.21 about.html
--- about.html  15 Mar 2014 11:44:18 -  1.21
+++ about.html  9 Apr 2014 23:13:16 -
@@ -8,14 +8,14 @@
 
 <h1>GCC: About</h1>
 
-<p>These pages are maintained by the GCC team, which consists of
-numerous
-<a href="http://gcc.gnu.org/onlinedocs/gcc/Contributors.html">
-contributors</a>.</p>
+<p>These pages are maintained by the GCC team and it's easy to
+<a href="../contribute.html#webchanges">contribute</a>.</p>
 
 <p>The web effort was originally led by Jeff Law.  For the last decade
 or so Gerald Pfeifer has been leading the effort, but there are
-<em>lots</em> of people who contribute.</p>
+many
+<a href="http://gcc.gnu.org/onlinedocs/gcc/Contributors.html">contributors
+</a>.</p>
 
 <p>The web pages are under <a href="#cvs">CVS control</a> and you
 can <a href="http://gcc.gnu.org/cgi-bin/cvsweb.cgi/wwwdocs/">browse
Index: .htaccess
===
RCS file: /cvs/gcc/wwwdocs/htdocs/.htaccess,v
retrieving revision 1.30
diff -u -r1.30 .htaccess
--- .htaccess   31 Oct 2013 23:32:12 -  1.30
+++ .htaccess   9 Apr 2014 23:13:16 -
@@ -50,9 +50,10 @@
 Redirect permanent /proj-cpplib.html   
http://gcc.gnu.org/projects/cpplib.html
 Redirect permanent /proj-optimize.html 
http://gcc.gnu.org/projects/optimize.html
 Redirect permanent /projects.html  http://gcc.gnu.org/projects/
+Redirect permanent /projects/web.html  http://gcc.gnu.org/about.html
 Redirect permanent /reghunt-howto.html 
http://gcc.gnu.org/bugs/reghunt.html
 Redirect permanent /thanks.html
http://gcc.gnu.org/onlinedocs/gcc/Contributors.html
 Redirect permanent /timeline.html  
http://gcc.gnu.org/releases.html#timeline
-Redirect permanent /web.html   
http://gcc.gnu.org/projects/web.html
+Redirect permanent /web.html   http://gcc.gnu.org/about.html
 
 Redirect   /onlinedocs/ref 
http://gcc.gnu.org/onlinedocs/gcc-4.3.2/
Index: projects/web.html
===
RCS file: projects/web.html
diff -N projects/web.html
--- projects/web.html   15 Mar 2014 11:44:18 -  1.16
+++ /dev/null   1 Jan 1970 00:00:00 -
@@ -1,15 +0,0 @@
-<html>
-
-<head>
-<title>GCC: Web Pages</title>
-</head>
-
-<body>
-
-<h1>GCC: Web Pages</h1>
-
-<p><a href="../contribute.html#webchanges">Contributing changes</a>
-to <a href="../about.html">our web pages</a> is simple.</p>
-
-</body>
-</html>

[PATCH, x86] merge movsd/movhpd pair in peephole

2014-04-09 Thread Wei Mi

Hi,

For the testcase 1.c

#include <emmintrin.h>

double a[1000];

__m128d foo1() {
  __m128d res;
  res = _mm_load_sd(&a[1]);
  res = _mm_loadh_pd(res, &a[2]);
  return res;
}

llvm will merge movsd/movhpd to movupd while gcc will not. The merge
is beneficial on x86 machines starting from Nehalem.

The patch is to add the merging in peephole.
bootstrap and regression pass. Is it ok for stage1?
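
The adjacency test that gates the merge can be sketched in isolation as
follows.  This is a simplified C++ model, not the patch itself: the MemRef
struct and the integer base identifier are hypothetical stand-ins for the
BASE/OFFSET/SIZE parts that the patch extracts from an rtx.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the BASE/OFFSET/SIZE triple of a memory
// reference.  'base' is an opaque id for a register or symbol.
struct MemRef {
  int base;
  std::int64_t offset;  // byte offset from the base
  std::int64_t size;    // access size in bytes
};

// True when 'lo' and 'hi' share a base and 'hi' starts exactly where
// 'lo' ends, i.e. the two accesses form one contiguous block with 'lo'
// at the lower address.
bool adjacent(const MemRef &lo, const MemRef &hi) {
  assert(lo.size > 0 && hi.size > 0);
  return lo.base == hi.base && lo.offset + lo.size == hi.offset;
}
```

For the testcase above, a[1] would be modelled as offset 8, size 8 and a[2]
as offset 16, size 8 on the same base, so the pair counts as adjacent only
in that order.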

Thanks,
Wei.

gcc/ChangeLog:

2014-04-09  Wei Mi  w...@google.com

* config/i386/i386.c (get_memref_parts): New function.
(adjacent_mem_locations): Ditto.
* config/i386/i386-protos.h: Add decl for adjacent_mem_locations.
* config/i386/sse.md: Add define_peephole rule.

gcc/testsuite/ChangeLog:

2014-04-09  Wei Mi  w...@google.com

* gcc.target/i386/sse2-unaligned-mov.c: New test.

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 6e32978..3ae0d6d 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -312,6 +312,7 @@ extern enum attr_cpu ix86_schedule;
 #endif

 extern const char * ix86_output_call_insn (rtx insn, rtx call_op);
+extern bool adjacent_mem_locations (rtx mem1, rtx mem2);

 #ifdef RTX_CODE
 /* Target data for multipass lookahead scheduling.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3eefe4a..a330e84 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -46737,6 +46737,70 @@ ix86_atomic_assign_expand_fenv (tree *hold,
tree *clear, tree *update)
atomic_feraiseexcept_call);
 }

+/* Try to determine BASE/OFFSET/SIZE parts of the given MEM.
+   Return true if successful, false if all the values couldn't
+   be determined.
+
+   This function only looks for REG/SYMBOL or REG/SYMBOL+CONST
+   address forms. */
+
+static bool
+get_memref_parts (rtx mem, rtx *base, HOST_WIDE_INT *offset,
+ HOST_WIDE_INT *size)
+{
+  rtx addr_rtx;
+  if MEM_SIZE_KNOWN_P (mem)
+*size = MEM_SIZE (mem);
+  else
+return false;
+
+  if (GET_CODE (XEXP (mem, 0)) == CONST)
+addr_rtx = XEXP (XEXP (mem, 0), 0);
+  else
+addr_rtx = (XEXP (mem, 0));
+
+  if (GET_CODE (addr_rtx) == REG
+  || GET_CODE (addr_rtx) == SYMBOL_REF)
+{
+  *base = addr_rtx;
+  *offset = 0;
+}
+  else if (GET_CODE (addr_rtx) == PLUS
+           && CONST_INT_P (XEXP (addr_rtx, 1)))
+{
+  *base = XEXP (addr_rtx, 0);
+  *offset = INTVAL (XEXP (addr_rtx, 1));
+}
+  else
+return false;
+
+  return true;
+}
+
+/* If MEM1 is adjacent to MEM2 and MEM1 has lower address,
+   return true.  */
+
+extern bool
+adjacent_mem_locations (rtx mem1, rtx mem2)
+{
+  rtx base1, base2;
+  HOST_WIDE_INT off1, size1, off2, size2;
+
+  if (get_memref_parts (mem1, &base1, &off1, &size1)
+      && get_memref_parts (mem2, &base2, &off2, &size2))
+    {
+      if (GET_CODE (base1) == SYMBOL_REF
+          && GET_CODE (base2) == SYMBOL_REF
+          && SYMBOL_REF_DECL (base1) == SYMBOL_REF_DECL (base2))
+        return (off1 + size1 == off2);
+      else if (REG_P (base1)
+               && REG_P (base2)
+               && REGNO (base1) == REGNO (base2))
+        return (off1 + size1 == off2);
+    }
+  return false;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 72a4d6d..4bf8461 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -15606,3 +15606,37 @@
   [(set_attr "type" "sselog1")
    (set_attr "length_immediate" "1")
    (set_attr "mode" "TI")])
+
+;; merge movsd/movhpd to movupd when TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+;; is true.
+(define_peephole2
+  [(set (match_operand:DF 0 "register_operand")
+        (match_operand:DF 1 "memory_operand"))
+   (set (match_operand:V2DF 2 "register_operand")
+        (vec_concat:V2DF (match_dup 0)
+                         (match_operand:DF 3 "memory_operand")))]
+  "TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+   && REGNO (operands[0]) == REGNO (operands[2])
+   && adjacent_mem_locations (operands[1], operands[3])"
+  [(set (match_dup 2)
+        (unspec:V2DF [(match_dup 4)] UNSPEC_LOADU))]
+{
+  operands[4] = gen_rtx_MEM (V2DFmode, XEXP(operands[1], 0));
+})
+
+;; merge movsd/movhpd to movupd when TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+;; is true.
+(define_peephole2
+  [(set (match_operand:DF 0 "memory_operand")
+        (vec_select:DF (match_operand:V2DF 1 "register_operand")
+                       (parallel [(const_int 0)])))
+   (set (match_operand:DF 2 "memory_operand")
+        (vec_select:DF (match_dup 1)
+                       (parallel [(const_int 1)])))]
+  "TARGET_SSE_UNALIGNED_STORE_OPTIMAL
+   && adjacent_mem_locations (operands[0], operands[2])"
+  [(set (match_dup 3)
+        (unspec:V2DF [(match_dup 1)] UNSPEC_STOREU))]
+{
+  operands[3] = gen_rtx_MEM (V2DFmode, XEXP(operands[0], 0));
+})
diff --git a/gcc/testsuite/gcc.target/i386/sse2-unaligned-mov.c
b/gcc/testsuite/gcc.target/i386/sse2-unaligned-mov.c
new file mode 100644
index

Re: [PATCH, x86] merge movsd/movhpd pair in peephole

2014-04-09 Thread Bin.Cheng


Re: [PATCH, x86] merge movsd/movhpd pair in peephole

2014-04-09 Thread Wei Mi

Hi Bin,

Yes, we have the same problem that if movsd and movhpd are separated,
peephole cannot merge them. The patch could solve the motivational
performance issue we saw to a good extent, but maybe there is still
space to improve if peephole misses some pairs. Glad to know you are
working on this part. It is the same thing we want. Look forward to
your patch.

Thanks,
Wei.


Re: Patch ping

2014-04-09 Thread Jeff Law


On 04/09/14 07:07, Jakub Jelinek wrote:

Hi!

I'd like to ping:

- http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01370.html
   PR sanitizer/56781
   fix --with-build-config=bootstrap-ubsan bootstrap of lto-plugin

- http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01433.html
   PR sanitizer/56781
   fix --with-build-config=bootstrap-asan bootstrap of lto-plugin

Like DJ, I think these should wait until the next stage1.  They're 
primarily of interest to GCC developers and they don't fix a regression 
AFAIK.


Jeff
