[Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.

2013-09-28 Thread Iain Sandoe
Hi,

This updates the Darwin port {t,x}-* fragments after the switch to auto-deps 
(thanks Tom!).

Bootstrapped (all languages, including Ada) on i686-darwin9 (bootstrap=gcc-4.8),
i686-darwin10, x86_64-darwin11, x86_64-darwin12 (bootstrap=recent trunk), and
powerpc-darwin9 (c,c++,lto,objc,fortran) (bootstrap=gcc-4.0.1).

OK?
Iain

gcc:

* config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o):
Use COMPILE and POSTCOMPILE.
* config/x-darwin (host-darwin.o): Likewise.
* config/i386/x-darwin (host-i386-darwin.o): Likewise.
* config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
* config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.

diff --git a/gcc/config/i386/x-darwin b/gcc/config/i386/x-darwin
index f0196ba..4967d69 100644
--- a/gcc/config/i386/x-darwin
+++ b/gcc/config/i386/x-darwin
@@ -1,4 +1,3 @@
-host-i386-darwin.o : $(srcdir)/config/i386/host-i386-darwin.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h hosthooks.h $(HOSTHOOKS_DEF_H) \
-  config/host-darwin.h
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+host-i386-darwin.o : $(srcdir)/config/i386/host-i386-darwin.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
diff --git a/gcc/config/rs6000/x-darwin b/gcc/config/rs6000/x-darwin
index 5672c69..9d92ef5 100644
--- a/gcc/config/rs6000/x-darwin
+++ b/gcc/config/rs6000/x-darwin
@@ -1,5 +1,3 @@
-host-ppc-darwin.o : $(srcdir)/config/rs6000/host-darwin.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h hosthooks.h $(HOSTHOOKS_DEF_H) toplev.h \
-  config/host-darwin.h $(DIAGNOSTIC_H)
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
-   $(INCLUDES) $< -o $@
+host-ppc-darwin.o : $(srcdir)/config/rs6000/host-darwin.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
diff --git a/gcc/config/rs6000/x-darwin64 b/gcc/config/rs6000/x-darwin64
index 921d555..0932771 100644
--- a/gcc/config/rs6000/x-darwin64
+++ b/gcc/config/rs6000/x-darwin64
@@ -1,5 +1,3 @@
-host-ppc64-darwin.o : $(srcdir)/config/rs6000/host-ppc64-darwin.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h hosthooks.h $(HOSTHOOKS_DEF_H) toplev.h \
-  config/host-darwin.h $(DIAGNOSTIC_H)
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
-   $(INCLUDES) $< -o $@
+host-ppc64-darwin.o : $(srcdir)/config/rs6000/host-ppc64-darwin.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
diff --git a/gcc/config/t-darwin b/gcc/config/t-darwin
index fdd52c2..87d5df7 100644
--- a/gcc/config/t-darwin
+++ b/gcc/config/t-darwin
@@ -18,25 +18,19 @@
 
 TM_H += $(srcdir)/config/darwin-sections.def
 
-darwin.o: $(srcdir)/config/darwin.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  $(TM_H) $(RTL_H) $(REGS_H) hard-reg-set.h $(REAL_H) insn-config.h \
-  conditions.h insn-flags.h output.h insn-attr.h flags.h $(TREE_H) expr.h   \
-  reload.h function.h $(GGC_H) langhooks.h $(TARGET_H) $(TM_P_H) gt-darwin.h \
-  config/darwin-sections.def
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-   $(srcdir)/config/darwin.c
+darwin.o: $(srcdir)/config/darwin.c config/darwin-sections.def
+   $(COMPILE) $<
+   $(POSTCOMPILE)
 
-darwin-c.o: $(srcdir)/config/darwin-c.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  $(TM_H) $(CPPLIB_H) $(TREE_H) $(C_PRAGMA_H) $(TM_P_H) \
-  incpath.h flags.h $(C_COMMON_H) $(C_TARGET_H) $(C_TARGET_DEF_H) $(CPP_INTERNAL_H)
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-   $(srcdir)/config/darwin-c.c $(PREPROCESSOR_DEFINES)
+darwin-c.o: $(srcdir)/config/darwin-c.c
+   $(COMPILE) $(PREPROCESSOR_DEFINES) $<
+   $(POSTCOMPILE)
 
-darwin-f.o: $(srcdir)/config/darwin-f.c $(CONFIG_H) $(SYSTEM_H) coretypes.h
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
- $(srcdir)/config/darwin-f.c $(PREPROCESSOR_DEFINES)
 
-darwin-driver.o: $(srcdir)/config/darwin-driver.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(GCC_H) opts.h
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
- $(srcdir)/config/darwin-driver.c
+darwin-f.o: $(srcdir)/config/darwin-f.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
+
+darwin-driver.o: $(srcdir)/config/darwin-driver.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)
diff --git a/gcc/config/x-darwin b/gcc/config/x-darwin
index f671d91..c6226c0 100644
--- a/gcc/config/x-darwin
+++ b/gcc/config/x-darwin
@@ -1,3 +1,3 @@
-host-darwin.o : $(srcdir)/config/host-darwin.c $(CONFIG_H) $(SYSTEM_H) \
-  coretypes.h toplev.h config/host-darwin.h
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+host-darwin.o : $(srcdir)/config/host-darwin.c
+   $(COMPILE) $<
+   $(POSTCOMPILE)



[Patch, Darwin] improve cross-compiles.

2013-09-28 Thread Iain Sandoe
I've been experimenting with the idea of building native crosses on my most 
capable machine, for the many variants of darwin we now have, and then using 
the older/slower hardware for test only.

This has uncovered a few issues with cross/native cross flags etc.

This patch adjusts the mh-darwin fragment to ensure
(i)   that PIE is disabled for gcc exes on Darwin hosts since it is 
incompatible with the current PCH implementation.
(ii)  that -mdynamic-no-pic is used for m32 hosts.

… for crosses as well as bootstraps (and, also, for stage1 compilations when 
bootstrapping on a darwin host).
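
The host patterns used in the fragment can be checked outside of make. The following is a minimal sketch, assuming that POSIX fnmatch(3) with no flags matches these patterns the same way the shell's `case` does; the two helper names are hypothetical and not part of GCC, while the pattern strings are copied from the patch.

```c
#include <fnmatch.h>

/* Hypothetical helper (not part of GCC): returns 1 when HOST is one of the
   m32 Darwin hosts that the fragment gives -mdynamic-no-pic, else 0.  */
int sketch_wants_dynamic_no_pic(const char *host)
{
    return fnmatch("i?86-*-darwin*", host, 0) == 0
        || fnmatch("powerpc-*-darwin*", host, 0) == 0;
}

/* Hypothetical helper: returns 1 when HOST gets -Wl,-no_pie, i.e. when the
   triplet ends in darwin11 .. darwin19 (optionally followed by more text).  */
int sketch_wants_no_pie(const char *host)
{
    return fnmatch("*-*-darwin[1][1-9]*", host, 0) == 0;
}
```

With these patterns an i686 or powerpc Darwin triplet selects -mdynamic-no-pic, an x86_64 or powerpc64 triplet does not, and only darwin11 and later select -Wl,-no_pie.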

OK for trunk?
Iain

config:

* mh-darwin (BOOT_CFLAGS): Only add -mdynamic-no-pic for m32 hosts.
(STAGE1_CFLAGS, STAGE1_LDFLAGS): New.
Fix over-length lines and amend comments.

diff --git a/config/mh-darwin b/config/mh-darwin
index 19bf265..a039f20 100644
--- a/config/mh-darwin
+++ b/config/mh-darwin
@@ -1,7 +1,18 @@
 # The -mdynamic-no-pic ensures that the compiler executable is built without
 # position-independent-code -- the usual default on Darwin. This fix speeds
 # compiles by 3-5%.
-BOOT_CFLAGS += -mdynamic-no-pic
+BOOT_CFLAGS += \
+`case ${host} in i?86-*-darwin* | powerpc-*-darwin*) \
+ echo -mdynamic-no-pic ;; esac;`
 
-# Ensure we don't try and use -pie, as it is incompatible with pch.
-BOOT_LDFLAGS += `case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`
+# ld on Darwin versions >= 10.7 defaults to PIE executables. Disable this for
+# gcc components, since it is incompatible with our pch implementation.
+BOOT_LDFLAGS += \
+`case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`
+
+# Similarly, for cross-compilation.
+STAGE1_CFLAGS += \
+`case ${host} in i?86-*-darwin* | powerpc-*-darwin*)\
+ echo -mdynamic-no-pic ;; esac;`
+STAGE1_LDFLAGS += \
+`case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`



[Patch, Darwin/ppc] Fix altivec dwarf reg sizes.

2013-09-28 Thread Iain Sandoe
Hi!

We have this cunning legacy scheme to support unwinding on both G3 and G4/G5 
processors.  Effectively, we build some components without altivec support, and 
then test for its presence at runtime.  To do this we pretend that altivec 
is absent when building init_unwind - and therefore all the altivec regs get a 
default size of 1 for dwarf purposes.  This, naturally, breaks the dwarf 
unwinder for altivec cases (simd-3 and 4 fail, for example).  I guess it didn't 
matter when originally authored, since STABS was the debug scheme then.

Anyway, after considerable debate about this and several approaches, here is a 
patch that just ensures we set the altivec register size to its correct value. 
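
As a rough model of the problem and the fix: the unwinder consults a table of per-register sizes, and a build that pretends altivec is absent leaves the vector entries at 1. The sketch below is only an illustration under stated assumptions; the register numbering, the table size and the function name are made up for the example and are not taken from rs6000.h or from the patch.

```c
#include <string.h>

/* Illustrative numbering only; the real values live in the rs6000 headers.  */
enum {
    SKETCH_NUM_REGS      = 128,
    SKETCH_FIRST_ALTIVEC = 77,
    SKETCH_LAST_ALTIVEC  = 108
};

/* Fill SIZES the way a build without altivec support would (vector
   registers left at the placeholder size 1) and, when APPLY_FIX is
   nonzero, overwrite those entries with the real 16-byte size.  */
void sketch_init_reg_sizes(unsigned char sizes[SKETCH_NUM_REGS], int apply_fix)
{
    int i;

    /* Stand-in size for every other register; not meaningful here.  */
    memset(sizes, 4, SKETCH_NUM_REGS);

    for (i = SKETCH_FIRST_ALTIVEC; i <= SKETCH_LAST_ALTIVEC; i++)
        sizes[i] = 1;

    if (apply_fix)
        for (i = SKETCH_FIRST_ALTIVEC; i <= SKETCH_LAST_ALTIVEC; i++)
            sizes[i] = 16;
}
```

An unwinder that copies `sizes[i]` bytes per saved register would restore only one byte of each vector register without the fix, which is consistent with the simd failures mentioned above.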

I've had this in my local tree for ~2 years ...

OK for trunk and open branches? (we're generating wrong code)
Iain

gcc:

* config/rs6000/rs6000.c (rs6000_init_dwarf_reg_sizes_extra): Ensure
that altivec registers are correctly sized on Darwin.


diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 7ff0af9..4e9a92b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -28992,6 +28992,27 @@ rs6000_init_dwarf_reg_sizes_extra (tree address)
  emit_move_insn (adjust_address (mem, mode, offset), value);
}
 }
+
+  if (TARGET_MACHO && ! TARGET_ALTIVEC)
+{
+  int i;
+  enum machine_mode mode = TYPE_MODE (char_type_node);
+  rtx addr = expand_expr (address, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+  rtx mem = gen_rtx_MEM (BLKmode, addr);
+  rtx value = gen_int_mode (16, mode);
+
+  /* On Darwin, libgcc may be built to run on both G3 and G4/5.
+The unwinder still needs to know the size of Altivec registers.  */
+
+  for (i = FIRST_ALTIVEC_REGNO; i < LAST_ALTIVEC_REGNO+1; i++)
+   {
+ int column = DWARF_REG_TO_UNWIND_COLUMN (i);
+ HOST_WIDE_INT offset
+   = DWARF_FRAME_REGNUM (column) * GET_MODE_SIZE (mode);
+
+ emit_move_insn (adjust_address (mem, mode, offset), value);
+   }
+}
 }
 
 /* Map internal gcc register numbers to DWARF2 register numbers.  */



Re: Context sensitive type inheritance graph walking

2013-09-28 Thread Jan Hubicka
 Hi,
 
 sorry it took me so long, but it also took me quite a while to chew
 through.  Please consider posting context diff in cases like this.
 Nevertheless, most of the patch is a nice improvement.

Uhm, sorry. Seems I diffed from a different users.
  There is cgraph_indirect_call_info that walks GIMPLE code and attempts to
  determine the context of a given call.  It looks for objects located in
  declarations (static vars/ automatic vars/parameters), objects passed by
  invisible references and objects passed as THIS pointers.
  The second two cases are new, the first case is already done by 
  gimple_extract_devirt_binfo_from_cst
 
 and I assume we should really put the context there, rather than
 reconstructing it from the edge.  Of course we must stop overloading
 the offset field for that, are there any other obstacles?

No, I think overloading of offset is the only obstacle.  I just tried to keep
the patch self-contained and not dive into ipa-prop changes - it is long by
itself.
  -/* See if BINFO's type match OTR_TYPE.  If so, lookup method
  -   in vtable of TYPE_BINFO and insert method to NODES array.
  +/* See if BINFO's type match OUTER_TYPE.  If so, lookup 
  +   BINFO of subtype of TYPE at OFFSET and in that BINFO find
  +   method in vtable and insert method to NODES array.
  Otherwise recurse to base BINFOs.
  This match what get_binfo_at_offset does, but with offset
  being unknown.
 
 This function now needs a comprehensive update of the leading comment,
 we have the offset,  so it is known.  I also dislike the name a lot
 because it does not record binfo, but extracts and records the call
 target from it.  Can we call it something like
 record_target_from_binfo or similar?

OK, record_target_from_binfo works for me.

We still do not know the full offset (from the start of the type being walked),
just the partial offset within one of the bases of the type.  But I will try to
formulate the comment better - it is indeed the result of incremental updates.
  -  if (types_same_for_odr (type, otr_type)
  -      && !pointer_set_insert (matched_vtables, BINFO_VTABLE (type_binfo)))
  +  if (types_same_for_odr (type, outer_type))
   {
  +  tree inner_binfo = get_binfo_at_offset (type_binfo,
  + offset, otr_type);
 
 OK, get_binfo_at_offset also traverses BINFO_BASEs, I wonder whether
 we need to iterate over them and recurse when types_same_for_odr
 return false, with offset, won't get_binfo_at_offset just handle both
 cases correctly?

No, it is the difference I described above.

get_binfo_at_offset assumes that the offset is from the start of the BINFO's
type it is given.  This is not true here.  We have derived_type that has
outer_type as a base, and outer_type has otr_type at offset inside.

We do not know the offset between derived_type and outer_type.  This is why
one function wraps the other.
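
The two offsets can be pictured with plain C structs standing in for C++ base subobjects. This is only an illustrative sketch: the struct and function names are hypothetical, and nested structs are an assumption made to keep the example in C; real BINFO layout is more involved.

```c
#include <stddef.h>

/* Stand-ins for otr_type, outer_type and derived_type.  */
struct sketch_otr     { void *vptr; int payload; };
struct sketch_outer   { long outer_field; struct sketch_otr inner; };
struct sketch_derived { long derived_field[2]; struct sketch_outer base; };

/* Offset of the otr_type-like subobject from the start of outer_type:
   the partial offset that is known during the walk.  */
size_t sketch_known_offset(void)
{
    return offsetof(struct sketch_outer, inner);
}

/* Offset of outer_type inside one particular derived type: this part
   differs per derived type and is not known in advance.  */
size_t sketch_unknown_offset(void)
{
    return offsetof(struct sketch_derived, base);
}

/* Offset from the start of the derived type, which is what a lookup
   starting at the derived type's BINFO would need.  */
size_t sketch_full_offset(void)
{
    return offsetof(struct sketch_derived, base.inner);
}
```

The full offset is the sum of the other two, so a lookup that starts from derived_type with only the known part would land on the wrong subobject.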
  +/* Given REF call in FNDECL, determine class of the polymorphic
  +   call (OTR_TYPE), its token (OTR_TOKEN) and CONTEXT.
  +   Return pointer to object described by the context  */
  +
 
 The return value is never used, Is it ever going to be useful?
 Especially since it can be NULL even in useful cases...

Yes, it is supposed to be used by ipa-prop.  We return non-NULL when
the base may be a PARM_DECL that can be further propagated through.

 
 
  +tree
  +get_polymorphic_call_info (tree fndecl,
  +  tree ref,
  +  tree *otr_type,
  +  HOST_WIDE_INT *otr_token,
  +  ipa_polymorphic_call_context *context)
  +{
  +  tree base_pointer;
  +  *otr_type = obj_type_ref_class (ref);
  +  *otr_token = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1);
  +
  +  /* Set up basic info in case we find nothing interesting in the analysis.  */
  +  context->outer_type = *otr_type;
  +  context->offset = 0;
  +  base_pointer = OBJ_TYPE_REF_OBJECT (ref);
  +  context->maybe_derived_type = true;
  +  context->maybe_in_construction = false;
  +
  +  /* Walk SSA for outer object.  */
  +  do 
  +{
  +  if (TREE_CODE (base_pointer) == SSA_NAME
  +      && !SSA_NAME_IS_DEFAULT_DEF (base_pointer)
  +      && SSA_NAME_DEF_STMT (base_pointer)
  +      && gimple_assign_single_p (SSA_NAME_DEF_STMT (base_pointer)))
  +   {
  + base_pointer = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (base_pointer));
  + STRIP_NOPS (base_pointer);
 
 If we want to put the context on the edges, we need to adjust the
 offset here.

I do not follow here, strip nops should not alter offsets, right?
  + context->offset += offset2;
  + base_pointer = NULL;
  + /* Make very conservative assumption that all objects
  +may be in construction. 
  +TODO: ipa-prop already contains code to tell better. 
  +merge it later.  */
  + context->maybe_in_construction = true;
  + context->maybe_derived_type = false;
  + return base_pointer;
 
 

[Patch, Darwin/PPC] fix PR10901

2013-09-28 Thread Iain Sandoe
Hi,
This might be the oldest bug I've fixed so far.

We currently generate wrong code for non-local gotos which breaks, amongst 
other things, nested functions.
I fixed this a while ago for x86 Darwin and here is a version to fix it on PPC.
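
For readers unfamiliar with the construct, here is a minimal sketch of a non-local goto out of a nested function. It relies on GNU C extensions (nested functions and `__label__`), so it needs GCC; the function name is hypothetical and the code is not taken from the PR or the testsuite.

```c
/* GNU C only: nested functions and __label__ are GCC extensions.  */
int sketch_nonlocal_exit(int bail_out)
{
    __label__ out;            /* local label, visible to the nested function */
    int reached_end = 0;

    void helper(void)
    {
        if (bail_out)
            goto out;         /* non-local goto into the containing function */
    }

    helper();
    reached_end = 1;          /* skipped when helper() jumps to 'out' */
out:
    return reached_end;
}
```

Code at the receiving label runs in the containing function after control arrives from another frame, so anything that function relies on, such as the PIC base register on Darwin, has to be valid again at that point.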

(The patch is Darwin-local, save for the definitions of the UNSPECs.)

This has been in my (and Dominique's) ppc tree for some time.

OK for trunk? (and open branches?) - long-standing, wrong-code bug.
Iain

gcc:
PR target/10901
* config/darwin-protos.h (machopic_get_function_picbase): New.
* config/darwin.c (machopic_get_function_picbase): New.
* config/rs6000/darwin.md (load_macho_picbase_si): Update picbase
label for a new func.  (load_macho_picbase_di): Likewise.
(reload_macho_picbase): New expand.
(reload_macho_picbase_si): New insn.
(reload_macho_picbase_di): New insn.
(nonlocal_goto_receiver): New define and split.
* config/rs6000/rs6000.md (unspec enum): Add UNSPEC_RELD_MPIC.
(unspecv enum): Add UNSPECV_NLGR.


diff --git a/gcc/config/darwin-protos.h b/gcc/config/darwin-protos.h
index 36d16b9..fe43ef3 100644
--- a/gcc/config/darwin-protos.h
+++ b/gcc/config/darwin-protos.h
@@ -26,6 +26,7 @@ extern void machopic_output_function_base_name (FILE *);
 extern const char *machopic_indirection_name (rtx, bool);
 extern const char *machopic_mcount_stub_name (void);
 extern bool machopic_should_output_picbase_label (void);
+extern const char *machopic_get_function_picbase (void);
 
 #ifdef RTX_CODE
 
diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index ab48558..cb1bc38 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -405,6 +405,19 @@ machopic_output_function_base_name (FILE *file)
   fprintf (file, "L%d$pb", current_pic_label_num);
 }
 
+char curr_picbasename[32];
+
+const char *
+machopic_get_function_picbase (void)
+{
+  /* If dynamic-no-pic is on, we should not get here.  */
+  gcc_assert (!MACHO_DYNAMIC_NO_PIC_P);
+
+  update_pic_label_number_if_needed ();
+  snprintf (curr_picbasename, 32, "L%d$pb", current_pic_label_num);
+  return (const char *) curr_picbasename;
+}
+
 bool
 machopic_should_output_picbase_label (void)
 {
diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md
index 24e8cfa..0fb2422 100644
--- a/gcc/config/rs6000/darwin.md
+++ b/gcc/config/rs6000/darwin.md
@@ -260,7 +260,10 @@ You should have received a copy of the GNU General Public License
        (unspec:SI [(match_operand:SI 0 "immediate_operand" "s")
                    (pc)] UNSPEC_LD_MPIC))]
   "(DEFAULT_ABI == ABI_DARWIN) && flag_pic"
-  "bcl 20,31,%0\\n%0:"
+{
+  machopic_should_output_picbase_label (); /* Update for new func.  */
+  return "bcl 20,31,%0\\n%0:";
+}
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
 
@@ -269,7 +272,10 @@ You should have received a copy of the GNU General Public License
        (unspec:DI [(match_operand:DI 0 "immediate_operand" "s")
                    (pc)] UNSPEC_LD_MPIC))]
   "(DEFAULT_ABI == ABI_DARWIN) && flag_pic && TARGET_64BIT"
-  "bcl 20,31,%0\\n%0:"
+{
+  machopic_should_output_picbase_label (); /* Update for new func.  */
+  return "bcl 20,31,%0\\n%0:";
+}
   [(set_attr "type" "branch")
    (set_attr "length" "4")])
 
@@ -370,3 +376,86 @@ You should have received a copy of the GNU General Public License
 }
   [(set_attr "type" "branch,branch")
    (set_attr "length" "4,8")])
+
+(define_expand "reload_macho_picbase"
+  [(set (reg:SI 65)
+        (unspec [(match_operand 0 "" "")]
+                   UNSPEC_RELD_MPIC))]
+  "(DEFAULT_ABI == ABI_DARWIN) && flag_pic"
+{
+  if (TARGET_32BIT)
+    emit_insn (gen_reload_macho_picbase_si (operands[0]));
+  else
+    emit_insn (gen_reload_macho_picbase_di (operands[0]));
+
+  DONE;
+})
+
+(define_insn "reload_macho_picbase_si"
+  [(set (reg:SI 65)
+        (unspec:SI [(match_operand:SI 0 "immediate_operand" "s")
+                    (pc)] UNSPEC_RELD_MPIC))]
+  "(DEFAULT_ABI == ABI_DARWIN) && flag_pic"
+{
+  if (machopic_should_output_picbase_label ())
+    {
+      static char tmp[64];
+      const char *cnam = machopic_get_function_picbase ();
+      snprintf (tmp, 64, "bcl 20,31,%s\\n%s:\\n%%0:", cnam, cnam);
+      return tmp;
+    }
+  else
+    return "bcl 20,31,%0\\n%0:";
+}
+  [(set_attr "type" "branch")
+   (set_attr "length" "4")])
+
+(define_insn "reload_macho_picbase_di"
+  [(set (reg:DI 65)
+        (unspec:DI [(match_operand:DI 0 "immediate_operand" "s")
+                    (pc)] UNSPEC_RELD_MPIC))]
+  "(DEFAULT_ABI == ABI_DARWIN) && flag_pic && TARGET_64BIT"
+{
+  if (machopic_should_output_picbase_label ())
+    {
+      static char tmp[64];
+      const char *cnam = machopic_get_function_picbase ();
+      snprintf (tmp, 64, "bcl 20,31,%s\\n%s:\\n%%0:", cnam, cnam);
+      return tmp;
+    }
+  else
+    return "bcl 20,31,%0\\n%0:";
+}
+  [(set_attr "type" "branch")
+   (set_attr "length" "4")])
+
+;; We need to restore the PIC register, at the site of nonlocal label.
+
+(define_insn_and_split "nonlocal_goto_receiver"
+  [(unspec_volatile [(const_int 0)] UNSPECV_NLGR)]

Re: [RFC Patch, Aarch64] : Macros for profile code generation to enable gprof support

2013-09-28 Thread Venkataramanan Kumar
Hi Marcus,

I have rebased the patch and tested it on aarch64-none-elf with no regressions.
Also, for aarch64-unknown-linux-gnu, the following test cases now pass.

Before:

UNSUPPORTED: gcc.dg/nested-func-4.c
UNSUPPORTED: gcc.dg/pr43643.c:
UNSUPPORTED: gcc.dg/nest.c
UNSUPPORTED: gcc.dg/20021014-1.c
UNSUPPORTED: gcc.dg/pr32450.c
UNSUPPORTED: g++.dg/other/profile1.C -std=gnu++98
UNSUPPORTED: g++.dg/other/profile1.C -std=gnu++11

After:
---
PASS: gcc.dg/nested-func-4.c (test for excess errors)
PASS: gcc.dg/nested-func-4.c execution test
PASS: gcc.dg/pr43643.c (test for excess errors)
PASS: gcc.dg/pr43643.c execution test
PASS: gcc.dg/nest.c (test for excess errors)
PASS: gcc.dg/nest.c execution test
PASS: gcc.dg/20021014-1.c (test for excess errors)
PASS: gcc.dg/20021014-1.c execution test
PASS: gcc.dg/pr32450.c (test for excess errors)
PASS: gcc.dg/pr32450.c execution test
PASS: g++.dg/other/profile1.C -std=gnu++98 (test for excess errors)
PASS: g++.dg/other/profile1.C -std=gnu++98 execution test
PASS: g++.dg/other/profile1.C -std=gnu++11 (test for excess errors)
PASS: g++.dg/other/profile1.C -std=gnu++11 execution test

Please let me know if I can commit it to trunk, given that glibc
patches are upstreamed.

2013-10-28  Venkataramanan Kumar  venkataramanan.ku...@linaro.org

   * config/aarch64/aarch64.h (MCOUNT_NAME): Define.
   (NO_PROFILE_COUNTERS): Likewise.
   (PROFILE_HOOK): Likewise.
   (FUNCTION_PROFILER): Likewise.
   *  config/aarch64/aarch64.c (aarch64_function_profiler): Remove.

regards,
Venkat.

On 27 August 2013 13:05, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 Hi Venkat,

 On 3 August 2013 19:01, Venkataramanan Kumar
 venkataramanan.ku...@linaro.org wrote:

 This patch adds macros to support gprof in Aarch64. The difference
 from the previous patch is that the compiler, while generating
 mcount routine for an instrumented function, also passes the return
 address as argument.

 The mcount routine in glibc will be modified as follows.

 (-Snip-)
  #define MCOUNT \
 -void __mcount (void) \
 +void __mcount (void* frompc) \
  { \
 -  mcount_internal ((u_long) RETURN_ADDRESS (1), (u_long) RETURN_ADDRESS (0)); \
 +  mcount_internal ((u_long) frompc, (u_long) RETURN_ADDRESS (0)); \
  }
 (-Snip-)
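
The effect of the interface change quoted above can be modelled in a few lines of C. This is a sketch under stated assumptions, not the glibc code: every `sketch_` name is hypothetical, the arc table is a toy, and `__builtin_return_address (0)` stands in for RETURN_ADDRESS (0).

```c
#include <stdint.h>

#define SKETCH_MAX_ARCS 16

struct sketch_arc { uintptr_t frompc, selfpc; unsigned long count; };

static struct sketch_arc sketch_arcs[SKETCH_MAX_ARCS];
static int sketch_num_arcs;

/* Toy stand-in for mcount_internal: count one caller -> callee arc.  */
static void sketch_mcount_internal(uintptr_t frompc, uintptr_t selfpc)
{
    int i;

    for (i = 0; i < sketch_num_arcs; i++)
        if (sketch_arcs[i].frompc == frompc && sketch_arcs[i].selfpc == selfpc) {
            sketch_arcs[i].count++;
            return;
        }
    if (sketch_num_arcs < SKETCH_MAX_ARCS) {
        sketch_arcs[sketch_num_arcs].frompc = frompc;
        sketch_arcs[sketch_num_arcs].selfpc = selfpc;
        sketch_arcs[sketch_num_arcs].count = 1;
        sketch_num_arcs++;
    }
}

/* New-style entry point: the instrumented function passes its own return
   address (FROMPC), so only one frame has to be inspected here.  */
void sketch_mcount(void *frompc)
{
    sketch_mcount_internal((uintptr_t) frompc,
                           (uintptr_t) __builtin_return_address(0));
}

/* Total number of recorded calls whose caller address is FROMPC.  */
unsigned long sketch_calls_from(uintptr_t frompc)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < sketch_num_arcs; i++)
        if (sketch_arcs[i].frompc == frompc)
            total += sketch_arcs[i].count;
    return total;
}
```

Passing the caller address explicitly means the profiling routine no longer needs RETURN_ADDRESS (1), which is the part the compiler-side PROFILE_HOOK supplies by loading the link register value.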


 If this is Ok I will send the patch to glibc as well.

 2013-08-02  Venkataramanan Kumar  venkataramanan.ku...@linaro.org

  * config/aarch64/aarch64.h (MCOUNT_NAME): Define.
(NO_PROFILE_COUNTERS): Likewise.
(PROFILE_HOOK): Likewise.
(FUNCTION_PROFILER): Likewise.
 *  config/aarch64/aarch64.c (aarch64_function_profiler): Remove.
.

 regards,
 Venkat.

 +  emit_library_call (fun, LCT_NORMAL, VOIDmode, 1,lr,Pmode); \
 +}

 GNU coding style requires spaces after the commas, but otherwise I
 have no further comments on this patch. Post the glibc patch please.

 Thanks
 /Marcus
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c(revision 202934)
+++ gcc/config/aarch64/aarch64.c(working copy)
@@ -3857,13 +3857,6 @@
   output_addr_const (f, x);
 }
 
-void
-aarch64_function_profiler (FILE *f ATTRIBUTE_UNUSED,
-  int labelno ATTRIBUTE_UNUSED)
-{
-  sorry ("function profiling");
-}
-
 bool
 aarch64_label_mentioned_p (rtx x)
 {
Index: gcc/config/aarch64/aarch64.h
===
--- gcc/config/aarch64/aarch64.h(revision 202934)
+++ gcc/config/aarch64/aarch64.h(working copy)
@@ -783,9 +783,23 @@
 #define PRINT_OPERAND_ADDRESS(STREAM, X) \
   aarch64_print_operand_address (STREAM, X)
 
-#define FUNCTION_PROFILER(STREAM, LABELNO) \
-  aarch64_function_profiler (STREAM, LABELNO)
+#define MCOUNT_NAME "_mcount"
 
+#define NO_PROFILE_COUNTERS 1
+
+/* Emit rtl for profiling.  Output assembler code to FILE
+   to call _mcount for profiling a function entry.  */
+#define PROFILE_HOOK(LABEL)\
+{  \
+  rtx fun,lr;  \
+  lr = get_hard_reg_initial_val (Pmode, LR_REGNUM);\
+  fun = gen_rtx_SYMBOL_REF (Pmode, MCOUNT_NAME);   \
+  emit_library_call (fun, LCT_NORMAL, VOIDmode, 1, lr, Pmode); \
+}
+
+/* All the work done in PROFILE_HOOK, but still required.  */
+#define FUNCTION_PROFILER(STREAM, LABELNO) do { } while (0)
+
 /* For some reason, the Linux headers think they know how to define
these macros.  They don't!!!  */
 #undef ASM_APP_ON
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 

Re: Add value range support into memcpy/memset expansion

2013-09-28 Thread Jan Hubicka
 Nice extension. Test cases would be great to have.
For those you need i386 changes to actually use the info.  I will post that
after some cleanup and additional testing.

Honza


RFA [testsuite]: New ARC target specific tests

2013-09-28 Thread Joern Rennecke

This patch adds a number of tests for ARC target specific options.

I'm a bit uncertain whether I still need approval for this patch.
On the one hand, the changes are all in an area that is normally within
the remit of a target maintainer, and the patch to add the gcc.target/arc
directory has already been accepted.
OTOH, the main body of the ARC port is still stuck waiting for review,
so I'm still in the weird position of a target maintainer without an
accepted target port.

2013-09-28  Simon Cook  simon.c...@embecosm.com
Joern Rennecke  joern.renne...@embecosm.com

* gcc.target/arc/barrel-shifter-1.c: New test.
* gcc.target/arc/barrel-shifter-2.c: Likewise.
* gcc.target/arc/long-calls.c, gcc.target/arc/mA6.c: Likewise.
* gcc.target/arc/mA7.c, gcc.target/arc/mARC600.c: Likewise.
* gcc.target/arc/mARC601.c, gcc.target/arc/mARC700.c: Likewise.
* gcc.target/arc/mcpu-arc600.c, gcc.target/arc/mcpu-arc601.c: Likewise.
* gcc.target/arc/mcpu-arc700.c, gcc.target/arc/mcrc.c: Likewise.
* gcc.target/arc/mdpfp.c, gcc.target/arc/mdsp-packa.c: Likewise.
* gcc.target/arc/mdvbf.c, gcc.target/arc/mlock.c: Likewise.
* gcc.target/arc/mmac-24.c, gcc.target/arc/mmac-d16.c: Likewise.
* gcc.target/arc/mno-crc.c, gcc.target/arc/mno-dsp-packa.c: Likewise.
* gcc.target/arc/mno-dvbf.c, gcc.target/arc/mno-lock.c: Likewise.
* gcc.target/arc/mno-mac-24.c, gcc.target/arc/mno-mac-d16.c: Likewise.
* gcc.target/arc/mno-rtsc.c, gcc.target/arc/mno-swape.c: Likewise.
* gcc.target/arc/mno-xy.c, gcc.target/arc/mrtsc.c: Likewise.
* gcc.target/arc/mspfp.c, gcc.target/arc/mswape.c: Likewise.
* gcc.target/arc/mtune-ARC600.c: Likewise.
* gcc.target/arc/mtune-ARC601.c: Likewise.
* gcc.target/arc/mtune-ARC700-xmac: Likewise.
* gcc.target/arc/mtune-ARC700.c: Likewise.
* gcc.target/arc/mtune-ARC725D.c: Likewise.
* gcc.target/arc/mtune-ARC750D.c: Likewise.
* gcc.target/arc/mul64.c, gcc.target/arc/mxy.c: Likewise.
* gcc.target/arc/no-dpfp-lrsr.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/arc/barrel-shifter-1.c 
b/gcc/testsuite/gcc.target/arc/barrel-shifter-1.c
new file mode 100644
index 000..a0eb6d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/barrel-shifter-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -mcpu=ARC601 -mbarrel-shifter } */
+int i;
+
+int f (void)
+{
+  i = 2;
+}
+
+/* { dg-final { scan-assembler asr_s } } */
diff --git a/gcc/testsuite/gcc.target/arc/barrel-shifter-2.c 
b/gcc/testsuite/gcc.target/arc/barrel-shifter-2.c
new file mode 100644
index 000..97998fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/barrel-shifter-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+int i;
+
+int f (void)
+{
+  i = 2;
+}
+
+/* { dg-final { scan-assembler asr_s } } */
diff --git a/gcc/testsuite/gcc.target/arc/long-calls.c 
b/gcc/testsuite/gcc.target/arc/long-calls.c
new file mode 100644
index 000..63fafbc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/long-calls.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -mlong-calls } */
+
+int g (void);
+
+int f (void)
+{
+g();
+}
+
+/* { dg-final { scan-assembler j @g } } */
diff --git a/gcc/testsuite/gcc.target/arc/mA6.c 
b/gcc/testsuite/gcc.target/arc/mA6.c
new file mode 100644
index 000..2e15a86
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/mA6.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options -mA6 } */
+
+/* { dg-final { scan-assembler .cpu ARC600 } } */
diff --git a/gcc/testsuite/gcc.target/arc/mA7.c 
b/gcc/testsuite/gcc.target/arc/mA7.c
new file mode 100644
index 000..c4430f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/mA7.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options -mA7 } */
+
+/* { dg-final { scan-assembler .cpu ARC700 } } */
diff --git a/gcc/testsuite/gcc.target/arc/mARC600.c 
b/gcc/testsuite/gcc.target/arc/mARC600.c
new file mode 100644
index 000..20e086a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/mARC600.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options -mARC600 } */
+
+/* { dg-final { scan-assembler .cpu ARC600 } } */
diff --git a/gcc/testsuite/gcc.target/arc/mARC601.c 
b/gcc/testsuite/gcc.target/arc/mARC601.c
new file mode 100644
index 000..1d30da4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/mARC601.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options -mARC601 } */
+
+/* { dg-final { scan-assembler .cpu ARC601 } } */
diff --git a/gcc/testsuite/gcc.target/arc/mARC700.c 
b/gcc/testsuite/gcc.target/arc/mARC700.c
new file mode 100644
index 000..43e9baa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/mARC700.c
@@ -0,0 +1,4 @@
+/* { dg-do compile } */
+/* { dg-options -mARC700 } */
+
+/* { dg-final { scan-assembler .cpu ARC700 } } */
diff --git a/gcc/testsuite/gcc.target/arc/mcpu-arc600.c 
b/gcc/testsuite/gcc.target/arc/mcpu-arc600.c
new file mode 100644
index 

Re: [Patch] Let ordinary escaping in POSIX regex be valid

2013-09-28 Thread Tim Shen
On Fri, Sep 27, 2013 at 4:30 PM, Paolo Carlini paolo.carl...@oracle.com wrote:
 Nah, only double check that the testcase you are un-xfail-ing uses 
 -std=gnu++11, otherwise will not pass ;)

Committed :)

Thanks!


-- 
Tim Shen


Ping^6: contribute Synopsys Designware ARC port

2013-09-28 Thread Joern Rennecke

The main part of the port (everything but the testsuite) is still waiting
for review:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00323.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00324.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00325.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00328.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01870.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02070.html

I've retested an i686-pc-linux-gnu native bootstrap as well as the obvious
arc-elf32 / arc-linux-uclibc builds in trunk r202981.


Re: [PATCH] Trivial cleanup

2013-09-28 Thread Andrew MacLeod

On 09/27/2013 01:03 AM, Jeff Law wrote:

On 09/26/2013 08:15 AM, Michael Matz wrote:

Hi,

On Wed, 25 Sep 2013, Jeff Law wrote:


I was going to bring it up at some point too.  My preference is
strongly to simply eliminate the space on methods...

Which wouldn't be so weird: in the libstdc++-v3 code we do it all the time.

Yea.  I actually reviewed the libstdc++ guidelines to see where they differed
from GNU's C guidelines.

I'm strongly in favor of dropping the horizontal whitespace between the
method name and its open paren when the result is then dereferenced.
ie foo.last()->e rather than foo.last ()->e.


I'd prefer to not write in this style at all, like Jakub.  If we must
absolutely have it, then I agree that the space before _empty_ parentheses
is ugly if followed by references.  I.e. I'd like to see spaces before
parens as is customary, except in one case: empty parens in the middle of
expressions (which don't happen very often right now in GCC, and hence
wouldn't introduce a coding style confusion):

do.this ();
give.that()->flag;
get.list (one)->clear ();

I'd prefer to not have further references to return values be applied,
though (as in, the parentheses should be the end of statement), which
would avoid the topic (at the expense of having to invent names for
those temporaries, or to write trivial wrapper methods contracting several
method calls).

Should we consider banning dereferencing the result of a method call
and instead prefer to use a more functional interface such as Jakub
has suggested, or have the result of the method call put into a
temporary and dereference the temporary?


I considered suggesting the latter.  I wouldn't be a huge fan of the
unnecessary temporaries, but they may be better than the horrid
foo.last()->argh()->e->src or whatever.


Stuffing the result into a temporary does have one advantage, it 
encourages us to CSE across the method calls in cases where the 
compiler might not be able to do so.  Of course, being humans, we'll 
probably mess it up.


jeff
I don't like the more functional interface... I thought the suggestion 
might be a little tongue in cheek, but wasn't sure :-)  I can't imagine 
the number of templates that would introduce... and the impact on 
compile/link time would probably not be trivial.


temps would be OK with me, but there are a couple of concerns.
 - I'd want to be able to declare the temps at the point of use, not 
the top of the function. this would actually help with clarity I think. 
Not sure what the current coding standard says about that.
 - the compiler better do an awesome job of sharing stack space for
user variables in a function... I wouldn't want to blow up the stack
with a bazillion unrelated temps each with their own location.


My example in this form would look something like:
int unsignedsrcp = ptrvar.type().type().type_unsigned();

...
GimpleType t1 = ptrvar.type ();
GimpleType t2 = t1.type ();
int unsignedsrcp = t2.type_unsigned ();

And yes, we'll probably introduce the odd human CSE error.. hopefully 
the test suite will catch them :-)


I think I still prefer matz's suggestion, but I could be on board with 
this one too.   some expressions are crazy complicated


Andrew


[PATCH][RFC] fix reload causing ICE in subreg_get_info on m68k (PR58369)

2013-09-28 Thread Mikael Pettersson
This patch fixes PR58369, an ICE in subreg_get_info when compiling
boost for m68k-linux.

choose_reload_regs attempts to reload a DFmode (8-byte) reg, finds
an XFmode (12-byte) reg in last_reg, and calls subreg_regno_offset
with these two modes and a subreg offset of zero.  However, this is
not a correct lowpart subreg offset for big-endian and these two modes,
so the lowpart subreg check in subreg_get_info fails, and the code
continues to

gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0);

which fails because (12 % 8) != 0.

choose_reload_regs passes the constant zero, in all cases where the reg
isn't already a subreg, as the subreg offset to subreg_regno_offset, even
though lowpart subregs on big-endian targets require an explicit offset
computation.  I think that is a bug.

I believe other big-endian targets don't see this ICE because
a) they define CANNOT_CHANGE_MODE_CLASS to reject differently-sized
   modes in floating-point registers (which prevents this path in
   choose_reload_regs), or
b) their differently-sized modes are such that the size of a larger
   mode is a whole multiple of the size of the smaller mode (which
   allows the gcc_assert above to pass).

This patch changes choose_reload_regs to call subreg_lowpart_offset
to pass an endian-correct offset to subreg_regno_offset, except where
the offset comes from a pre-existing subreg.
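
To make the offset arithmetic concrete, here is a minimal sketch under a
simplifying assumption: word and byte endianness agree.  The helper name
lowpart_offset is hypothetical and is not GCC's subreg_lowpart_offset, which
also handles targets where the two differ.

```c
/* Hypothetical, simplified model of a lowpart subreg byte offset.
   Assumes word and byte endianness agree; this is an illustration,
   not the GCC implementation.  */
unsigned
lowpart_offset (unsigned outer_size, unsigned inner_size, int big_endian)
{
  /* A same-size or wider outer mode starts at byte 0.  */
  if (outer_size >= inner_size)
    return 0;

  /* On big-endian the least significant bytes sit at the end of the
     inner value, so the lowpart starts inner_size - outer_size bytes in.
     On little-endian it starts at byte 0.  */
  return big_endian ? inner_size - outer_size : 0;
}
```

For the m68k case, lowpart_offset (8, 12, 1) gives 4 rather than the constant
0 that choose_reload_regs was passing, while the little-endian result stays 0.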

[Defining CANNOT_CHANGE_MODE_CLASS appropriately for m68k also fixes
the ICE, but I don't think the m68k backend really wants that, and I
think it just papers over a generic bug.]

Tested with trunk and 4.8 on {m68k,sparc64,powerpc64}-linux (big-endian),
and on x86_64-linux/armv5tel-linux-gnueabi (little-endian).  No regressions.

Comments?
Is this Ok for trunk?

gcc/

2013-09-28  Mikael Pettersson  mikpeli...@gmail.com

PR rtl-optimization/58369
* reload1.c (choose_reload_regs): Use subreg_lowpart_offset
to pass endian-correct lowpart offset to subreg_regno_offset.

--- gcc-4.9-20130922/gcc/reload1.c.~1~  2013-09-09 15:07:10.0 +0200
+++ gcc-4.9-20130922/gcc/reload1.c  2013-09-28 16:24:21.068294912 +0200
@@ -6497,6 +6497,7 @@ choose_reload_regs (struct insn_chain *c
  if (inheritance)
{
  int byte = 0;
+ bool byte_is_fixed = false;
  int regno = -1;
  enum machine_mode mode = VOIDmode;
 
@@ -6519,7 +6520,10 @@ choose_reload_regs (struct insn_chain *c
  if (regno < FIRST_PSEUDO_REGISTER)
regno = subreg_regno (rld[r].in_reg);
  else
-   byte = SUBREG_BYTE (rld[r].in_reg);
+   {
+ byte = SUBREG_BYTE (rld[r].in_reg);
+ byte_is_fixed = true;
+   }
  mode = GET_MODE (rld[r].in_reg);
}
 #ifdef AUTO_INC_DEC
@@ -6557,6 +6561,8 @@ choose_reload_regs (struct insn_chain *c
  rtx last_reg = reg_last_reload_reg[regno];
 
  i = REGNO (last_reg);
+ if (! byte_is_fixed)
+   byte = subreg_lowpart_offset (mode, GET_MODE (last_reg));
  i += subreg_regno_offset (i, GET_MODE (last_reg), byte, mode);
  last_class = REGNO_REG_CLASS (i);
 


Re: [PATCH] Relax the requirement of reduction pattern in GCC vectorizer.

2013-09-28 Thread Xinliang David Li
You can also add a test case of this form:

int foo (int t, int n, int *dst)
{
  int j = 0;
  int s = 1;
  t++;
  for (j = 0; j < n; j++)
    {
      dst[j] = t;
      s *= t;
    }

  return s;
}

where without the fix the loop vectorization is missed.

David

On Fri, Sep 27, 2013 at 6:28 PM, Cong Hou co...@google.com wrote:
 The current GCC vectorizer requires the following pattern as a simple
 reduction computation:

loop_header:
  a1 = phi < a0, a2 >
  a3 = ...
  a2 = operation (a3, a1)

 But a3 can also be defined outside of the loop. For example, the
 following loop can benefit from vectorization but the GCC vectorizer
 fails to vectorize it:


 int foo(int v)
 {
   int s = 1;
   ++v;
   for (int i = 0; i < 10; ++i)
     s *= v;
   return s;
 }


 This patch relaxes the original requirement by also considering the
 following pattern:


a3 = ...
loop_header:
  a1 = phi < a0, a2 >
  a2 = operation (a3, a1)


 A test case is also added. The patch is tested on x86-64.


 thanks,
 Cong

 

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 39c786e..45c1667 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,9 @@
 +2013-09-27  Cong Hou  co...@google.com
 +
 + * tree-vect-loop.c: Relax the requirement of the reduction
 + pattern so that one operand of the reduction operation can
 + come from outside of the loop.
 +
  2013-09-25  Tom Tromey  tro...@redhat.com

   * Makefile.in (PARTITION_H, LTO_SYMTAB_H, COMMON_TARGET_DEF_H)
 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
 index 09644d2..90496a2 100644
 --- a/gcc/testsuite/ChangeLog
 +++ b/gcc/testsuite/ChangeLog
 @@ -1,3 +1,7 @@
 +2013-09-27  Cong Hou  co...@google.com
 +
 + * gcc.dg/vect/vect-reduc-pattern-3.c: New test.
 +
  2013-09-25  Marek Polacek  pola...@redhat.com

   PR sanitizer/58413
 diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
 index 2871ba1..3c51c3b 100644
 --- a/gcc/tree-vect-loop.c
 +++ b/gcc/tree-vect-loop.c
 @@ -2091,6 +2091,13 @@ vect_is_slp_reduction (loop_vec_info loop_info,
 gimple phi, gimple first_stmt)
   a3 = ...
   a2 = operation (a3, a1)

 +   or
 +
 +   a3 = ...
 +   loop_header:
 + a1 = phi < a0, a2 >
 + a2 = operation (a3, a1)
 +
 such that:
 1. operation is commutative and associative and it is safe to
change the order of the computation (if CHECK_REDUCTION is true)
 @@ -2451,6 +2458,7 @@ vect_is_simple_reduction_1 (loop_vec_info
 loop_info, gimple phi,
    if (def2 && def2 == phi
        && (code == COND_EXPR
            || !def1 || gimple_nop_p (def1)
 +          || !flow_bb_inside_loop_p (loop, gimple_bb (def1))
            || (def1 && flow_bb_inside_loop_p (loop, gimple_bb (def1))
                && (is_gimple_assign (def1)
                    || is_gimple_call (def1)
 @@ -2469,6 +2477,7 @@ vect_is_simple_reduction_1 (loop_vec_info
 loop_info, gimple phi,
    if (def1 && def1 == phi
        && (code == COND_EXPR
            || !def2 || gimple_nop_p (def2)
 +          || !flow_bb_inside_loop_p (loop, gimple_bb (def2))
            || (def2 && flow_bb_inside_loop_p (loop, gimple_bb (def2))
                && (is_gimple_assign (def2)
                    || is_gimple_call (def2)
 diff --git gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
 gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
 new file mode 100644
 index 000..06a9416
 --- /dev/null
 +++ gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
 @@ -0,0 +1,41 @@
 +/* { dg-require-effective-target vect_int } */
 +
 +#include <stdarg.h>
 +#include "tree-vect.h"
 +
 +#define N 10
 +#define RES 1024
 +
 +/* A reduction pattern in which there is no data ref in
 +   the loop and one operand is defined outside of the loop.  */
 +
 +__attribute__ ((noinline)) int
 +foo (int v)
 +{
 +  int i;
 +  int result = 1;
 +
 +  ++v;
 +  for (i = 0; i < N; i++)
 +    result *= v;
 +
 +  return result;
 +}
 +
 +int
 +main (void)
 +{
 +  int res;
 +
 +  check_vect ();
 +
 +  res = foo (1);
 +  if (res != RES)
 +abort ();
 +
 +  return 0;
 +}
 +
 +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 +/* { dg-final { cleanup-tree-dump "vect" } } */
 +


Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.

2013-09-28 Thread Joseph S. Myers
On Sat, 28 Sep 2013, Iain Sandoe wrote:

   * config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o):
   Use COMPILE and POSTCOMPILE.
   * config/x-darwin (host-darwin.o): Likewise.
   * config/i386/x-darwin (host-i386-darwin.o): Likewise.
   * config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
   * config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.

Do you need these compilation rules at all?  Or could you change 
config.host to use paths such as config/host-darwin.o rather than just 
host-darwin.o, and so allow the generic rules to be used (my understanding 
was that the auto-deps patch series made lots of such changes to the 
locations of .o files in the build tree to avoid needing special 
compilation rules for particular files)?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.

2013-09-28 Thread Iain Sandoe

On 28 Sep 2013, at 17:40, Joseph S. Myers wrote:

 On Sat, 28 Sep 2013, Iain Sandoe wrote:
 
  * config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o):
  Use COMPILE and POSTCOMPILE.
  * config/x-darwin (host-darwin.o): Likewise.
  * config/i386/x-darwin (host-i386-darwin.o): Likewise.
  * config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
  * config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.
 
 Do you need these compilation rules at all?  Or could you change 
 config.host to use paths such as config/host-darwin.o rather than just 
 host-darwin.o, and so allow the generic rules to be used (my understanding 
 was that the auto-deps patch series made lots of such changes to the 
 locations of .o files in the build tree to avoid needing special 
 compilation rules for particular files)?

Good idea, I'll investigate this.
Iain



[PATCH] Fix bootstrap with java on multiarch systems

2013-09-28 Thread John David Anglin

OK to backport the attached change to 4.7 and 4.8?

Dave
--
John David Anglin   dave.ang...@bell.net

2013-09-28  John David Anglin  dang...@gcc.gnu.org

PR driver/58505
Backport from mainline:
2013-05-22  Matthias Klose  d...@ubuntu.com

* jvspec.c (jvgenmain_spec): Add %I to cc1 call.

Index: jvspec.c
===
--- jvspec.c	(revision 202859)
+++ jvspec.c	(working copy)
@@ -59,7 +59,7 @@
   "jvgenmain %{findirect-dispatch} %{D*} %b %m.i |\n\
 cc1 %m.i %1 \
   %{!Q:-quiet} -dumpbase %b.c %{d*} %{m*}\
-  %{g*} %{O*} \
+  %{g*} %{O*} %I \
   %{v:-version} %{pg:-p} %{p}\
   %<fbounds-check %<fno-bounds-check\
   %<fassume-compiled* %<fno-assume-compiled*\


Go patch committed: Avoid useless knockon errors for _

2013-09-28 Thread Ian Lance Taylor
This patch to the Go compiler avoids useless knockon errors for invalid
uses of the blank identifier _.  I added a simple general facility for
erroneous names although it is currently only used for _.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline
and 4.8 branch.
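
For readers who want the idea without the Go frontend around it, here is a
rough C analogue of the marker-name scheme.  It is only a sketch: the function
names and the caller-supplied buffer are assumptions made for the
illustration, and the patch itself uses std::string.

```c
#include <stdio.h>
#include <string.h>

/* Prefix shared by every marker name.  */
#define ERRONEOUS_PREFIX "$erroneous"

/* Counter that keeps successive marker names distinct.  */
static int erroneous_count;

/* Hypothetical helper: write a fresh marker name into BUF and return it.
   Call this only after an error has already been reported.  */
const char *
make_erroneous_name (char *buf, size_t size)
{
  snprintf (buf, size, ERRONEOUS_PREFIX "%d", erroneous_count);
  ++erroneous_count;
  return buf;
}

/* Return nonzero if NAME starts with the marker prefix, meaning later
   diagnostics about it should be suppressed.  */
int
is_erroneous_name (const char *name)
{
  return strncmp (name, ERRONEOUS_PREFIX, strlen (ERRONEOUS_PREFIX)) == 0;
}
```

A later pass that would otherwise complain about the name checks
is_erroneous_name first and stays quiet if it returns nonzero, which is what
the types.cc hunk does.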

Ian

diff -r a8b1cc175cc1 go/gogo-tree.cc
--- a/go/gogo-tree.cc	Fri Sep 27 15:11:52 2013 -0700
+++ b/go/gogo-tree.cc	Sat Sep 28 13:21:19 2013 -0700
@@ -1061,6 +1061,12 @@
   if (this->tree_ != NULL_TREE)
     return this->tree_;
 
+  if (Gogo::is_erroneous_name(this->name_))
+    {
+      this->tree_ = error_mark_node;
+      return error_mark_node;
+    }
+
   tree name;
   if (this->classification_ == NAMED_OBJECT_TYPE)
     name = NULL_TREE;
diff -r a8b1cc175cc1 go/gogo.cc
--- a/go/gogo.cc	Fri Sep 27 15:11:52 2013 -0700
+++ b/go/gogo.cc	Sat Sep 28 13:21:19 2013 -0700
@@ -1192,6 +1192,27 @@
   this->interface_types_.push_back(itype);
 }
 
+// Return an erroneous name that indicates that an error has already
+// been reported.
+
+std::string
+Gogo::erroneous_name()
+{
+  static int erroneous_count;
+  char name[50];
+  snprintf(name, sizeof name, "$erroneous%d", erroneous_count);
+  ++erroneous_count;
+  return name;
+}
+
+// Return whether a name is an erroneous name.
+
+bool
+Gogo::is_erroneous_name(const std::string& name)
+{
+  return name.compare(0, 10, "$erroneous") == 0;
+}
+
 // Return a name for a thunk object.
 
 std::string
diff -r a8b1cc175cc1 go/gogo.h
--- a/go/gogo.h	Fri Sep 27 15:11:52 2013 -0700
+++ b/go/gogo.h	Sat Sep 28 13:21:19 2013 -0700
@@ -387,6 +387,16 @@
   void
   mark_locals_used();
 
+  // Return a name to use for an error case.  This should only be used
+  // after reporting an error, and is used to avoid useless knockon
+  // errors.
+  static std::string
+  erroneous_name();
+
+  // Return whether the name indicates an error.
+  static bool
+  is_erroneous_name(const std::string&);
+
   // Return a name to use for a thunk function.  A thunk function is
   // one we create during the compilation, for a go statement or a
   // defer statement or a method expression.
diff -r a8b1cc175cc1 go/parse.cc
--- a/go/parse.cc	Fri Sep 27 15:11:52 2013 -0700
+++ b/go/parse.cc	Sat Sep 28 13:21:19 2013 -0700
@@ -213,7 +213,7 @@
   if (name == "_")
     {
       error_at(this->location(), "invalid use of %<_%>");
-      name = "blank";
+      name = Gogo::erroneous_name();
     }
 
   if (package->name() == this->gogo_->package_name())
@@ -3104,7 +3104,7 @@
       if (token->identifier() == "_")
 	{
 	  error_at(this->location(), "invalid use of %<_%>");
-	  name = this->gogo_->pack_hidden_name("blank", false);
+	  name = Gogo::erroneous_name();
 	}
       this->advance_token();
       return Expression::make_selector(left, name, location);
@@ -4929,7 +4929,7 @@
 	    {
 	      error_at(recv_var_loc,
 		       "no new variables on left side of %<:=%>");
-	      recv_var = "blank";
+	      recv_var = Gogo::erroneous_name();
 	    }
 	  *is_send = false;
 	  *varname = gogo->pack_hidden_name(recv_var, is_rv_exported);
@@ -4965,7 +4965,7 @@
 		{
 		  error_at(recv_var_loc,
 			   "no new variables on left side of %<:=%>");
-		  recv_var = "blank";
+		  recv_var = Gogo::erroneous_name();
 		}
 	      *is_send = false;
 	      if (recv_var != "_")
@@ -5502,7 +5502,7 @@
 	  if (name == "_")
 	    {
 	      error_at(this->location(), "invalid package name _");
-	      name = "blank";
+	      name = Gogo::erroneous_name();
 	    }
 	  this->advance_token();
 	}
diff -r a8b1cc175cc1 go/types.cc
--- a/go/types.cc	Fri Sep 27 15:11:52 2013 -0700
+++ b/go/types.cc	Sat Sep 28 13:21:19 2013 -0700
@@ -9269,7 +9269,11 @@
 }
   else
 {
-  if (!ambig1.empty())
+  if (Gogo::is_erroneous_name(name))
+	{
+	  // An error was already reported.
+	}
+  else if (!ambig1.empty())
 	error_at(location, "%qs is ambiguous via %qs and %qs",
 		 Gogo::message_name(name).c_str(), ambig1.c_str(),
 		 ambig2.c_str());


Re: Remove algo logic duplication Round 3

2013-09-28 Thread François Dumont

On 09/28/2013 02:45 AM, Paolo Carlini wrote:
.. by the way, in the current stl_algo* I'm still seeing many, many, 
functions which should be inline not declared as such: each function 
which has a few __glibcxx_requires* at the beginning (which normally 
boil down to nothing) and then forwards to a std::__* helper should be 
inline.




Fixed with the attached patch tested under Linux x86_64.

I also got your remark about the open round bracket; I didn't know that 
"round bracket" was another name for a parenthesis!  I also fixed the one 
you pointed out to me; I will be more careful next time.


2013-09-28  François Dumont  fdum...@gcc.gnu.org

* include/bits/stl_algo.h (remove_copy, remove_copy_if): Declare
inline.
(rotate_copy, stable_partition, partial_sort_copy): Likewise.
(lower_bound, upper_bound, equal_range, inplace_merge): Likewise.
(includes, next_permutation, prev_permutation): Likewise.
(replace_copy, replace_copy_if, is_sorted_until): Likewise.
(minmax_element, is_permutation, adjacent_find): Likewise.
(count, count_if, search, search_n, merge): Likewise.
(set_intersection, set_difference): Likewise.
(set_symmetric_difference, min_element, max_element): Likewise.
* include/bits/stl_algobase.h (lower_bound): Likewise.
(lexicographical_compare, mismatch): Likewise.

I consider it trivial enough to commit it.

François

Index: include/bits/stl_algo.h
===
--- include/bits/stl_algo.h	(revision 203005)
+++ include/bits/stl_algo.h	(working copy)
@@ -661,7 +661,7 @@
*  are copied is unchanged.
   */
   template<typename _InputIterator, typename _OutputIterator, typename _Tp>
-    _OutputIterator
+    inline _OutputIterator
     remove_copy(_InputIterator __first, _InputIterator __last,
 		_OutputIterator __result, const _Tp& __value)
 {
@@ -694,7 +694,7 @@
   */
   template<typename _InputIterator, typename _OutputIterator,
 	   typename _Predicate>
-_OutputIterator
+inline _OutputIterator
 remove_copy_if(_InputIterator __first, _InputIterator __last,
 		   _OutputIterator __result, _Predicate __pred)
 {
@@ -1414,9 +1414,8 @@
   __glibcxx_requires_valid_range(__first, __middle);
   __glibcxx_requires_valid_range(__middle, __last);
 
-  typedef typename iterator_traits<_ForwardIterator>::iterator_category
-	_IterType;
-  std::__rotate(__first, __middle, __last, _IterType());
+  std::__rotate(__first, __middle, __last,
+		std::__iterator_category(__first));
 }
 
   /**
@@ -1440,7 +1439,7 @@
*  for each @p n in the range @p [0,__last-__first).
   */
   template<typename _ForwardIterator, typename _OutputIterator>
-_OutputIterator
+inline _OutputIterator
 rotate_copy(_ForwardIterator __first, _ForwardIterator __middle,
 _ForwardIterator __last, _OutputIterator __result)
 {
@@ -1647,7 +1646,7 @@
*  relative ordering after calling @p stable_partition().
   */
   template<typename _ForwardIterator, typename _Predicate>
-_ForwardIterator
+inline _ForwardIterator
 stable_partition(_ForwardIterator __first, _ForwardIterator __last,
 		 _Predicate __pred)
 {
@@ -1733,7 +1732,7 @@
*  The value returned is @p __result_first+N.
   */
   template<typename _InputIterator, typename _RandomAccessIterator>
-_RandomAccessIterator
+inline _RandomAccessIterator
 partial_sort_copy(_InputIterator __first, _InputIterator __last,
 		  _RandomAccessIterator __result_first,
 		  _RandomAccessIterator __result_last)
@@ -1782,7 +1781,7 @@
   */
   template<typename _InputIterator, typename _RandomAccessIterator,
 	   typename _Compare>
-_RandomAccessIterator
+inline _RandomAccessIterator
 partial_sort_copy(_InputIterator __first, _InputIterator __last,
 		  _RandomAccessIterator __result_first,
 		  _RandomAccessIterator __result_last,
@@ -2016,7 +2015,7 @@
*  the function used for the initial sort.
   */
   template<typename _ForwardIterator, typename _Tp, typename _Compare>
-    _ForwardIterator
+    inline _ForwardIterator
     lower_bound(_ForwardIterator __first, _ForwardIterator __last,
 		const _Tp& __val, _Compare __comp)
 {
@@ -2073,7 +2072,7 @@
*  @ingroup binary_search_algorithms
   */
   template<typename _ForwardIterator, typename _Tp>
-    _ForwardIterator
+    inline _ForwardIterator
     upper_bound(_ForwardIterator __first, _ForwardIterator __last,
 		const _Tp& __val)
 {
@@ -2105,7 +2104,7 @@
*  the function used for the initial sort.
   */
   template<typename _ForwardIterator, typename _Tp, typename _Compare>
-    _ForwardIterator
+    inline _ForwardIterator
     upper_bound(_ForwardIterator __first, _ForwardIterator __last,
 		const _Tp& __val, _Compare __comp)
 {
@@ -2179,7 +2178,7 @@
*  but does not actually call those functions.
   */
   template<typename _ForwardIterator, typename _Tp>
-    pair<_ForwardIterator, _ForwardIterator>
+inline 

Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.

2013-09-28 Thread Iain Sandoe
Hello Joseph,

On 28 Sep 2013, at 17:44, Iain Sandoe wrote:
 On 28 Sep 2013, at 17:40, Joseph S. Myers wrote:
 
 On Sat, 28 Sep 2013, Iain Sandoe wrote:
 
 * config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o):
 Use COMPILE and POSTCOMPILE.
 * config/x-darwin (host-darwin.o): Likewise.
 * config/i386/x-darwin (host-i386-darwin.o): Likewise.
 * config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
 * config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.
 
 Do you need these compilation rules at all?  Or could you change 
 config.host to use paths such as config/host-darwin.o rather than just 
 host-darwin.o, and so allow the generic rules to be used (my understanding 
 was that the auto-deps patch series made lots of such changes to the 
 locations of .o files in the build tree to avoid needing special 
 compilation rules for particular files)?
 
 Good idea, I'll investigate this.

I had a look at this, and it seems like a useful objective.  However, unless 
I'm missing a step, [following the template of config.gcc:out_file] it seems to 
require a fair amount of modification (introduction of common-object 
placeholders etc. in the configury and Makefile.in), plus application and 
testing of this on multiple targets.  Not something I can realistically 
volunteer to do in the immediate future.

Therefore, I'm going to suggest keeping this patch 'as is' and following up 
later, when there is more time available, with a patch for the other 
modification.
Iain




Re: Add value range support into memcpy/memset expansion

2013-09-28 Thread Jan Hubicka
  Nice extension. Test cases would be great to have.
 For those you need i386 changes to actually use the info.  I will post that
 after some cleanup and additional testing.

Hi,
since I already caught your attention, here is the target specific part for
comments.

This patch implements memcpy/memset prologues and epilogues as suggested by
Ondrej Bilka.  His glibc implementation uses, IMO, a very smart trick: a single
misaligned move each to copy the first N and the last N bytes of the block.
The remainder of the block is then copied by the usual loop, which gets aligned
to the proper address.

This leads to a partial memory stall, but that is handled well by modern x86
chips.
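
As a rough illustration of the trick in C (a sketch only, not the expander
code): the function name copy_block is hypothetical, and it assumes N is 8,
the block is at least 8 bytes long, and the buffers do not overlap.  The
memcpy calls of a fixed 8 bytes stand in for single unaligned 8-byte moves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the head/tail trick.
   Assumes n >= 8 and non-overlapping buffers.  */
void
copy_block (char *dst, const char *src, size_t n)
{
  uint64_t head, tail;

  /* Two possibly misaligned 8-byte moves cover the first and the
     last 8 bytes of the block.  */
  memcpy (&head, src, 8);
  memcpy (&tail, src + n - 8, 8);
  memcpy (dst, &head, 8);
  memcpy (dst + n - 8, &tail, 8);

  /* Round dst + 8 down to an 8-byte boundary.  The 1 to 8 bytes that
     are skipped were already written by the head move.  */
  char *aligned = (char *) (((uintptr_t) dst + 8) & ~(uintptr_t) 7);
  size_t skip = (size_t) (aligned - dst);
  const char *s = src + skip;

  /* Copy whole 8-byte chunks to the aligned destination; any leftover
     bytes at the end were already written by the tail move.  */
  size_t chunks = (n - skip) / 8;
  for (size_t i = 0; i < chunks; i++)
    {
      uint64_t w;
      memcpy (&w, s + 8 * i, 8);
      memcpy (aligned + 8 * i, &w, 8);
    }
}
```

The destination ends up aligned for the chunk loop while the source in
general does not, and some bytes are written twice; that overlap is the price
paid for having no branches in the prologue and epilogue.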

For example in the following testcase:
char *a;
char *b;
t1()
{
  memcpy (a,b,140);
}

We now produce:
	movq	b(%rip), %rsi
	movq	a(%rip), %rcx
	movq	(%rsi), %rax	<- first 8 bytes are moved
	leaq	8(%rcx), %rdi
	andq	$-8, %rdi	<- dest is aligned
	movq	%rax, (%rcx)
	movq	132(%rsi), %rax	<- last 8 bytes are moved
	movq	%rax, 132(%rcx)
	subq	%rdi, %rcx	<- alignment is subtracted from count
	subq	%rcx, %rsi	<- source is aligned
	addl	$140, %ecx	<- normal copying of 8 byte chunks
	shrl	$3, %ecx
	rep; movsq
	ret

Instead of:
	movq	a(%rip), %rdi
	movq	b(%rip), %rsi
	movl	$140, %eax
	testb	$1, %dil
	jne	.L28
	testb	$2, %dil
	jne	.L29
.L3:
	testb	$4, %dil
	jne	.L30
.L4:
	movl	%eax, %ecx
	xorl	%edx, %edx
	shrl	$3, %ecx
	testb	$4, %al
	rep movsq
	je	.L5
	movl	(%rsi), %edx
	movl	%edx, (%rdi)
	movl	$4, %edx
.L5:
	testb	$2, %al
	je	.L6
	movzwl	(%rsi,%rdx), %ecx
	movw	%cx, (%rdi,%rdx)
	addq	$2, %rdx
.L6:
	testb	$1, %al
	je	.L8
	movzbl	(%rsi,%rdx), %eax
	movb	%al, (%rdi,%rdx)
.L8:
	rep
	ret
	.p2align 4,,10
	.p2align 3
.L28:
	movzbl	(%rsi), %eax
	addq	$1, %rsi
	movb	%al, (%rdi)
	addq	$1, %rdi
	movl	$139, %eax
	testb	$2, %dil
	je	.L3
	.p2align 4,,10
	.p2align 3
.L29:
	movzwl	(%rsi), %edx
	subl	$2, %eax
	addq	$2, %rsi
	movw	%dx, (%rdi)
	addq	$2, %rdi
	testb	$4, %dil
	je	.L4
	.p2align 4,,10
	.p2align 3
.L30:
	movl	(%rsi), %edx
	subl	$4, %eax
	addq	$4, %rsi
	movl	%edx, (%rdi)
	addq	$4, %rdi
	jmp	.L4

With the proposed value range code we can now take advantage of it even for
non-constant moves.  Somewhat artificial testcase:

char *p,*q;
t(unsigned int a)
{
  if (a>8 && a<100)
    memcpy(q,p,a);

}

We still generate pretty much the same code (while the -minline-all-stringops
code on mainline is just horrible):
	leal	-9(%rdi), %edx
	movl	%edi, %eax
	cmpl	$90, %edx
	jbe	.L5
	rep; ret
	.p2align 4,,10
	.p2align 3
.L5:
	movq	p(%rip), %rsi
	movq	q(%rip), %rcx
	movq	(%rsi), %rdx
	movq	%rdx, (%rcx)
	movl	%edi, %edx
	movq	-8(%rsi,%rdx), %rdi
	movq	%rdi, -8(%rcx,%rdx)
	leaq	8(%rcx), %rdi
	andq	$-8, %rdi
	subq	%rdi, %rcx
	subq	%rcx, %rsi
	addl	%eax, %ecx
	shrl	$3, %ecx
	rep; movsq
	ret

Of course it is quite common to know only an upper bound on the block.  In this
case we need to generate a prologue for the first few bytes:
char *p,*q;
t(unsigned int a)
{
  if (a<100)
    memcpy(q,p,a);

}
t:
.LFB0:
	.cfi_startproc
	cmpl	$99, %edi
	jbe	.L15
.L7:
	rep; ret
	.p2align 4,,10
	.p2align 3
.L15:
	cmpl	$8, %edi
	movq	q(%rip), %rdx
	movq	p(%rip), %rsi
	jae	.L3
	testb	$4, %dil
	jne	.L16
	testl	%edi, %edi
	je	.L7
	movzbl	(%rsi), %eax
	testb	$2, %dil
	movb	%al, (%rdx)
	je	.L7
	movl	%edi, %edi
	movzwl	-2(%rsi,%rdi), %eax
	movw	%ax, -2(%rdx,%rdi)
	ret
	.p2align 4,,10
	.p2align 3
.L3:
	movq	(%rsi), %rax
	movq	%rax, (%rdx)
	movl	%edi, %eax
	movq	-8(%rsi,%rax), %rcx
	movq	%rcx, -8(%rdx,%rax)
	leaq	8(%rdx), %rax
	andq	$-8, %rax
	subq	%rax, %rdx
	addl	%edx, %edi
	subq	%rdx, %rsi
	shrl	$3, %edi
	movl	%edi, %ecx
	movq	%rax, %rdi
	rep; movsq
	ret
	.p2align 4,,10
	.p2align 3
.L16:
	movl	(%rsi), %eax
	movl	%edi, %edi
	movl	%eax, (%rdx)
	movl	-4(%rsi,%rdi), %eax
movl  

Re: Add value range support into memcpy/memset expansion

2013-09-28 Thread Xinliang David Li
On Sat, Sep 28, 2013 at 3:05 PM, Jan Hubicka hubi...@ucw.cz wrote:
  Nice extension. Test cases would be great to have.
 For those you need i386 changes to actually use the info.  I will post that
 after some cleanup and additional testing.

 Hi,
 since I already caught your attention, here is the target specific part for
 comments.

 this patch implements memcpy/memset prologues and epilogues as suggested by
 Ondrej Bilka.  His glibc implementation use IMO very smart trick with single
 misaligned move to copy first N and last N bytes of the block.  The remainder
 of the block is then copied by the usual loop that gets aligned to the proper
 address.

 This leads to partial memory stall, but that is handled well by modern x86
 chips.

 For example in the following testcase:
 char *a;
 char *b;
 t1()
 {
   memcpy (a,b,140);
 }

 We now produce:
 movqb(%rip), %rsi
 movqa(%rip), %rcx
 movq(%rsi), %rax - first 8 bytes are moved
 leaq8(%rcx), %rdi
 andq$-8, %rdi   - dest is aligned
 movq%rax, (%rcx)
 movq132(%rsi), %rax  - last 8 bytes are moved
 movq%rax, 132(%rcx)
 subq%rdi, %rcx  - alignment is subtracted from count

 subq%rcx, %rsi  - source is aligned

This (source aligned) is not always true, but nevertheless the
sequence is very tight.

 addl$140, %ecx  - normal copying of 8 byte chunks
 shrl$3, %ecx
 rep; movsq
 ret

 Of course it is quite common to know only upper bound on the block.  In this 
 case
 we need to generate prologue for first few bytes:
 char *p,*q;
 t(unsigned int a)
 {
   if (a<100)
 memcpy(q,p,a);

 }
 t:
 .LFB0:
 .cfi_startproc
 cmpl$99, %edi
 jbe .L15
 .L7:
 rep; ret
 .p2align 4,,10
 .p2align 3
 .L15:
 cmpl$8, %edi
 movqq(%rip), %rdx
 movqp(%rip), %rsi
 jae .L3
 testb   $4, %dil
 jne .L16
 testl   %edi, %edi
 je  .L7
 movzbl  (%rsi), %eax
 testb   $2, %dil
 movb%al, (%rdx)
 je  .L7
 movl%edi, %edi
 movzwl  -2(%rsi,%rdi), %eax
 movw%ax, -2(%rdx,%rdi)
 ret
 .p2align 4,,10
 .p2align 3
 .L3:
 movq(%rsi), %rax
 movq%rax, (%rdx)
 movl%edi, %eax
 movq-8(%rsi,%rax), %rcx
 movq%rcx, -8(%rdx,%rax)
 leaq8(%rdx), %rax
 andq$-8, %rax
 subq%rax, %rdx
 addl%edx, %edi
 subq%rdx, %rsi
 shrl$3, %edi
 movl%edi, %ecx
 movq%rax, %rdi
 rep; movsq
 ret
 .p2align 4,,10
 .p2align 3
 .L16:
 movl(%rsi), %eax
 movl%edi, %edi
 movl%eax, (%rdx)
 movl-4(%rsi,%rdi), %eax
 movl%eax, -4(%rdx,%rdi)
 ret
 .cfi_endproc
 .LFE0:

 Mainline would output a libcall here (because size is unknown to it) and with
 inlining all stringops it winds up 210 bytes of code instead of 142 bytes
 above.

 Unfortunately the following testcase:
 char *p,*q;
 t(int a)
 {
   if (a<100)
     memcpy(q,p,a);

 }
 won't get inlined.  This is because A is known to be smaller than 100, which
 results in an anti-range after conversion to size_t.  This anti-range allows very
 large values (above INT_MAX) and thus we do not know the block size.
 I am not sure if the sane range can be recovered somehow.  If not, maybe
 this is common enough to add support for a probable upper bound parameter to
 the template.

Do we know if there is real code that intentionally does that other
than security flaws as result of improperly done range check?

I think by default GCC should assume the memcpy size range is (0, 100)
here with perhaps an option to override it.
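
To spell out why the signed check leaves such large values possible, here is
a small sketch.  The function name is made up for the illustration; the
conversion shown is the ordinary int to size_t conversion that the length
argument of memcpy undergoes.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a signed length A passes the
   "a < 100" check yet names a block larger than INT_MAX once it is
   converted to size_t, and 0 otherwise.  */
int
passes_check_but_is_huge (int a)
{
  if (a < 100)
    {
      /* Same conversion the memcpy length argument goes through.  */
      size_t len = (size_t) a;
      return len > (size_t) INT_MAX;
    }
  return 0;
}
```

Any negative A passes the check and converts to a value near SIZE_MAX, so the
range seen after conversion is 0 to 99 plus everything above INT_MAX, which
is the anti-range described above.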

thanks,

David


 Use of value ranges makes it harder to choose the proper algorithm since the
 average size is no longer known.  For the moment I take the simple average of
 the lower and upper bound, but this is wrong.

 Libcall starts to win only for pretty large blocks (over 4GB definitely) so
 it makes sense to inline functions with range 0...4096 even though the cost
 tables tell us to expand a libcall for everything bigger than 140 bytes: if
 blocks are small we will get a noticeable win and if blocks are big, we won't
 lose much.

 I am considering assigning value ranges to the algorithms, too, for more sane
 choices in decide_alg.

 I also think the misaligned move trick can/should be performed by
 move_by_pieces and we ought to consider sane use of SSE - the current vector_loop
 with an unrolling factor of 4 seems a bit extreme.  At least Bulldozer is happy with
 2 and I would expect SSE moves to be especially useful for moving blocks with
 known size where they are not used at all.

 Currently I disabled misaligned move prologues/epilogues for Michael's vector