[Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.
Hi,

This updates the Darwin port {t,x}-* fragments after the switch to auto-deps (thanks Tom!).

Bootstrapped (all langs, incl. Ada) on
  i686-darwin9 (bootstrap=gcc-4.8),
  i686-darwin10, x86_64-darwin11, x86_64-darwin12 (bootstrap=recent trunk),
  powerpc-darwin9 (c,c++,lto,objc,fortran) (bootstrap=gcc-4.0.1).

OK?
Iain

gcc:
	* config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o):
	Use COMPILE and POSTCOMPILE.
	* config/x-darwin (host-darwin.o): Likewise.
	* config/i386/x-darwin (host-i386-darwin.o): Likewise.
	* config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
	* config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.

diff --git a/gcc/config/i386/x-darwin b/gcc/config/i386/x-darwin
index f0196ba..4967d69 100644
--- a/gcc/config/i386/x-darwin
+++ b/gcc/config/i386/x-darwin
@@ -1,4 +1,3 @@
-host-i386-darwin.o : $(srcdir)/config/i386/host-i386-darwin.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h hosthooks.h $(HOSTHOOKS_DEF_H) \
-  config/host-darwin.h
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+host-i386-darwin.o : $(srcdir)/config/i386/host-i386-darwin.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
diff --git a/gcc/config/rs6000/x-darwin b/gcc/config/rs6000/x-darwin
index 5672c69..9d92ef5 100644
--- a/gcc/config/rs6000/x-darwin
+++ b/gcc/config/rs6000/x-darwin
@@ -1,5 +1,3 @@
-host-ppc-darwin.o : $(srcdir)/config/rs6000/host-darwin.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h hosthooks.h $(HOSTHOOKS_DEF_H) toplev.h \
-  config/host-darwin.h $(DIAGNOSTIC_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
-		$(INCLUDES) $< -o $@
+host-ppc-darwin.o : $(srcdir)/config/rs6000/host-darwin.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
diff --git a/gcc/config/rs6000/x-darwin64 b/gcc/config/rs6000/x-darwin64
index 921d555..0932771 100644
--- a/gcc/config/rs6000/x-darwin64
+++ b/gcc/config/rs6000/x-darwin64
@@ -1,5 +1,3 @@
-host-ppc64-darwin.o : $(srcdir)/config/rs6000/host-ppc64-darwin.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h hosthooks.h $(HOSTHOOKS_DEF_H) toplev.h \
-  config/host-darwin.h $(DIAGNOSTIC_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
-		$(INCLUDES) $< -o $@
+host-ppc64-darwin.o : $(srcdir)/config/rs6000/host-ppc64-darwin.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
diff --git a/gcc/config/t-darwin b/gcc/config/t-darwin
index fdd52c2..87d5df7 100644
--- a/gcc/config/t-darwin
+++ b/gcc/config/t-darwin
@@ -18,25 +18,19 @@
 
 TM_H += $(srcdir)/config/darwin-sections.def
 
-darwin.o: $(srcdir)/config/darwin.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  $(TM_H) $(RTL_H) $(REGS_H) hard-reg-set.h $(REAL_H) insn-config.h \
-  conditions.h insn-flags.h output.h insn-attr.h flags.h $(TREE_H) expr.h \
-  reload.h function.h $(GGC_H) langhooks.h $(TARGET_H) $(TM_P_H) gt-darwin.h \
-  config/darwin-sections.def
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/darwin.c
+darwin.o: $(srcdir)/config/darwin.c config/darwin-sections.def
+	$(COMPILE) $<
+	$(POSTCOMPILE)
 
-darwin-c.o: $(srcdir)/config/darwin-c.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
-  $(TM_H) $(CPPLIB_H) $(TREE_H) $(C_PRAGMA_H) $(TM_P_H) \
-  incpath.h flags.h $(C_COMMON_H) $(C_TARGET_H) $(C_TARGET_DEF_H) $(CPP_INTERNAL_H)
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/darwin-c.c $(PREPROCESSOR_DEFINES)
+darwin-c.o: $(srcdir)/config/darwin-c.c
+	$(COMPILE) $(PREPROCESSOR_DEFINES) $<
+	$(POSTCOMPILE)
 
-darwin-f.o: $(srcdir)/config/darwin-f.c $(CONFIG_H) $(SYSTEM_H) coretypes.h
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/darwin-f.c $(PREPROCESSOR_DEFINES)
-darwin-driver.o: $(srcdir)/config/darwin-driver.c \
-  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(GCC_H) opts.h
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
-		$(srcdir)/config/darwin-driver.c
+darwin-f.o: $(srcdir)/config/darwin-f.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
+darwin-driver.o: $(srcdir)/config/darwin-driver.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
diff --git a/gcc/config/x-darwin b/gcc/config/x-darwin
index f671d91..c6226c0 100644
--- a/gcc/config/x-darwin
+++ b/gcc/config/x-darwin
@@ -1,3 +1,3 @@
-host-darwin.o : $(srcdir)/config/host-darwin.c $(CONFIG_H) $(SYSTEM_H) \
-  coretypes.h toplev.h config/host-darwin.h
-	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+host-darwin.o : $(srcdir)/config/host-darwin.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
[Patch, Darwin] improve cross-compiles.
I've been experimenting with the idea of building native crosses on my most capable machine, for the many variants of Darwin we now have, and then using the older/slower hardware for test only.

This has uncovered a few issues with cross/native cross flags etc.

This patch adjusts the mh-darwin fragment to ensure
(i) that PIE is disabled for gcc exes on Darwin hosts, since it is incompatible with the current PCH implementation.
(ii) that -mdynamic-no-pic is used for m32 hosts.
… for crosses as well as bootstraps (and, also, for stage1 compilations when bootstrapping on a Darwin host).

OK for trunk?
Iain

config:
	* mh-darwin (BOOT_CFLAGS): Only add -mdynamic-no-pic for m32 hosts.
	(STAGE1_CFLAGS, STAGE1_LDFLAGS): New.
	Fix over-length lines and amend comments.

diff --git a/config/mh-darwin b/config/mh-darwin
index 19bf265..a039f20 100644
--- a/config/mh-darwin
+++ b/config/mh-darwin
@@ -1,7 +1,18 @@
 # The -mdynamic-no-pic ensures that the compiler executable is built without
 # position-independent-code -- the usual default on Darwin. This fix speeds
 # compiles by 3-5%.
-BOOT_CFLAGS += -mdynamic-no-pic
+BOOT_CFLAGS += \
+`case ${host} in i?86-*-darwin* | powerpc-*-darwin*) \
+  echo -mdynamic-no-pic ;; esac;`
 
-# Ensure we don't try and use -pie, as it is incompatible with pch.
-BOOT_LDFLAGS += `case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`
+# ld on Darwin versions >= 10.7 defaults to PIE executables. Disable this for
+# gcc components, since it is incompatible with our pch implementation.
+BOOT_LDFLAGS += \
+`case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`
+
+# Similarly, for cross-compilation.
+STAGE1_CFLAGS += \
+`case ${host} in i?86-*-darwin* | powerpc-*-darwin*)\
+  echo -mdynamic-no-pic ;; esac;`
+STAGE1_LDFLAGS += \
+`case ${host} in *-*-darwin[1][1-9]*) echo -Wl,-no_pie ;; esac;`
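
As a rough illustration of which host triplets the two case patterns in the patch above select, here is a small C sketch using POSIX fnmatch(), whose glob syntax agrees with shell case patterns for simple patterns like these. The helper names and the sample triplets mentioned afterwards are assumptions made for the example; they are not part of the patch or of the GCC build machinery.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Hypothetical helper: true when HOST matches
     i?86-*-darwin* | powerpc-*-darwin*
   i.e. the 32-bit hosts that would get -mdynamic-no-pic.  */
bool
host_wants_dynamic_no_pic (const char *host)
{
  return fnmatch ("i?86-*-darwin*", host, 0) == 0
         || fnmatch ("powerpc-*-darwin*", host, 0) == 0;
}

/* Hypothetical helper: true when HOST matches
     *-*-darwin[1][1-9]*
   i.e. a Darwin version whose first two digits are 11 through 19,
   the hosts that would get -Wl,-no_pie.  */
bool
host_wants_no_pie (const char *host)
{
  return fnmatch ("*-*-darwin[1][1-9]*", host, 0) == 0;
}
```

Under this sketch a triplet such as i686-apple-darwin9 selects -mdynamic-no-pic but not -Wl,-no_pie, while x86_64-apple-darwin12 selects only -Wl,-no_pie. Note that darwin10 does not match the second pattern, because the bracket expression [1-9] excludes 0.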
[Patch, Darwin/ppc] Fix altivec dwarf reg sizes.
Hi!

We have this cunning legacy scheme to support unwinding on both G3 and G4/G5 processors. Effectively, we build some components without altivec support, and then test for its presence at runtime.

To do this we pretend that altivec is absent when building init_unwind, and therefore all the altivec regs get a default size of 1 for dwarf purposes.

This, naturally, breaks the dwarf unwinder for altivec cases (simd-3 and 4 fail, for example). I guess it didn't matter when originally authored, since STABS was the debug scheme then.

Anyway, after considerable debate about this and several approaches, here is a patch that just ensures we set the altivec register size to its correct value.

I've had this in my local tree for ~2 years ...

OK for trunk and open branches? (we're generating wrong code)
Iain

gcc:
	* config/rs6000/rs6000.c (rs6000_init_dwarf_reg_sizes_extra): Ensure
	that altivec registers are correctly sized on Darwin.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 7ff0af9..4e9a92b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -28992,6 +28992,27 @@ rs6000_init_dwarf_reg_sizes_extra (tree address)
           emit_move_insn (adjust_address (mem, mode, offset), value);
         }
     }
+
+  if (TARGET_MACHO && ! TARGET_ALTIVEC)
+    {
+      int i;
+      enum machine_mode mode = TYPE_MODE (char_type_node);
+      rtx addr = expand_expr (address, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+      rtx mem = gen_rtx_MEM (BLKmode, addr);
+      rtx value = gen_int_mode (16, mode);
+
+      /* On Darwin, libgcc may be built to run on both G3 and G4/5.
+         The unwinder still needs to know the size of Altivec registers.  */
+
+      for (i = FIRST_ALTIVEC_REGNO; i < LAST_ALTIVEC_REGNO+1; i++)
+        {
+          int column = DWARF_REG_TO_UNWIND_COLUMN (i);
+          HOST_WIDE_INT offset
+            = DWARF_FRAME_REGNUM (column) * GET_MODE_SIZE (mode);
+
+          emit_move_insn (adjust_address (mem, mode, offset), value);
+        }
+    }
 }
 
 /* Map internal gcc register numbers to DWARF2 register numbers.  */
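
To make the effect of the fix concrete, here is a toy C model of a per-register size table. It is only a sketch: the table length and the register numbers are made-up illustrative values (the real ones come from the rs6000 headers and are not restated in this thread), and setting every entry to 1 is a simplification of "registers the build believes are absent get the default size of 1".

```c
#include <stddef.h>

/* Illustrative values only, assumed for this example.  */
enum { N_REGS = 120, FIRST_VEC_REGNO = 77, LAST_VEC_REGNO = 108 };

/* Simplified starting state: every column holds the default size 1,
   as happens for the vector registers when the unwinder is built
   with the vector unit treated as absent.  */
void
init_reg_sizes (unsigned char sizes[N_REGS])
{
  for (size_t i = 0; i < N_REGS; i++)
    sizes[i] = 1;
}

/* Model of the fix: overwrite the vector-register columns with their
   true 16-byte size.  The bound is LAST + 1, so the last vector
   register is included.  */
void
fix_vector_reg_sizes (unsigned char sizes[N_REGS])
{
  for (int i = FIRST_VEC_REGNO; i < LAST_VEC_REGNO + 1; i++)
    sizes[i] = 16;
}
```

After both calls, only the entries from FIRST_VEC_REGNO through LAST_VEC_REGNO inclusive hold 16; the neighbours on either side keep their earlier value.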
Re: Context sensitive type inheritance graph walking
Hi, sorry it took me so long, but it also took me quite a while to chew through. Please consider posting context diff in cases like this. Nevertheless, most of the patch is a nice improvement.

Uhm, sorry. Seems I diffed from a different user.

There is cgraph_indirect_call_info that walks GIMPLE code and attempts to determine the context of a given call. It looks for objects located in declarations (static vars/automatic vars/parameters), objects passed by invisible references and objects passed as THIS pointers. The second two cases are new, the first case is already done by gimple_extract_devirt_binfo_from_cst and I assume we should really put the context there, rather than reconstructing it from the edge. Of course we must stop overloading the offset field for that, are there any other obstacles?

No, I think overloading of offset is the only obstacle. I just tried to keep the patch self-contained and not dive into ipa-prop changes; it is long by itself.

-/* See if BINFO's type match OTR_TYPE. If so, lookup method
-   in vtable of TYPE_BINFO and insert method to NODES array.
+/* See if BINFO's type match OUTER_TYPE. If so, lookup
+   BINFO of subtype of TYPE at OFFSET and in that BINFO find
+   method in vtable and insert method to NODES array.
    Otherwise recurse to base BINFOs.
    This match what get_binfo_at_offset does, but with offset
    being unknown.

This function now needs a comprehensive update of the leading comment, we have the offset, so it is known. I also dislike the name a lot because it does not record binfo, but extracts and records the call target from it. Can we call it something like record_target_from_binfo or similar?

OK, record_target_from_binfo works for me. We still do not know the full offset (from the start of the type being walked), just the partial offset within one of the bases of the type. But I will try to formulate the comment better; it is indeed the result of incremental updates.

-  if (types_same_for_odr (type, otr_type)
-      && !pointer_set_insert (matched_vtables, BINFO_VTABLE (type_binfo)))
+  if (types_same_for_odr (type, outer_type))
     {
+      tree inner_binfo = get_binfo_at_offset (type_binfo,
+                                              offset, otr_type);

OK, get_binfo_at_offset also traverses BINFO_BASEs, I wonder whether we need to iterate over them and recurse when types_same_for_odr return false, with offset, won't get_binfo_at_offset just handle both cases correctly?

No, it is the difference I described above. get_binfo_at_offset assumes that the offset is from the start of the BINFO's type it is given. This is not true here. We have derived_type that has outer_type as a base that has otr_type at offset inside. We do not know the offset in between derived_type and outer_type. This is why one function wraps the other.

+/* Given REF call in FNDECL, determine class of the polymorphic
+   call (OTR_TYPE), its token (OTR_TOKEN) and CONTEXT.
+   Return pointer to object described by the context  */
+

The return value is never used. Is it ever going to be useful? Especially since it can be NULL even in useful cases...

Yes, it is supposed to be used by ipa-prop. We return non-NULL when the base may be PARM_DECL that can be further propagated through.

+tree
+get_polymorphic_call_info (tree fndecl,
+                           tree ref,
+                           tree *otr_type,
+                           HOST_WIDE_INT *otr_token,
+                           ipa_polymorphic_call_context *context)
+{
+  tree base_pointer;
+  *otr_type = obj_type_ref_class (ref);
+  *otr_token = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1);
+
+  /* Set up basic info in case we find nothing interesting in the analysis.  */
+  context->outer_type = *otr_type;
+  context->offset = 0;
+  base_pointer = OBJ_TYPE_REF_OBJECT (ref);
+  context->maybe_derived_type = true;
+  context->maybe_in_construction = false;
+
+  /* Walk SSA for outer object.  */
+  do
+    {
+      if (TREE_CODE (base_pointer) == SSA_NAME
+          && !SSA_NAME_IS_DEFAULT_DEF (base_pointer)
+          && SSA_NAME_DEF_STMT (base_pointer)
+          && gimple_assign_single_p (SSA_NAME_DEF_STMT (base_pointer)))
+        {
+          base_pointer = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (base_pointer));
+          STRIP_NOPS (base_pointer);

If we want to put the context on the edges, we need to adjust the offset here.

I do not follow here, strip nops should not alter offsets, right?

+          context->offset += offset2;
+          base_pointer = NULL;
+          /* Make very conservative assumption that all objects
+             may be in construction.
+             TODO: ipa-prop already contains code to tell better.
+             merge it later.  */
+          context->maybe_in_construction = true;
+          context->maybe_derived_type = false;
+          return base_pointer;
[Patch, Darwin/PPC] fix PR10901
Hi, this might be the oldest bug i've fixed so far. We currently generate wrong code for non-local gotos which breaks, amongst other things, nested functions. I fixed this a while ago for x86 Darwin and here is a version to fix it on PPC. (the patch is darwin-local save the definitions of the UNSPECs). this has been in my (and Dominique's) ppc tree for some time, OK for trunk? (and open branches?) - long-standing, wrong-code bug. Iain gcc: PR target/10901 * config/darwin-protos.h (machopic_get_function_picbase): New. * config/darwin.c (machopic_get_function_picbase): New. * config/rs6000/darwin.md (load_macho_picbase_si): Update picbase label for a new func. (load_macho_picbase_di): Likewise. (reload_macho_picbase): New expand. (reload_macho_picbase_si): New insn. (reload_macho_picbase_di): New insn. (nonlocal_goto_receiver): New define and split. * config/rs6000/rs6000.md (unspec enum): Add UNSPEC_RELD_MPIC. (unspecv enum): Add UNSPECV_NLGR. diff --git a/gcc/config/darwin-protos.h b/gcc/config/darwin-protos.h index 36d16b9..fe43ef3 100644 --- a/gcc/config/darwin-protos.h +++ b/gcc/config/darwin-protos.h @@ -26,6 +26,7 @@ extern void machopic_output_function_base_name (FILE *); extern const char *machopic_indirection_name (rtx, bool); extern const char *machopic_mcount_stub_name (void); extern bool machopic_should_output_picbase_label (void); +extern const char *machopic_get_function_picbase (void); #ifdef RTX_CODE diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c index ab48558..cb1bc38 100644 --- a/gcc/config/darwin.c +++ b/gcc/config/darwin.c @@ -405,6 +405,19 @@ machopic_output_function_base_name (FILE *file) fprintf (file, L%d$pb, current_pic_label_num); } +char curr_picbasename[32]; + +const char * +machopic_get_function_picbase (void) +{ + /* If dynamic-no-pic is on, we should not get here. 
*/ + gcc_assert (!MACHO_DYNAMIC_NO_PIC_P); + + update_pic_label_number_if_needed (); + snprintf (curr_picbasename, 32, L%d$pb, current_pic_label_num); + return (const char *) curr_picbasename; +} + bool machopic_should_output_picbase_label (void) { diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md index 24e8cfa..0fb2422 100644 --- a/gcc/config/rs6000/darwin.md +++ b/gcc/config/rs6000/darwin.md @@ -260,7 +260,10 @@ You should have received a copy of the GNU General Public License (unspec:SI [(match_operand:SI 0 immediate_operand s) (pc)] UNSPEC_LD_MPIC))] (DEFAULT_ABI == ABI_DARWIN) flag_pic - bcl 20,31,%0\\n%0: +{ + machopic_should_output_picbase_label (); /* Update for new func. */ + return bcl 20,31,%0\\n%0:; +} [(set_attr type branch) (set_attr length 4)]) @@ -269,7 +272,10 @@ You should have received a copy of the GNU General Public License (unspec:DI [(match_operand:DI 0 immediate_operand s) (pc)] UNSPEC_LD_MPIC))] (DEFAULT_ABI == ABI_DARWIN) flag_pic TARGET_64BIT - bcl 20,31,%0\\n%0: +{ + machopic_should_output_picbase_label (); /* Update for new func. 
*/ + return bcl 20,31,%0\\n%0:; +} [(set_attr type branch) (set_attr length 4)]) @@ -370,3 +376,86 @@ You should have received a copy of the GNU General Public License } [(set_attr type branch,branch) (set_attr length 4,8)]) + +(define_expand reload_macho_picbase + [(set (reg:SI 65) +(unspec [(match_operand 0 )] + UNSPEC_RELD_MPIC))] + (DEFAULT_ABI == ABI_DARWIN) flag_pic +{ + if (TARGET_32BIT) +emit_insn (gen_reload_macho_picbase_si (operands[0])); + else +emit_insn (gen_reload_macho_picbase_di (operands[0])); + + DONE; +}) + +(define_insn reload_macho_picbase_si + [(set (reg:SI 65) +(unspec:SI [(match_operand:SI 0 immediate_operand s) + (pc)] UNSPEC_RELD_MPIC))] + (DEFAULT_ABI == ABI_DARWIN) flag_pic +{ + if (machopic_should_output_picbase_label ()) +{ + static char tmp[64]; + const char *cnam = machopic_get_function_picbase (); + snprintf (tmp, 64, bcl 20,31,%s\\n%s:\\n%%0:, cnam, cnam); + return tmp; +} + else +return bcl 20,31,%0\\n%0:; +} + [(set_attr type branch) + (set_attr length 4)]) + +(define_insn reload_macho_picbase_di + [(set (reg:DI 65) + (unspec:DI [(match_operand:DI 0 immediate_operand s) + (pc)] UNSPEC_RELD_MPIC))] + (DEFAULT_ABI == ABI_DARWIN) flag_pic TARGET_64BIT +{ + if (machopic_should_output_picbase_label ()) +{ + static char tmp[64]; + const char *cnam = machopic_get_function_picbase (); + snprintf (tmp, 64, bcl 20,31,%s\\n%s:\\n%%0:, cnam, cnam); + return tmp; +} + else +return bcl 20,31,%0\\n%0:; +} + [(set_attr type branch) + (set_attr length 4)]) + +;; We need to restore the PIC register, at the site of nonlocal label. + +(define_insn_and_split nonlocal_goto_receiver + [(unspec_volatile [(const_int 0)] UNSPECV_NLGR)]
Re: [RFC Patch, Aarch64] : Macros for profile code generation to enable gprof support
Hi Marcus, I have re-based the patch and tested for aarch64-none-elf with no regressions. Also for aarch64-unknown-linux-gnu the following test cases passes. Before: UNSUPPORTED: gcc.dg/nested-func-4.c UNSUPPORTED: gcc.dg/pr43643.c: UNSUPPORTED: gcc.dg/nest.c UNSUPPORTED: gcc.dg/20021014-1.c UNSUPPORTED: gcc.dg/pr32450.c UNSUPPORTED: g++.dg/other/profile1.C -std=gnu++98 UNSUPPORTED: g++.dg/other/profile1.C -std=gnu++11 After: --- PASS: gcc.dg/nested-func-4.c (test for excess errors) PASS: gcc.dg/nested-func-4.c execution test PASS: gcc.dg/pr43643.c (test for excess errors) PASS: gcc.dg/pr43643.c execution test PASS: gcc.dg/nest.c (test for excess errors) PASS: gcc.dg/nest.c execution test PASS: gcc.dg/20021014-1.c (test for excess errors) PASS: gcc.dg/20021014-1.c execution test PASS: gcc.dg/pr32450.c (test for excess errors) PASS: gcc.dg/pr32450.c execution test PASS: g++.dg/other/profile1.C -std=gnu++98 (test for excess errors) PASS: g++.dg/other/profile1.C -std=gnu++98 execution test PASS: g++.dg/other/profile1.C -std=gnu++11 (test for excess errors) PASS: g++.dg/other/profile1.C -std=gnu++11 execution test Please let me know if I can commit it to trunk, given that glibc patches are upstreamed. 2013-10-28 Venkataramanan Kumar venkataramanan.ku...@linaro.org * config/aarch64/aarch64.h (MCOUNT_NAME): Define. (NO_PROFILE_COUNTERS): Likewise. (PROFILE_HOOK): Likewise. (FUNCTION_PROFILER): Likewise. * config/aarch64/aarch64.c (aarch64_function_profiler): Remove. regards, Venkat. On 27 August 2013 13:05, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: Hi Venkat, On 3 August 2013 19:01, Venkataramanan Kumar venkataramanan.ku...@linaro.org wrote: This patch adds macros to support gprof in Aarch64. The difference from the previous patch is that the compiler, while generating mcount routine for an instrumented function, also passes the return address as argument. The mcount routine in glibc will be modified as follows. 
(-Snip-) #define MCOUNT \ -void __mcount (void) \ +void __mcount (void* frompc) \ { \ - mcount_internal ((u_long) RETURN_ADDRESS (1), (u_long) RETURN_ADDRESS (0)); \ + mcount_internal ((u_long) frompc, (u_long) RETURN_ADDRESS (0)); \ } (-Snip-) If this is Ok I will send the patch to glibc as well. 2013-08-02 Venkataramanan Kumar venkataramanan.ku...@linaro.org * config/aarch64/aarch64.h (MCOUNT_NAME): Define. (NO_PROFILE_COUNTERS): Likewise. (PROFILE_HOOK): Likewise. (FUNCTION_PROFILER): Likewise. * config/aarch64/aarch64.c (aarch64_function_profiler): Remove. . regards, Venkat. + emit_library_call (fun, LCT_NORMAL, VOIDmode, 1,lr,Pmode); \ +} GNU coding style requires spaces after the commas, but otherwise I have no further comments on this patch. Post the glibc patch please. Thanks /Marcus Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c(revision 202934) +++ gcc/config/aarch64/aarch64.c(working copy) @@ -3857,13 +3857,6 @@ output_addr_const (f, x); } -void -aarch64_function_profiler (FILE *f ATTRIBUTE_UNUSED, - int labelno ATTRIBUTE_UNUSED) -{ - sorry (function profiling); -} - bool aarch64_label_mentioned_p (rtx x) { Index: gcc/config/aarch64/aarch64.h === --- gcc/config/aarch64/aarch64.h(revision 202934) +++ gcc/config/aarch64/aarch64.h(working copy) @@ -783,9 +783,23 @@ #define PRINT_OPERAND_ADDRESS(STREAM, X) \ aarch64_print_operand_address (STREAM, X) -#define FUNCTION_PROFILER(STREAM, LABELNO) \ - aarch64_function_profiler (STREAM, LABELNO) +#define MCOUNT_NAME _mcount +#define NO_PROFILE_COUNTERS 1 + +/* Emit rtl for profiling. Output assembler code to FILE + to call _mcount for profiling a function entry. */ +#define PROFILE_HOOK(LABEL)\ +{ \ + rtx fun,lr; \ + lr = get_hard_reg_initial_val (Pmode, LR_REGNUM);\ + fun = gen_rtx_SYMBOL_REF (Pmode, MCOUNT_NAME); \ + emit_library_call (fun, LCT_NORMAL, VOIDmode, 1, lr, Pmode); \ +} + +/* All the work done in PROFILE_HOOK, but still required. 
*/ +#define FUNCTION_PROFILER(STREAM, LABELNO) do { } while (0) + /* For some reason, the Linux headers think they know how to define these macros. They don't!!! */ #undef ASM_APP_ON Index: gcc/testsuite/lib/target-supports.exp === --- gcc/testsuite/lib/target-supports.exp (revision
Re: Add value range support into memcpy/memset expansion
Nice extension. Test cases would be great to have.

For those you need i386 changes to actually use the info. I will post that after some cleanup and additional testing.

Honza
RFA [testsuite]: New ARC target specific tests
This patch adds a number of tests for ARC target specific options.

I'm a bit uncertain here if I still need approval for this patch. On the one hand the changes are all in an area that is normally within the remit of a target maintainer, and the patch to add the gcc.target/arc directory has already been accepted. OTOH, the main body of the ARC port is still stuck waiting for review, so I'm still in the weird position of a target maintainer without an accepted target port.

2013-09-28  Simon Cook  simon.c...@embecosm.com
	    Joern Rennecke  joern.renne...@embecosm.com

	* gcc.target/arc/barrel-shifter-1.c: New test.
	* gcc.target/arc/barrel-shifter-2.c: Likewise.
	* gcc.target/arc/long-calls.c, gcc.target/arc/mA6.c: Likewise.
	* gcc.target/arc/mA7.c, gcc.target/arc/mARC600.c: Likewise.
	* gcc.target/arc/mARC601.c, gcc.target/arc/mARC700.c: Likewise.
	* gcc.target/arc/mcpu-arc600.c, gcc.target/arc/mcpu-arc601.c: Likewise.
	* gcc.target/arc/mcpu-arc700.c, gcc.target/arc/mcrc.c: Likewise.
	* gcc.target/arc/mdpfp.c, gcc.target/arc/mdsp-packa.c: Likewise.
	* gcc.target/arc/mdvbf.c, gcc.target/arc/mlock.c: Likewise.
	* gcc.target/arc/mmac-24.c, gcc.target/arc/mmac-d16.c: Likewise.
	* gcc.target/arc/mno-crc.c, gcc.target/arc/mno-dsp-packa.c: Likewise.
	* gcc.target/arc/mno-dvbf.c, gcc.target/arc/mno-lock.c: Likewise.
	* gcc.target/arc/mno-mac-24.c, gcc.target/arc/mno-mac-d16.c: Likewise.
	* gcc.target/arc/mno-rtsc.c, gcc.target/arc/mno-swape.c: Likewise.
	* gcc.target/arc/mno-xy.c, gcc.target/arc/mrtsc.c: Likewise.
	* gcc.target/arc/mspfp.c, gcc.target/arc/mswape.c: Likewise.
	* gcc.target/arc/mtune-ARC600.c: Likewise.
	* gcc.target/arc/mtune-ARC601.c: Likewise.
	* gcc.target/arc/mtune-ARC700-xmac: Likewise.
	* gcc.target/arc/mtune-ARC700.c: Likewise.
	* gcc.target/arc/mtune-ARC725D.c: Likewise.
	* gcc.target/arc/mtune-ARC750D.c: Likewise.
	* gcc.target/arc/mul64.c, gcc.target/arc/mxy.c: Likewise.
	* gcc.target/arc/no-dpfp-lrsr.c: Likewise.
diff --git a/gcc/testsuite/gcc.target/arc/barrel-shifter-1.c b/gcc/testsuite/gcc.target/arc/barrel-shifter-1.c new file mode 100644 index 000..a0eb6d7 --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/barrel-shifter-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -mcpu=ARC601 -mbarrel-shifter } */ +int i; + +int f (void) +{ + i = 2; +} + +/* { dg-final { scan-assembler asr_s } } */ diff --git a/gcc/testsuite/gcc.target/arc/barrel-shifter-2.c b/gcc/testsuite/gcc.target/arc/barrel-shifter-2.c new file mode 100644 index 000..97998fb --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/barrel-shifter-2.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +int i; + +int f (void) +{ + i = 2; +} + +/* { dg-final { scan-assembler asr_s } } */ diff --git a/gcc/testsuite/gcc.target/arc/long-calls.c b/gcc/testsuite/gcc.target/arc/long-calls.c new file mode 100644 index 000..63fafbc --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/long-calls.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -mlong-calls } */ + +int g (void); + +int f (void) +{ +g(); +} + +/* { dg-final { scan-assembler j @g } } */ diff --git a/gcc/testsuite/gcc.target/arc/mA6.c b/gcc/testsuite/gcc.target/arc/mA6.c new file mode 100644 index 000..2e15a86 --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/mA6.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options -mA6 } */ + +/* { dg-final { scan-assembler .cpu ARC600 } } */ diff --git a/gcc/testsuite/gcc.target/arc/mA7.c b/gcc/testsuite/gcc.target/arc/mA7.c new file mode 100644 index 000..c4430f4 --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/mA7.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options -mA7 } */ + +/* { dg-final { scan-assembler .cpu ARC700 } } */ diff --git a/gcc/testsuite/gcc.target/arc/mARC600.c b/gcc/testsuite/gcc.target/arc/mARC600.c new file mode 100644 index 000..20e086a --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/mARC600.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options -mARC600 } */ + +/* { 
dg-final { scan-assembler .cpu ARC600 } } */ diff --git a/gcc/testsuite/gcc.target/arc/mARC601.c b/gcc/testsuite/gcc.target/arc/mARC601.c new file mode 100644 index 000..1d30da4 --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/mARC601.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options -mARC601 } */ + +/* { dg-final { scan-assembler .cpu ARC601 } } */ diff --git a/gcc/testsuite/gcc.target/arc/mARC700.c b/gcc/testsuite/gcc.target/arc/mARC700.c new file mode 100644 index 000..43e9baa --- /dev/null +++ b/gcc/testsuite/gcc.target/arc/mARC700.c @@ -0,0 +1,4 @@ +/* { dg-do compile } */ +/* { dg-options -mARC700 } */ + +/* { dg-final { scan-assembler .cpu ARC700 } } */ diff --git a/gcc/testsuite/gcc.target/arc/mcpu-arc600.c b/gcc/testsuite/gcc.target/arc/mcpu-arc600.c new file mode 100644 index
Re: [Patch] Let ordinary escaping in POSIX regex be valid
On Fri, Sep 27, 2013 at 4:30 PM, Paolo Carlini paolo.carl...@oracle.com wrote: Nah, only double check that the testcase you are un-xfail-ing uses -std=gnu++11, otherwise will not pass ;) Committed :) Thanks! -- Tim Shen
Ping^6: contribute Synopsys Designware ARC port
The main part of the port (everything but the testsuite) is still waiting for review:

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00323.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00324.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00325.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00328.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01870.html
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02070.html

I've retested an i686-pc-linux-gnu native bootstrap as well as the obvious arc-elf32 / arc-linux-uclibc builds in trunk r202981.
Re: [PATCH] Trivial cleanup
On 09/27/2013 01:03 AM, Jeff Law wrote:
On 09/26/2013 08:15 AM, Michael Matz wrote:
Hi,
On Wed, 25 Sep 2013, Jeff Law wrote:

I was going to bring it up at some point too. My preference is strongly to simply eliminate the space on methods...

Which wouldn't be so weird: in the libstdc++-v3 code we do it all the time.

Yea. I actually reviewed the libstdc++ guidelines to see where they differed from GNU's C guidelines. I'm strongly in favor of dropping the horizontal whitespace between the method name and its open paren when the result is then dereferenced, i.e. foo.last()->e rather than foo.last ()->e.

I'd prefer to not write in this style at all, like Jakub. If we must absolutely have it, then I agree that the space before _empty_ parentheses is ugly if followed by references. I.e. I'd like to see spaces before parens as is customary, except in one case: empty parens in the middle of expressions (which don't happen very often right now in GCC, and hence wouldn't introduce a coding style confusion):

  do.this ();
  give.that()->flag;
  get.list (one)->clear ();

I'd prefer to not have further references to return values be applied, though (as in, the parentheses should be the end of statement), which would avoid the topic (at the expense of having to invent names for those temporaries, or to write trivial wrapper methods contracting several method calls).

Should we consider banning dereferencing the result of a method call and instead prefer to use a more functional interface such as Jakub has suggested, or have the result of the method call put into a temporary and dereference the temporary? I considered suggesting the latter. I wouldn't be a huge fan of the unnecessary temporaries, but they may be better than the horrid foo.last()->argh()->e->src or whatever.

Stuffing the result into a temporary does have one advantage: it encourages us to CSE across the method calls in cases where the compiler might not be able to do so. Of course, being humans, we'll probably mess it up.

jeff

I don't like the more functional interface... I thought the suggestion might be a little tongue in cheek, but wasn't sure :-) I can't imagine the number of templates that would introduce... and the impact on compile/link time would probably not be trivial.

Temps would be OK with me, but there are a couple of concerns.
- I'd want to be able to declare the temps at the point of use, not the top of the function. This would actually help with clarity I think. Not sure what the current coding standard says about that.
- The compiler better do an awesome job of sharing stack space for user variables in a function... I wouldn't want to blow up the stack with a bazillion unrelated temps each with their own location.

My example in this form would look something like:

  int unsignedsrcp = ptrvar.type().type().type_unsigned();
  ...
  GimpleType t1 = ptrvar.type ();
  GimpleType t2 = t1.type ();
  int unsignedsrcp = t2.type_unsigned ();

And yes, we'll probably introduce the odd human CSE error... hopefully the test suite will catch them :-)

I think I still prefer matz's suggestion, but I could be on board with this one too. Some expressions are crazy complicated.

Andrew
[PATCH][RFC] fix reload causing ICE in subreg_get_info on m68k (PR58369)
This patch fixes PR58369, an ICE in subreg_get_info when compiling boost for m68k-linux. choose_reload_regs attempts to reload a DFmode (8-byte) reg, finds an XFmode (12-byte) reg in last_reg, and calls subreg_regno_offset with these two modes and a subreg offset of zero. However, this is not a correct lowpart subreg offset for big-endian and these two modes, so the lowpart subreg check in subreg_get_info fails, and the code continues to gcc_assert ((GET_MODE_SIZE (xmode) % GET_MODE_SIZE (ymode)) == 0); which fails because (12 % 8) != 0. choose_reload_regs passes the constant zero, in all cases where the reg isn't already a subreg, as the subreg offset to subreg_regno_offset, even though lowpart subregs on big-endian targets require an explicit offset computation. I think that is a bug. I believe other big-endian targets don't see this ICE because a) they define CANNOT_CHANGE_MODE_CLASS to reject differently-sized modes in floating-point registers (which prevents this path in choose_reload_regs), or b) their differently-sized modes are such that the size of a larger mode is a whole multiple of the size of the smaller mode (which allows the gcc_assert above to pass). This patch changes choose_reload_regs to call subreg_lowpart_offset to pass an endian-correct offset to subreg_regno_offset, except where the offset comes from a pre-existing subreg. [Defining CANNOT_CHANGE_MODE_CLASS appropriately for m68k also fixes the ICE, but I don't think the m68k backend really wants that, and I think it just papers over a generic bug.] Tested with trunk and 4.8 on {m68k,sparc64,powerpc64}-linux (big-endian), and on x86_64-linux/armv5tel-linux-gnueabi (little-endian). No regressions. Comments? Is this Ok for trunk? gcc/ 2013-09-28 Mikael Pettersson mikpeli...@gmail.com PR rtl-optimization/58369 * reload1.c (choose_reload_regs): Use subreg_lowpart_offset to pass endian-correct lowpart offset to subreg_regno_offset. 
--- gcc-4.9-20130922/gcc/reload1.c.~1~ 2013-09-09 15:07:10.0 +0200
+++ gcc-4.9-20130922/gcc/reload1.c 2013-09-28 16:24:21.068294912 +0200
@@ -6497,6 +6497,7 @@ choose_reload_regs (struct insn_chain *c
   if (inheritance)
     {
       int byte = 0;
+      bool byte_is_fixed = false;
       int regno = -1;
       enum machine_mode mode = VOIDmode;
@@ -6519,7 +6520,10 @@ choose_reload_regs (struct insn_chain *c
           if (regno < FIRST_PSEUDO_REGISTER)
             regno = subreg_regno (rld[r].in_reg);
           else
-            byte = SUBREG_BYTE (rld[r].in_reg);
+            {
+              byte = SUBREG_BYTE (rld[r].in_reg);
+              byte_is_fixed = true;
+            }
           mode = GET_MODE (rld[r].in_reg);
         }
 #ifdef AUTO_INC_DEC
@@ -6557,6 +6561,8 @@ choose_reload_regs (struct insn_chain *c
           rtx last_reg = reg_last_reload_reg[regno];
           i = REGNO (last_reg);
+          if (! byte_is_fixed)
+            byte = subreg_lowpart_offset (mode, GET_MODE (last_reg));
           i += subreg_regno_offset (i, GET_MODE (last_reg), byte, mode);
           last_class = REGNO_REG_CLASS (i);
Re: [PATCH] Relax the requirement of reduction pattern in GCC vectorizer.
You can also add a test case of this form:

int foo( int t, int n, int *dst)
{
  int j = 0;
  int s = 1;
  t++;
  for (j = 0; j < n; j++)
    {
      dst[j] = t;
      s *= t;
    }
  return s;
}

where without the fix the loop vectorization is missed.

David

On Fri, Sep 27, 2013 at 6:28 PM, Cong Hou co...@google.com wrote:

The current GCC vectorizer requires the following pattern as a simple reduction computation:

  loop_header:
    a1 = phi < a0, a2 >
    a3 = ...
    a2 = operation (a3, a1)

But a3 can also be defined outside of the loop. For example, the following loop can benefit from vectorization but the GCC vectorizer fails to vectorize it:

int foo(int v)
{
  int s = 1;
  ++v;
  for (int i = 0; i < 10; ++i)
    s *= v;
  return s;
}

This patch relaxes the original requirement by also considering the following pattern:

  a3 = ...
  loop_header:
    a1 = phi < a0, a2 >
    a2 = operation (a3, a1)

A test case is also added. The patch is tested on x86-64.

thanks,
Cong

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 39c786e..45c1667 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2013-09-27 Cong Hou co...@google.com
+
+ * tree-vect-loop.c: Relax the requirement of the reduction
+ pattern so that one operand of the reduction operation can
+ come from outside of the loop.
+
 2013-09-25 Tom Tromey tro...@redhat.com

 * Makefile.in (PARTITION_H, LTO_SYMTAB_H, COMMON_TARGET_DEF_H)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 09644d2..90496a2 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-09-27 Cong Hou co...@google.com
+
+ * gcc.dg/vect/vect-reduc-pattern-3.c: New test.
+
 2013-09-25 Marek Polacek pola...@redhat.com

 PR sanitizer/58413
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 2871ba1..3c51c3b 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2091,6 +2091,13 @@ vect_is_slp_reduction (loop_vec_info loop_info, gimple phi, gimple first_stmt)
      a3 = ...
      a2 = operation (a3, a1)

+   or
+
+   a3 = ...
+   loop_header:
+     a1 = phi < a0, a2 >
+     a2 = operation (a3, a1)
+
    such that:
    1. operation is commutative and associative and it is safe to
       change the order of the computation (if CHECK_REDUCTION is true)
@@ -2451,6 +2458,7 @@ vect_is_simple_reduction_1 (loop_vec_info loop_info, gimple phi,
   if (def2 && def2 == phi
       && (code == COND_EXPR
           || !def1 || gimple_nop_p (def1)
+          || !flow_bb_inside_loop_p (loop, gimple_bb (def1))
           || (def1 && flow_bb_inside_loop_p (loop, gimple_bb (def1))
               && (is_gimple_assign (def1)
                   || is_gimple_call (def1)
@@ -2469,6 +2477,7 @@ vect_is_simple_reduction_1 (loop_vec_info loop_info, gimple phi,
   if (def1 && def1 == phi
       && (code == COND_EXPR
           || !def2 || gimple_nop_p (def2)
+          || !flow_bb_inside_loop_p (loop, gimple_bb (def2))
           || (def2 && flow_bb_inside_loop_p (loop, gimple_bb (def2))
               && (is_gimple_assign (def2)
                   || is_gimple_call (def2)
diff --git gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
new file mode 100644
index 000..06a9416
--- /dev/null
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-pattern-3.c
@@ -0,0 +1,41 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 10
+#define RES 1024
+
+/* A reduction pattern in which there is no data ref in
+   the loop and one operand is defined outside of the loop.  */
+
+__attribute__ ((noinline)) int
+foo (int v)
+{
+  int i;
+  int result = 1;
+
+  ++v;
+  for (i = 0; i < N; i++)
+    result *= v;
+
+  return result;
+}
+
+int
+main (void)
+{
+  int res;
+
+  check_vect ();
+
+  res = foo (1);
+  if (res != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.
On Sat, 28 Sep 2013, Iain Sandoe wrote:

* config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o): Use COMPILE and POSTCOMPILE.
* config/x-darwin (host-darwin.o): Likewise.
* config/i386/x-darwin (host-i386-darwin.o): Likewise.
* config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
* config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.

Do you need these compilation rules at all? Or could you change config.host to use paths such as config/host-darwin.o rather than just host-darwin.o, and so allow the generic rules to be used (my understanding was that the auto-deps patch series made lots of such changes to the locations of .o files in the build tree to avoid needing special compilation rules for particular files)?

--
Joseph S. Myers jos...@codesourcery.com
Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.
On 28 Sep 2013, at 17:40, Joseph S. Myers wrote:

On Sat, 28 Sep 2013, Iain Sandoe wrote:

* config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o): Use COMPILE and POSTCOMPILE.
* config/x-darwin (host-darwin.o): Likewise.
* config/i386/x-darwin (host-i386-darwin.o): Likewise.
* config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
* config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.

Do you need these compilation rules at all? Or could you change config.host to use paths such as config/host-darwin.o rather than just host-darwin.o, and so allow the generic rules to be used (my understanding was that the auto-deps patch series made lots of such changes to the locations of .o files in the build tree to avoid needing special compilation rules for particular files)?

Good idea, I'll investigate this.

Iain
[PATCH] Fix bootstrap with java on multiarch systems
OK to backport the attached change to 4.7 and 4.8?

Dave
--
John David Anglin dave.ang...@bell.net

2013-09-28 John David Anglin dang...@gcc.gnu.org

 PR driver/58505
 Backport from mainline:
 2013-05-22 Matthias Klose d...@ubuntu.com

 * jvspec.c (jvgenmain_spec): Add %I to cc1 call.

Index: jvspec.c
===
--- jvspec.c (revision 202859)
+++ jvspec.c (working copy)
@@ -59,7 +59,7 @@
   "jvgenmain %{findirect-dispatch} %{D*} %b %m.i |\n\
    cc1 %m.i %1 \
    %{!Q:-quiet} -dumpbase %b.c %{d*} %{m*}\
-   %{g*} %{O*} \
+   %{g*} %{O*} %I \
    %{v:-version} %{pg:-p} %{p}\
    %fbounds-check %fno-bounds-check\
    %fassume-compiled* %fno-assume-compiled*\
Go patch committed: Avoid useless knockon errors for _
This patch to the Go compiler avoids useless knockon errors for invalid uses of the blank identifier _. I added a simple general facility for erroneous names, although it is currently only used for _. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.8 branch.

Ian

diff -r a8b1cc175cc1 go/gogo-tree.cc
--- a/go/gogo-tree.cc Fri Sep 27 15:11:52 2013 -0700
+++ b/go/gogo-tree.cc Sat Sep 28 13:21:19 2013 -0700
@@ -1061,6 +1061,12 @@
   if (this->tree_ != NULL_TREE)
     return this->tree_;

+  if (Gogo::is_erroneous_name(this->name_))
+    {
+      this->tree_ = error_mark_node;
+      return error_mark_node;
+    }
+
   tree name;
   if (this->classification_ == NAMED_OBJECT_TYPE)
     name = NULL_TREE;
diff -r a8b1cc175cc1 go/gogo.cc
--- a/go/gogo.cc Fri Sep 27 15:11:52 2013 -0700
+++ b/go/gogo.cc Sat Sep 28 13:21:19 2013 -0700
@@ -1192,6 +1192,27 @@
   this->interface_types_.push_back(itype);
 }

+// Return an erroneous name that indicates that an error has already
+// been reported.
+
+std::string
+Gogo::erroneous_name()
+{
+  static int erroneous_count;
+  char name[50];
+  snprintf(name, sizeof name, "$erroneous%d", erroneous_count);
+  ++erroneous_count;
+  return name;
+}
+
+// Return whether a name is an erroneous name.
+
+bool
+Gogo::is_erroneous_name(const std::string& name)
+{
+  return name.compare(0, 10, "$erroneous") == 0;
+}
+
 // Return a name for a thunk object.

 std::string
diff -r a8b1cc175cc1 go/gogo.h
--- a/go/gogo.h Fri Sep 27 15:11:52 2013 -0700
+++ b/go/gogo.h Sat Sep 28 13:21:19 2013 -0700
@@ -387,6 +387,16 @@
   void
   mark_locals_used();

+  // Return a name to use for an error case.  This should only be used
+  // after reporting an error, and is used to avoid useless knockon
+  // errors.
+  static std::string
+  erroneous_name();
+
+  // Return whether the name indicates an error.
+  static bool
+  is_erroneous_name(const std::string&);
+
   // Return a name to use for a thunk function.  A thunk function is
   // one we create during the compilation, for a go statement or a
   // defer statement or a method expression.
diff -r a8b1cc175cc1 go/parse.cc
--- a/go/parse.cc Fri Sep 27 15:11:52 2013 -0700
+++ b/go/parse.cc Sat Sep 28 13:21:19 2013 -0700
@@ -213,7 +213,7 @@
   if (name == "_")
     {
       error_at(this->location(), "invalid use of %<_%>");
-      name = "blank";
+      name = Gogo::erroneous_name();
     }

   if (package->name() == this->gogo_->package_name())
@@ -3104,7 +3104,7 @@
       if (token->identifier() == "_")
         {
           error_at(this->location(), "invalid use of %<_%>");
-          name = this->gogo_->pack_hidden_name("blank", false);
+          name = Gogo::erroneous_name();
         }
       this->advance_token();
       return Expression::make_selector(left, name, location);
@@ -4929,7 +4929,7 @@
             {
               error_at(recv_var_loc,
                        "no new variables on left side of %<:=%>");
-              recv_var = "blank";
+              recv_var = Gogo::erroneous_name();
             }
           *is_send = false;
           *varname = gogo->pack_hidden_name(recv_var, is_rv_exported);
@@ -4965,7 +4965,7 @@
             {
               error_at(recv_var_loc,
                        "no new variables on left side of %<:=%>");
-              recv_var = "blank";
+              recv_var = Gogo::erroneous_name();
             }
           *is_send = false;
           if (recv_var != "_")
@@ -5502,7 +5502,7 @@
       if (name == "_")
         {
           error_at(this->location(), "invalid package name _");
-          name = "blank";
+          name = Gogo::erroneous_name();
         }
       this->advance_token();
     }
diff -r a8b1cc175cc1 go/types.cc
--- a/go/types.cc Fri Sep 27 15:11:52 2013 -0700
+++ b/go/types.cc Sat Sep 28 13:21:19 2013 -0700
@@ -9269,7 +9269,11 @@
         }
       else
         {
-          if (!ambig1.empty())
+          if (Gogo::is_erroneous_name(name))
+            {
+              // An error was already reported.
+            }
+          else if (!ambig1.empty())
             error_at(location, "%qs is ambiguous via %qs and %qs",
                      Gogo::message_name(name).c_str(), ambig1.c_str(),
                      ambig2.c_str());
Re: Remove algo logic duplication Round 3
On 09/28/2013 02:45 AM, Paolo Carlini wrote:

.. by the way, in the current stl_algo* I'm still seeing many, many, functions which should be inline not declared as such: each function which has a few __glibcxx_requires* at the beginning (which normally boil down to nothing) and then forwards to a std::__* helper should be inline.

Fixed with the attached patch, tested under Linux x86_64.

I also got your remark about the open round bracket; I didn't know that "round bracket" was another name for parenthesis! I also fixed the one you pointed out to me, and I will be more careful next time.

2013-09-28 François Dumont fdum...@gcc.gnu.org

 * include/bits/stl_algo.h (remove_copy, remove_copy_if): Declare inline.
 (rotate_copy, stable_partition, partial_sort_copy): Likewise.
 (lower_bound, upper_bound, equal_range, inplace_merge): Likewise.
 (includes, next_permutation, prev_permutation): Likewise.
 (replace_copy, replace_copy_if, is_sorted_until): Likewise.
 (minmax_element, is_permutation, adjacent_find): Likewise.
 (count, count_if, search, search_n, merge): Likewise.
 (set_intersection, set_difference): Likewise.
 (set_symmetric_difference, min_element, max_element): Likewise.
 * include/bits/stl_algobase.h (lower_bound): Likewise.
 (lexicographical_compare, mismatch): Likewise.

I consider it trivial enough to commit it.

François

Index: include/bits/stl_algo.h
===
--- include/bits/stl_algo.h (revision 203005)
+++ include/bits/stl_algo.h (working copy)
@@ -661,7 +661,7 @@
    *  are copied is unchanged.
   */
   template<typename _InputIterator, typename _OutputIterator, typename _Tp>
-    _OutputIterator
+    inline _OutputIterator
     remove_copy(_InputIterator __first, _InputIterator __last,
                 _OutputIterator __result, const _Tp& __value)
     {
@@ -694,7 +694,7 @@
   */
   template<typename _InputIterator, typename _OutputIterator,
            typename _Predicate>
-    _OutputIterator
+    inline _OutputIterator
     remove_copy_if(_InputIterator __first, _InputIterator __last,
                    _OutputIterator __result, _Predicate __pred)
     {
@@ -1414,9 +1414,8 @@
       __glibcxx_requires_valid_range(__first, __middle);
       __glibcxx_requires_valid_range(__middle, __last);

-      typedef typename iterator_traits<_ForwardIterator>::iterator_category
-        _IterType;
-      std::__rotate(__first, __middle, __last, _IterType());
+      std::__rotate(__first, __middle, __last,
+                    std::__iterator_category(__first));
     }

   /**
@@ -1440,7 +1439,7 @@
    *  for each @p n in the range @p [0,__last-__first).
   */
   template<typename _ForwardIterator, typename _OutputIterator>
-    _OutputIterator
+    inline _OutputIterator
     rotate_copy(_ForwardIterator __first, _ForwardIterator __middle,
                 _ForwardIterator __last, _OutputIterator __result)
     {
@@ -1647,7 +1646,7 @@
    *  relative ordering after calling @p stable_partition().
   */
   template<typename _ForwardIterator, typename _Predicate>
-    _ForwardIterator
+    inline _ForwardIterator
     stable_partition(_ForwardIterator __first, _ForwardIterator __last,
                      _Predicate __pred)
     {
@@ -1733,7 +1732,7 @@
    *  The value returned is @p __result_first+N.
   */
   template<typename _InputIterator, typename _RandomAccessIterator>
-    _RandomAccessIterator
+    inline _RandomAccessIterator
     partial_sort_copy(_InputIterator __first, _InputIterator __last,
                       _RandomAccessIterator __result_first,
                       _RandomAccessIterator __result_last)
@@ -1782,7 +1781,7 @@
   */
   template<typename _InputIterator, typename _RandomAccessIterator,
            typename _Compare>
-    _RandomAccessIterator
+    inline _RandomAccessIterator
     partial_sort_copy(_InputIterator __first, _InputIterator __last,
                       _RandomAccessIterator __result_first,
                       _RandomAccessIterator __result_last,
@@ -2016,7 +2015,7 @@
    *  the function used for the initial sort.
   */
   template<typename _ForwardIterator, typename _Tp, typename _Compare>
-    _ForwardIterator
+    inline _ForwardIterator
     lower_bound(_ForwardIterator __first, _ForwardIterator __last,
                 const _Tp& __val, _Compare __comp)
     {
@@ -2073,7 +2072,7 @@
    *  @ingroup binary_search_algorithms
   */
   template<typename _ForwardIterator, typename _Tp>
-    _ForwardIterator
+    inline _ForwardIterator
     upper_bound(_ForwardIterator __first, _ForwardIterator __last,
                 const _Tp& __val)
     {
@@ -2105,7 +2104,7 @@
    *  the function used for the initial sort.
   */
   template<typename _ForwardIterator, typename _Tp, typename _Compare>
-    _ForwardIterator
+    inline _ForwardIterator
     upper_bound(_ForwardIterator __first, _ForwardIterator __last,
                 const _Tp& __val, _Compare __comp)
     {
@@ -2179,7 +2178,7 @@
    *  but does not actually call those functions.
   */
   template<typename _ForwardIterator, typename _Tp>
-    pair<_ForwardIterator, _ForwardIterator>
+    inline
Re: [Patch, Darwin] update t-* and x-* fragments after switch to auto-deps.
Hello Joseph,

On 28 Sep 2013, at 17:44, Iain Sandoe wrote:

On 28 Sep 2013, at 17:40, Joseph S. Myers wrote:

On Sat, 28 Sep 2013, Iain Sandoe wrote:

* config/t-darwin (darwin.o, darwin-c.o, darwin-f.o, darwin-driver.o): Use COMPILE and POSTCOMPILE.
* config/x-darwin (host-darwin.o): Likewise.
* config/i386/x-darwin (host-i386-darwin.o): Likewise.
* config/rs6000/x-darwin (host-ppc-darwin.o): Likewise.
* config/rs6000/x-darwin64 (host-ppc64-darwin.o): Likewise.

Do you need these compilation rules at all? Or could you change config.host to use paths such as config/host-darwin.o rather than just host-darwin.o, and so allow the generic rules to be used (my understanding was that the auto-deps patch series made lots of such changes to the locations of .o files in the build tree to avoid needing special compilation rules for particular files)?

Good idea, I'll investigate this.

I had a look at this, and it seems like a useful objective. However, unless I'm missing a step, [following the template of config.gcc:out_file] it seems to require a fair amount of modification (introduction of common-object placeholders etc. in the configury and Makefile.in), plus application and testing of this on multiple targets. That is not something I can realistically volunteer to do in the immediate future.

Therefore, I'm going to suggest keeping this patch 'as is' and following up later, when there is more time available, with a patch for the other modification.

Iain
Re: Add value range support into memcpy/memset expansion
Nice extension. Test cases would be great to have.

For those you need i386 changes to actually use the info. I will post that after some cleanup and additional testing.

Hi,
since I already caught your attention, here is the target specific part for comments.

This patch implements memcpy/memset prologues and epilogues as suggested by Ondrej Bilka. His glibc implementation uses an IMO very smart trick: a single misaligned move copies the first N and the last N bytes of the block. The remainder of the block is then copied by the usual loop that gets aligned to the proper address. This leads to a partial memory stall, but that is handled well by modern x86 chips.

For example in the following testcase:

char *a;
char *b;
t1()
{
  memcpy (a,b,140);
}

We now produce:

        movq    b(%rip), %rsi
        movq    a(%rip), %rcx
        movq    (%rsi), %rax      <- first 8 bytes are moved
        leaq    8(%rcx), %rdi
        andq    $-8, %rdi         <- dest is aligned
        movq    %rax, (%rcx)
        movq    132(%rsi), %rax   <- last 8 bytes are moved
        movq    %rax, 132(%rcx)
        subq    %rdi, %rcx        <- alignment is subtracted from count
        subq    %rcx, %rsi        <- source is aligned
        addl    $140, %ecx        <- normal copying of 8 byte chunks
        shrl    $3, %ecx
        rep; movsq
        ret

Instead of:

        movq    a(%rip), %rdi
        movq    b(%rip), %rsi
        movl    $140, %eax
        testb   $1, %dil
        jne     .L28
        testb   $2, %dil
        jne     .L29
.L3:
        testb   $4, %dil
        jne     .L30
.L4:
        movl    %eax, %ecx
        xorl    %edx, %edx
        shrl    $3, %ecx
        testb   $4, %al
        rep movsq
        je      .L5
        movl    (%rsi), %edx
        movl    %edx, (%rdi)
        movl    $4, %edx
.L5:
        testb   $2, %al
        je      .L6
        movzwl  (%rsi,%rdx), %ecx
        movw    %cx, (%rdi,%rdx)
        addq    $2, %rdx
.L6:
        testb   $1, %al
        je      .L8
        movzbl  (%rsi,%rdx), %eax
        movb    %al, (%rdi,%rdx)
.L8:
        rep ret
        .p2align 4,,10
        .p2align 3
.L28:
        movzbl  (%rsi), %eax
        addq    $1, %rsi
        movb    %al, (%rdi)
        addq    $1, %rdi
        movl    $139, %eax
        testb   $2, %dil
        je      .L3
        .p2align 4,,10
        .p2align 3
.L29:
        movzwl  (%rsi), %edx
        subl    $2, %eax
        addq    $2, %rsi
        movw    %dx, (%rdi)
        addq    $2, %rdi
        testb   $4, %dil
        je      .L4
        .p2align 4,,10
        .p2align 3
.L30:
        movl    (%rsi), %edx
        subl    $4, %eax
        addq    $4, %rsi
        movl    %edx, (%rdi)
        addq    $4, %rdi
        jmp     .L4

With the proposed value range code we can now take advantage of it even for non-constant moves. Somewhat artificial testcase:

char *p,*q;
t(unsigned int a)
{
  if (a>8 && a<100)
    memcpy(q,p,a);
}

This still generates pretty much the same code (while the -minline-all-stringops code on mainline is just horrible):

        leal    -9(%rdi), %edx
        movl    %edi, %eax
        cmpl    $90, %edx
        jbe     .L5
        rep; ret
        .p2align 4,,10
        .p2align 3
.L5:
        movq    p(%rip), %rsi
        movq    q(%rip), %rcx
        movq    (%rsi), %rdx
        movq    %rdx, (%rcx)
        movl    %edi, %edx
        movq    -8(%rsi,%rdx), %rdi
        movq    %rdi, -8(%rcx,%rdx)
        leaq    8(%rcx), %rdi
        andq    $-8, %rdi
        subq    %rdi, %rcx
        subq    %rcx, %rsi
        addl    %eax, %ecx
        shrl    $3, %ecx
        rep; movsq
        ret

Of course it is quite common to know only an upper bound on the block. In this case we need to generate a prologue for the first few bytes:

char *p,*q;
t(unsigned int a)
{
  if (a<100)
    memcpy(q,p,a);
}

t:
.LFB0:
        .cfi_startproc
        cmpl    $99, %edi
        jbe     .L15
.L7:
        rep; ret
        .p2align 4,,10
        .p2align 3
.L15:
        cmpl    $8, %edi
        movq    q(%rip), %rdx
        movq    p(%rip), %rsi
        jae     .L3
        testb   $4, %dil
        jne     .L16
        testl   %edi, %edi
        je      .L7
        movzbl  (%rsi), %eax
        testb   $2, %dil
        movb    %al, (%rdx)
        je      .L7
        movl    %edi, %edi
        movzwl  -2(%rsi,%rdi), %eax
        movw    %ax, -2(%rdx,%rdi)
        ret
        .p2align 4,,10
        .p2align 3
.L3:
        movq    (%rsi), %rax
        movq    %rax, (%rdx)
        movl    %edi, %eax
        movq    -8(%rsi,%rax), %rcx
        movq    %rcx, -8(%rdx,%rax)
        leaq    8(%rdx), %rax
        andq    $-8, %rax
        subq    %rax, %rdx
        addl    %edx, %edi
        subq    %rdx, %rsi
        shrl    $3, %edi
        movl    %edi, %ecx
        movq    %rax, %rdi
        rep; movsq
        ret
        .p2align 4,,10
        .p2align 3
.L16:
        movl    (%rsi), %eax
        movl    %edi, %edi
        movl    %eax, (%rdx)
        movl    -4(%rsi,%rdi), %eax
        movl
Re: Add value range support into memcpy/memset expansion
On Sat, Sep 28, 2013 at 3:05 PM, Jan Hubicka hubi...@ucw.cz wrote:

Nice extension. Test cases would be great to have.

For those you need i386 changes to actually use the info. I will post that after some cleanup and additional testing.

Hi,
since I already caught your attention, here is the target specific part for comments.

This patch implements memcpy/memset prologues and epilogues as suggested by Ondrej Bilka. His glibc implementation uses an IMO very smart trick: a single misaligned move copies the first N and the last N bytes of the block. The remainder of the block is then copied by the usual loop that gets aligned to the proper address. This leads to a partial memory stall, but that is handled well by modern x86 chips.

For example in the following testcase:

char *a;
char *b;
t1()
{
  memcpy (a,b,140);
}

We now produce:

        movq    b(%rip), %rsi
        movq    a(%rip), %rcx
        movq    (%rsi), %rax      <- first 8 bytes are moved
        leaq    8(%rcx), %rdi
        andq    $-8, %rdi         <- dest is aligned
        movq    %rax, (%rcx)
        movq    132(%rsi), %rax   <- last 8 bytes are moved
        movq    %rax, 132(%rcx)
        subq    %rdi, %rcx        <- alignment is subtracted from count
        subq    %rcx, %rsi        <- source is aligned

This (source aligned) is not always true, but nevertheless the sequence is very tight.

        addl    $140, %ecx        <- normal copying of 8 byte chunks
        shrl    $3, %ecx
        rep; movsq
        ret

Of course it is quite common to know only an upper bound on the block. In this case we need to generate a prologue for the first few bytes:

char *p,*q;
t(unsigned int a)
{
  if (a<100)
    memcpy(q,p,a);
}

t:
.LFB0:
        .cfi_startproc
        cmpl    $99, %edi
        jbe     .L15
.L7:
        rep; ret
        .p2align 4,,10
        .p2align 3
.L15:
        cmpl    $8, %edi
        movq    q(%rip), %rdx
        movq    p(%rip), %rsi
        jae     .L3
        testb   $4, %dil
        jne     .L16
        testl   %edi, %edi
        je      .L7
        movzbl  (%rsi), %eax
        testb   $2, %dil
        movb    %al, (%rdx)
        je      .L7
        movl    %edi, %edi
        movzwl  -2(%rsi,%rdi), %eax
        movw    %ax, -2(%rdx,%rdi)
        ret
        .p2align 4,,10
        .p2align 3
.L3:
        movq    (%rsi), %rax
        movq    %rax, (%rdx)
        movl    %edi, %eax
        movq    -8(%rsi,%rax), %rcx
        movq    %rcx, -8(%rdx,%rax)
        leaq    8(%rdx), %rax
        andq    $-8, %rax
        subq    %rax, %rdx
        addl    %edx, %edi
        subq    %rdx, %rsi
        shrl    $3, %edi
        movl    %edi, %ecx
        movq    %rax, %rdi
        rep; movsq
        ret
        .p2align 4,,10
        .p2align 3
.L16:
        movl    (%rsi), %eax
        movl    %edi, %edi
        movl    %eax, (%rdx)
        movl    -4(%rsi,%rdi), %eax
        movl    %eax, -4(%rdx,%rdi)
        ret
        .cfi_endproc
.LFE0:

Mainline would output a libcall here (because the size is unknown to it), and with inlining of all stringops it winds up at 210 bytes of code instead of the 142 bytes above.

Unfortunately the following testcase:

char *p,*q;
t(int a)
{
  if (a<100)
    memcpy(q,p,a);
}

won't get inlined. This is because A is known to be smaller than 100, which results in an anti range after conversion to size_t. This anti range allows very large values (above INT_MAX) and thus we do not know the block size. I am not sure if the sane range can be recovered somehow. If not, maybe this is common enough to add support for a probable upper bound parameter to the template.

Do we know if there is real code that intentionally does that, other than security flaws resulting from an improperly done range check? I think by default GCC should assume the memcpy size range is (0, 100) here, with perhaps an option to override it.

thanks,
David

Use of value ranges makes it harder to choose the proper algorithm since the average size is no longer known. For the moment I take the simple average of the lower and upper bound, but this is wrong.
Libcall starts to win only for pretty large blocks (over 4GB definitely) so it makes sense to inline functions with range 0...4096 even though the cost tables tell us to expand a libcall for everything bigger than 140 bytes: if blocks are small we will get a noticeable win and if blocks are big, we won't lose much. I am considering assigning value ranges to the algorithms, too, for more sane choices in decide_alg. I also think the misaligned move trick can/should be performed by move_by_pieces and we ought to consider sane use of SSE - the current vector_loop with an unrolling factor of 4 seems a bit extreme. At least Bulldozer is happy with 2 and I would expect SSE moves to be especially useful for moving blocks with known size, where they are not used at all. Currently I disabled misaligned move prologues/epilogues for Michael's vector