Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
On Mon, Aug 13, 2012 at 09:20:32PM -0700, Gary Funck wrote:
> --- gcc/testsuite/gcc.dg/pr20020-1.c (revision 0)
> +++ gcc/testsuite/gcc.dg/pr20020-1.c (revision 0)
> @@ -0,0 +1,25 @@
> +/* Target is restricted to x86_64 type architectures,
> +   to check that 128-bit struct's are represented
> +   as TImode values.  */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-do compile { target { x86_64-*-* } } } */

Given this all the testcases should go into gcc/testsuite/gcc.target/i386/

	Jakub
[PATCH] Enable bbro for -Os
Hi,

Basic block reordering (bbro) has been disabled for -Os since gcc 4.7 because the pass leads to a big code size regression. But benchmark logs also show that there are lots of regressions due to poor code layout compared with 4.6.

The patch enables bbro for -Os. When optimizing for size, it
* avoids duplicating blocks.
* keeps a block in its original order if there is no chance to fall through.
* ignores edge frequency and probability.
* handles the predecessor first if its index is smaller, to break long traces.
* only connects Trace n with Trace n + 1, to reduce long jumps.

Here are the CSiBE code size benchmark results:
* For ARM, code size reduces 0.21%.
* For MIPS, code size reduces 0.25%.
* For PPC, code size reduces 0.33%.
* For X86, code size reduces 0.22%.

The patch does not impact bbro when optimizing for speed. To verify this, I ran objdump -d on all object files from CSiBE (compiled with -O2) for ARM/MIPS/PPC/X86. The assembly with the patch is the same as without the patch.

No make check regression on ARM.

Is it OK for trunk?

Thanks!
-Zhenqiang

ChangeLog
2012-08-14  Zhenqiang Chen  zhenqiang.c...@arm.com

	* bb-reorder.c (connect_better_edge_p): New added.
	(find_traces_1_round): When optimizing for size, ignore edge
	frequency and probability, and handle all in one round.
	(bb_to_key): Use bb-index as key for size.
	(better_edge_p): The smaller bb index is better for size.
	(connect_traces): Connect block n with block n + 1;
	connect trace m with trace m + 1 if falling through.
	(copy_bb_p): Avoid duplicating blocks.
	(gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.

Attachment: Enable-bbro-for-size.patch
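To make two of the size-oriented rules above concrete, here is a minimal sketch in C. It is not the code from the attached patch (which is not shown in this message); the struct, the function names and the `fallthru` array are hypothetical and only illustrate "the smaller bb index is better" and "connect trace n with trace n + 1 only if falling through".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified trace: a run of blocks laid out contiguously.  */
struct sketch_trace {
    int first_bb;   /* index of the first block in the trace */
    int last_bb;    /* index of the last block in the trace */
};

/* Size-oriented tie-break: a candidate block wins if no block has been
   chosen yet (best_index < 0) or if its index is smaller.  */
static bool
better_for_size (int cand_index, int best_index)
{
    return best_index < 0 || cand_index < best_index;
}

/* Join trace n to trace n + 1 only when the last block of trace n falls
   through to the first block of trace n + 1.  fallthru[b] is the index of
   the block that b falls through to, or -1 if there is none.  connected[n]
   is set for each trace; the return value is the number of joins made.  */
static size_t
connect_adjacent_traces (const struct sketch_trace *traces, size_t n_traces,
                         const int *fallthru, bool *connected)
{
    size_t joins = 0;
    for (size_t n = 0; n + 1 < n_traces; n++) {
        connected[n] = fallthru[traces[n].last_bb] == traces[n + 1].first_bb;
        if (connected[n])
            joins++;
    }
    if (n_traces > 0)
        connected[n_traces - 1] = false;   /* the last trace has no successor */
    return joins;
}
```

Never looking further than the next trace is what keeps the layout close to the original block order, which is presumably why it helps with long jumps.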
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
Dear Paul, Dear all,

I tried to compile the check_compiler_for_memory_leaks.F90 file provided by Damian and it produces a segfault. Maybe the problem is related to add_comp_ref.

Regards

Alessandro (from Malta)

2012/8/14 Paul Richard Thomas paul.richard.tho...@gmail.com
> Dear Mikael,
>
> > I think there are a couple of bugs not triggered by the single component
> > types in the test. See below.
>
> Yes, you are right. We should have tested multiple components... my fault!
>
> > This could be moved to the only next caller (`previous' doesn't need to
> > be updated if `this_code' is removed) to fix one usage of `this_code' :-).
>
> That's right... I'll make it so.
>
> > ... but I have the feeling that this makes (*code) unreachable and that
> > that's wrong. Shouldn't it be root->next = *code; ?
>
> No. That caused the regression that I mentioned. (*code) is
> resolved, at entry. resolve_code steps on to (*code)->next.
>
> > if we do it after the typebound calls, we overwrite their job so we have
> > to do it before. This is what is done.
> > However, if we do it before, we also overwrite components to be assigned
> > with a typebound call, and this can have some side effects as the LHS's
> > argument can be INTENT(INOUT).
>
> This might be so but it is what the standard dictates should happen isn't it?
>
> Thanks for the review. I believe, in summary, that I should handle
> 'this_code' consistently so that multiple component defined
> assignments work correctly. I should also verify that pointers do
> what they are supposed to do, although it is rather trivial.
>
> Cheers
>
> Paul

--
Dott. Alessandro Fanfarillo
Verificatore Ellisse
Cell: 339/2428012
Email: alessandro.fanfari...@gmail.com
Re: [PATCH, MIPS] 74k madd scheduler tweaks
Maxim Kuvyrkov ma...@codesourcery.com writes:
> I thought I'll butt in since I did a very similar thing for sync_memmodel
> a couple of months ago.

Thanks.

> +  /* We take care in instruction definitions to make sure accum_in operand is
> +     a register_operand or [a more restrictive] muldiv_target_operand.  */
> +  gcc_assert (REG_P (accum_in_op));

register_operand can accept subregs too.  I think it'd be better to leave this bit out.

> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index 759958b..79c1f25 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -275,6 +275,10 @@ (define_attr "sync_memmodel" ""
>    (const_int 10))
>
> +;; Accumulator operand for madd patterns.
> +(define_attr "accum_in" "none,0,1,2,3,4,5" (const_string "none"))
> +
> +

Nit: just one blank line between attributes.

> @@ -1715,6 +1724,7 @@
>    "ISA_HAS_MACC && reload_completed"
>    "macc\t%3,%1,%2"
>    [(set_attr "type" "imadd")
> +   (set_attr "accum_in" "3")
>     (set_attr "mode" "SI")])
>
>  (define_insn "*msac2"
> @@ -1729,6 +1739,7 @@
>    "ISA_HAS_MSAC && reload_completed"
>    "msac\t%3,%1,%2"
>    [(set_attr "type" "imadd")
> +   (set_attr "accum_in" "3")
>     (set_attr "mode" "SI")])
>
>  ;; Convert macc $0,r1,r2 mflo r3 into macc r3,r1,r2

These two should be 0 instead.

OK with those changes, thanks.

Richard
Re: [PATCH] Combine location with block using block_locations
Dehao Chen de...@google.com writes:

> Index: libcpp/line-map.c

[...]

> + /* Data structure to associate an arbitrary data to a source location.  */
> + struct location_adhoc_data {
> +   source_location locus;
> +   void *data;
> + };
> +
> + /* The following data structure encodes a location with some adhoc data,
> +    and map it to a new unsigned integer, and replace it with the original

I think you should remove the words "it with".

> +    location to represent the mapping.

So it should read (so far):

    The following data structure encodes a location with some adhoc data
    and maps it to a new unsigned integer (called an adhoc location)
    that replaces the original location to represent the mapping.

> +
> +    The new adhoc_loc uses the highest bit as the enabling bit, i.e. if the
> +    highest bit is 1, then the number is adhoc_loc. Otherwise, it serves as
> +    the original location. Once identified as the adhoc_loc, the lower 31
> +    bits of the integer is used to index to the location_adhoc_data array,

s/index to/index/

> +    in which the locus and associated data is stored.  */

> + /* Combine LOCUS and DATA to a combined adhoc loc.  */
> +
> + source_location
> + get_combined_adhoc_loc (source_location locus, void *data)
> + {
> +   struct location_adhoc_data lb;
> +   struct location_adhoc_data **slot;
> +
> +   linemap_assert (data);
> +
> +   if (IS_ADHOC_LOC (locus))
> +     locus = location_adhoc_data[locus & MAX_SOURCE_LOCATION].locus;
> +   if (locus == 0 && data == NULL)
> +     return 0;
> +   lb.locus = locus;
> +   lb.data = data;
> +   slot = (struct location_adhoc_data **)
> +       htab_find_slot (location_adhoc_data_htab, &lb, INSERT);
> +   if (*slot == NULL)
> +     {
> +       *slot = location_adhoc_data + curr_adhoc_loc;
> +       location_adhoc_data[curr_adhoc_loc] = lb;
> +       if (++curr_adhoc_loc >= allocated_location_adhoc_data)
> +         {
> +           char *orig_location_adhoc_data = (char *) location_adhoc_data;
> +           allocated_location_adhoc_data *= 2;
> +           location_adhoc_data = XRESIZEVEC (struct location_adhoc_data,
> +                                             location_adhoc_data,
> +                                             allocated_location_adhoc_data);
> +           htab_traverse (location_adhoc_data_htab, location_adhoc_data_update,
> +                          orig_location_adhoc_data);
> +         }
> +     }

I am wondering if there isn't an indentation issue here.

> +   return ((*slot) - location_adhoc_data) | 0x8000;
> + }
> +

Other than that, I don't really have anything worthwhile to say.  I am deferring to the maintainers now :-)

Thank you for bearing with me.

--
		Dodji
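For readers following the encoding discussed in the comment above, here is a small self-contained sketch in C of the general idea: the top bit marks an adhoc location and the remaining 31 bits index a side table. It is not the patch's implementation. The names are hypothetical, the table is a fixed-size array searched linearly rather than a resizable array with a hash table, and the behaviour for NULL data or a full table is my own simplification.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sketch_location;

/* Assumption for this sketch: locations are 32 bits wide, so the
   "highest bit" is bit 31 and the low 31 bits are the payload.  */
#define SKETCH_ADHOC_BIT 0x80000000u
#define SKETCH_INDEX_MASK 0x7FFFFFFFu
#define SKETCH_CAPACITY 16u

struct sketch_adhoc_entry {
    sketch_location locus;   /* the original, plain location */
    void *data;              /* the data attached to it */
};

static struct sketch_adhoc_entry sketch_table[SKETCH_CAPACITY];
static uint32_t sketch_count;

static int
sketch_is_adhoc (sketch_location loc)
{
    return (loc & SKETCH_ADHOC_BIT) != 0;
}

/* Strip the adhoc wrapper, if any, and return the plain location.  */
static sketch_location
sketch_plain_locus (sketch_location loc)
{
    return sketch_is_adhoc (loc) ? sketch_table[loc & SKETCH_INDEX_MASK].locus
                                 : loc;
}

/* Return the attached data, or NULL for a plain location.  */
static void *
sketch_adhoc_data (sketch_location loc)
{
    return sketch_is_adhoc (loc) ? sketch_table[loc & SKETCH_INDEX_MASK].data
                                 : NULL;
}

/* Combine LOCUS and DATA.  An existing (locus, data) pair is reused, so
   equal inputs give equal results.  With NULL data or a full table this
   sketch simply returns the plain location.  */
static sketch_location
sketch_combine (sketch_location locus, void *data)
{
    locus = sketch_plain_locus (locus);
    if (data == NULL)
        return locus;
    for (uint32_t i = 0; i < sketch_count; i++)
        if (sketch_table[i].locus == locus && sketch_table[i].data == data)
            return i | SKETCH_ADHOC_BIT;
    if (sketch_count == SKETCH_CAPACITY)
        return locus;
    sketch_table[sketch_count].locus = locus;
    sketch_table[sketch_count].data = data;
    return sketch_count++ | SKETCH_ADHOC_BIT;
}
```

Note that combining an already-adhoc location first unwraps it, which mirrors the `IS_ADHOC_LOC` check at the top of the quoted function.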
Re: Merge C++ conversion into trunk (5/6 - double_int rewrite)
On Mon, 13 Aug 2012, Lawrence Crowl wrote:
> On 8/13/12, Richard Guenther richard.guent...@gmail.com wrote:
> > Increment/decrement operations did not exist, please do not add
> > them at this point.
>
> Note that I have also added +=, -= and *= operations.  Having them
> has three advantages.  First, it matches expectations on what
> numeric types allow.  Second, it results in more concise code.
> Third, it results in potentially faster code.  I think we should
> be able to use those operators.
>
> When I run through changing call sites, I really want to change
> the sites to the final form, not do two passes.

Ok.

Thanks,
Richard.
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Mon, Aug 13, 2012 at 9:39 PM, Igor Zamyatin izamya...@gmail.com wrote:
> Hi all!
>
> Patch aims to fix instability introduced by first scheduler on x86. In
> particular it targets following list:
>
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46843
> [2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46829
> [3] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36680
> [4] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42295
>
> Main idea of this activity is mostly to provide user a possibility to
> safely turn on first scheduler for his codes. In some cases this could
> positively affect performance, especially for in-order Atom.
>
> It would be great to hear some feedback from the community about the change.

Maybe you can elaborate on this change?  It's hard to reverse engineer what you try to do from the patch alone.

Richard.

> Thanks in advance,
> Igor
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Mon, Aug 13, 2012 at 9:39 PM, Igor Zamyatin izamya...@gmail.com wrote:
> Patch aims to fix instability introduced by first scheduler on x86. In
> particular it targets following list:
>
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46843
> [2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46829
> [3] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36680
> [4] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42295
>
> Main idea of this activity is mostly to provide user a possibility to
> safely turn on first scheduler for his codes. In some cases this could
> positively affect performance, especially for in-order Atom.
>
> It would be great to hear some feedback from the community about the change.

The 46829 failure is due to combine pass blindly propagating r8 into divmod instruction. This is invalid for divmod, which expects ax there. This can be fixed by introducing ax_register_operand predicate, but I think that combine shouldn't propagate hard registers. This blocks register allocator, and we have many passes that handle propagation of hard registers after IRA much more effectively.

Various *_not_xmm0_* predicates were introduced just to fight this issue. They would be immediately obsolete with above combine change.

IMO, preventing combine to propagate hard regs will fix 90% of spill failures on x86.

Uros.
Re: Merge C++ conversion into trunk (3/6 - gengtype C++ support)
Hello Diego,

Just some minor comments.

Diego Novillo dnovi...@google.com writes:

[...]

> +@section User-provided marking routines for template types
> +When a template type @code{TP} is marked with @code{GTY}, all
> +instances of that type are considered user-provided types.  This means
> +that the individual instances of @code{TP} do not need to marked with

s/to marked/to be marked/

> +@code{GTY}.  The user needs to provide template functions to mark all
> +the fields of the type.
> +
> +The following code snippets represent all the functions that need to
> +be provided. Note that type @code{TP} may reference to more than one
> +type. In these snippets, there is only one type @code{T}, but there
> +could be more.
> +
> +@smallexample
> +template<typename T>
> +void gt_ggc_mx (TP<T> *tp)
> +@{

Just for my education, for the marking routines in general, why have the parameter tp be a pointer, rather than TP<T> &tp?

> +  extern void gt_ggc_mx (T&);
> +
> +  /* This marks field 'fld' of type 'T'.  */
> +  gt_ggc_mx (tp->fld);
> +@}

[...]

--
		Dodji
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
On 14/08/2012 07:03, Paul Richard Thomas wrote:
>> However, if we do it before, we also overwrite components to be assigned
>> with a typebound call, and this can have some side effects as the LHS's
>> argument can be INTENT(INOUT).
>
> This might be so but it is what the standard dictates should happen isn't it?

It dictates that the components should be assigned one by one (by either defined or intrinsic assignment), which I don't see as strictly equivalent to a whole structure assignment followed by typebound calls (for components needing it).

Mikael
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 10:36 AM, Uros Bizjak ubiz...@gmail.com wrote:
>> Patch aims to fix instability introduced by first scheduler on x86. In
>> particular it targets following list:
>>
>> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46843
>> [2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46829
>> [3] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36680
>> [4] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42295
>>
>> Main idea of this activity is mostly to provide user a possibility to
>> safely turn on first scheduler for his codes. In some cases this could
>> positively affect performance, especially for in-order Atom.
>>
>> It would be great to hear some feedback from the community about the change.
>
> The 46829 failure is due to combine pass blindly propagating r8 into
> divmod instruction. This is invalid for divmod, which expects ax
> there. This can be fixed by introducing ax_register_operand predicate,
> but I think that combine shouldn't propagate hard registers. This
> blocks register allocator, and we have many passes that handle
> propagation of hard registers after IRA much more effectively.
>
> Various *_not_xmm0_* predicates were introduced just to fight this
> issue. They would be immediately obsolete with above combine change.
>
> IMO, preventing combine to propagate hard regs will fix 90% of spill
> failures on x86.

Probably, we can live with the check that propagated hard register really satisfies operand constraint.

Jakub, do you have any opinion here?

Uros.
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 10:45:18AM +0200, Uros Bizjak wrote:
> > The 46829 failure is due to combine pass blindly propagating r8 into
> > divmod instruction. This is invalid for divmod, which expects ax
> > there. This can be fixed by introducing ax_register_operand predicate,
> > but I think that combine shouldn't propagate hard registers. This
> > blocks register allocator, and we have many passes that handle
> > propagation of hard registers after IRA much more effectively.
> >
> > Various *_not_xmm0_* predicates were introduced just to fight this
> > issue. They would be immediately obsolete with above combine change.
> >
> > IMO, preventing combine to propagate hard regs will fix 90% of spill
> > failures on x86.
>
> Probably, we can live with the check that propagated hard register
> really satisfies operand constraint.
>
> Jakub, do you have any opinion here?

The combiner doesn't propagate likely spilled hard registers (when they appear in a pseudo = hard reg or hard reg = pseudo simple assignments), see cant_combine_insn_p.  It does propagate other hard registers, and removing that could probably pessimize quite a lot of targets.

	Jakub
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
On 14/08/2012 07:03, Paul Richard Thomas wrote:
>> ... but I have the feeling that this makes (*code) unreachable and that
>> that's wrong. Shouldn't it be root->next = *code; ?
>
> No. That caused the regression that I mentioned. (*code) is
> resolved, at entry. resolve_code steps on to (*code)->next.

Yes, OK. Double pointers are really on the limits of my spirit.

Mikael
Re: [google/gcc-4_7] Backport arm hardfp patch from trunk
OK for google/gcc-4_7.

thanks
Carrot

On Tue, Aug 14, 2012 at 7:14 AM, Han Shen(沈涵) shen...@google.com wrote:
> Hi Carrot, could you take a look at this patch? Thanks!
>
> The modification is in upstream trunk patch revision - 186859.
>
> The same patch has been back ported to google/gcc-4_6
> (http://codereview.appspot.com/6206055/), this is to apply on
> google/gcc-4_7
>
> Regards,
> -Han
>
> 2012-08-13  Han Shen  shen...@google.com
>
>         Backport from mainline.
>
>         2012-05-01  Richard Earnshaw  rearn...@arm.com
>
>         * arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER_DEFAULT): Avoid ifdef
>         comparing enumeration values.  Update comments.
>
>         2012-04-26  Michael Hope  michael.h...@linaro.org
>                     Richard Earnshaw  rearn...@arm.com
>
>         * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER_SOFT_FLOAT): Define.
>         (GLIBC_DYNAMIC_LINKER_HARD_FLOAT): Define.
>         (GLIBC_DYNAMIC_LINKER_DEFAULT): Define.
>         (GLIBC_DYNAMIC_LINKER): Redefine to use the hard float path.
>
> diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
> index c0cfde3..142054f 100644
> --- a/gcc/config/arm/linux-eabi.h
> +++ b/gcc/config/arm/linux-eabi.h
> @@ -32,7 +32,8 @@
>    while (false)
>
>  /* We default to a soft-float ABI so that binaries can run on all
> -   target hardware.  */
> +   target hardware.  If you override this to use the hard-float ABI then
> +   change the setting of GLIBC_DYNAMIC_LINKER_DEFAULT as well.  */
>  #undef  TARGET_DEFAULT_FLOAT_ABI
>  #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_SOFT
>
> @@ -59,10 +60,25 @@
>  #undef  SUBTARGET_EXTRA_LINK_SPEC
>  #define SUBTARGET_EXTRA_LINK_SPEC " -m " TARGET_LINKER_EMULATION
>
> -/* Use ld-linux.so.3 so that it will be possible to run classic
> -   GNU/Linux binaries on an EABI system.  */
> +/* GNU/Linux on ARM currently supports three dynamic linkers:
> +   - ld-linux.so.2 - for the legacy ABI
> +   - ld-linux.so.3 - for the EABI-derived soft-float ABI
> +   - ld-linux-armhf.so.3 - for the EABI-derived hard-float ABI.
> +   All the dynamic linkers live in /lib.
> +   We default to soft-float, but this can be overridden by changing both
> +   GLIBC_DYNAMIC_LINKER_DEFAULT and TARGET_DEFAULT_FLOAT_ABI.  */
> +
>  #undef  GLIBC_DYNAMIC_LINKER
> -#define GLIBC_DYNAMIC_LINKER RUNTIME_ROOT_PREFIX "/lib/ld-linux.so.3"
> +#define GLIBC_DYNAMIC_LINKER_SOFT_FLOAT \
> +  RUNTIME_ROOT_PREFIX "/lib/ld-linux.so.3"
> +#define GLIBC_DYNAMIC_LINKER_HARD_FLOAT \
> +  RUNTIME_ROOT_PREFIX "/lib/ld-linux-armhf.so.3"
> +#define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_SOFT_FLOAT
> +
> +#define GLIBC_DYNAMIC_LINKER \
> +   "%{mfloat-abi=hard:" GLIBC_DYNAMIC_LINKER_HARD_FLOAT "} \
> +    %{mfloat-abi=soft*:" GLIBC_DYNAMIC_LINKER_SOFT_FLOAT "} \
> +    %{!mfloat-abi=*:" GLIBC_DYNAMIC_LINKER_DEFAULT "}"
>
>  /* At this point, bpabi.h will have clobbered LINK_SPEC.  We want to
>     use the GNU/Linux version, not the generic BPABI version.  */
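The three-way selection done by the GLIBC_DYNAMIC_LINKER spec above can be read as ordinary C. The following is only an illustrative sketch with hypothetical names: it ignores RUNTIME_ROOT_PREFIX, and returning NULL for an unrecognised value is my own choice (in the spec, none of the three alternatives would match).

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LINKER_SOFT_FLOAT "/lib/ld-linux.so.3"
#define SKETCH_LINKER_HARD_FLOAT "/lib/ld-linux-armhf.so.3"
/* As in the patch, the default follows the soft-float ABI.  */
#define SKETCH_LINKER_DEFAULT SKETCH_LINKER_SOFT_FLOAT

/* FLOAT_ABI is the value given to -mfloat-abi=, or NULL when the option
   is absent.  Mirrors the three spec alternatives:
     %{mfloat-abi=hard:...}   exact match on "hard"
     %{mfloat-abi=soft*:...}  any value starting with "soft" (soft, softfp)
     %{!mfloat-abi=*:...}     option not given at all  */
static const char *
sketch_pick_dynamic_linker (const char *float_abi)
{
    if (float_abi == NULL)
        return SKETCH_LINKER_DEFAULT;
    if (strcmp (float_abi, "hard") == 0)
        return SKETCH_LINKER_HARD_FLOAT;
    if (strncmp (float_abi, "soft", 4) == 0)
        return SKETCH_LINKER_SOFT_FLOAT;
    return NULL;
}
```

The point of the prefix match is that -mfloat-abi=softfp still uses the soft-float calling convention, so it shares ld-linux.so.3 with -mfloat-abi=soft.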
[PATCH] Add update-ssa verification code
This adds verification code that we do not try to rewrite a symbol into SSA form that is already partly in SSA form.  That would lead to silent wrong-code generation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-08-14  Richard Guenther  rguent...@suse.de

	* tree-into-ssa.c (update_ssa): Verify we do not rename symbols
	that are already partly in SSA form.

Index: gcc/tree-into-ssa.c
===================================================================
--- gcc/tree-into-ssa.c	(revision 190346)
+++ gcc/tree-into-ssa.c	(working copy)
@@ -3247,6 +3247,30 @@ update_ssa (unsigned update_flags)
 	 statements and set local live-in information for the PHI
 	 placement heuristics.  */
       prepare_block_for_update (start_bb, insert_phi_p);
+
+#ifdef ENABLE_CHECKING
+      for (i = 1; i < num_ssa_names; ++i)
+	{
+	  tree name = ssa_name (i);
+	  if (!name
+	      || virtual_operand_p (name))
+	    continue;
+
+	  /* For all but virtual operands, which do not have SSA names
+	     with overlapping life ranges, ensure that symbols marked
+	     for renaming do not have existing SSA names associated with
+	     them as we do not re-write them out-of-SSA before going
+	     into SSA for the remaining symbol uses.  */
+	  if (marked_for_renaming (SSA_NAME_VAR (name)))
+	    {
+	      fprintf (stderr, "Existing SSA name for symbol marked for "
+		       "renaming: ");
+	      print_generic_expr (stderr, name, TDF_SLIM);
+	      fprintf (stderr, "\n");
+	      internal_error ("SSA corruption");
+	    }
+	}
+#endif
     }
   else
     {
[Patch, Fortran] PR54234 - Add -Wconversion warning for CMPLX(dp,dp)
This patch adds a -Wconversion warning (enabled also by -Wall) for CMPLX(real, real) if the real arguments have a higher kind number/precision than the default kind of complex/real.

I think most of the time, this precision loss is unintended; it can be silenced by using a kind= parameter (or -Wno-conversion). However, if you believe that the warning is not suitable for -Wall, we can also hide it by only enabling it with the talkative -Wconversion-extra flag.

Built and regtested on x86-64-linux.
OK for the trunk?

Tobias

2012-08-14  Tobias Burnus  bur...@net-b.de

	PR fortran/54234
	* check.c (gfc_check_cmplx): Add -Wconversion warning
	when converting higher-precision REAL to default-precision
	CMPLX without kind= parameter.

2012-08-14  Tobias Burnus  bur...@net-b.de

	PR fortran/54234
	* gfortran.dg/warn_conversion_4.f90: New.

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index c5bf79b..2235b52 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1278,6 +1278,17 @@ gfc_check_cmplx (gfc_expr *x, gfc_expr *y, gfc_expr *kind)
   if (kind_check (kind, 2, BT_COMPLEX) == FAILURE)
     return FAILURE;
 
+  if (!kind && gfc_option.gfc_warn_conversion
+      && x->ts.type == BT_REAL && x->ts.kind > gfc_default_real_kind)
+    gfc_warning_now ("Conversion from %s to default-kind COMPLEX(%d) at %L "
+		     "might loose precision, consider using the KIND argument",
+		     gfc_typename (&x->ts), gfc_default_real_kind, &x->where);
+  else if (y && !kind && gfc_option.gfc_warn_conversion
+	   && y->ts.type == BT_REAL && y->ts.kind > gfc_default_real_kind)
+    gfc_warning_now ("Conversion from %s to default-kind COMPLEX(%d) at %L "
+		     "might loose precision, consider using the KIND argument",
+		     gfc_typename (&y->ts), gfc_default_real_kind, &y->where);
+
   return SUCCESS;
 }
 
--- /dev/null	2012-08-08 07:41:43.631684108 +0200
+++ gcc/gcc/testsuite/gfortran.dg/warn_conversion_4.f90	2012-08-14 10:19:56.0 +0200
@@ -0,0 +1,18 @@
+! { dg-do compile }
+! { dg-options "-Wconversion" }
+!
+! PR fortran/54234
+!
+!
+module fft_mod
+  implicit none
+  integer, parameter :: dp=kind(0.0d0)
+contains
+  subroutine test
+    integer :: x
+    x = int (abs (cmplx(2.3,0.1)))
+    x = int (abs (cmplx(2.3_dp,0.1))) ! { dg-warning "Conversion from REAL.8. to default-kind COMPLEX.4. at .1. might loose precision, consider using the KIND argument" }
+    x = int (abs (cmplx(2.3,0.1_dp))) ! { dg-warning "Conversion from REAL.8. to default-kind COMPLEX.4. at .1. might loose precision, consider using the KIND argument" }
+    x = int (abs (cmplx(2.3_dp,0.1_dp))) ! { dg-warning "Conversion from REAL.8. to default-kind COMPLEX.4. at .1. might loose precision, consider using the KIND argument" }
+  end subroutine test
+end module fft_mod
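The condition that the check.c hunk in this message tests can be isolated into a small stand-alone C sketch. The types and names below are hypothetical stand-ins for gfc_expr and friends, and the default real kind is assumed to be 4, as in the test's expected messages. Like the patch's if / else if, at most one argument is reported per call.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_type { SKETCH_INTEGER, SKETCH_REAL };

/* Hypothetical stand-in for the type/kind part of an argument.  */
struct sketch_arg {
    enum sketch_type type;
    int kind;
};

#define SKETCH_DEFAULT_REAL_KIND 4   /* assumption: default REAL is kind 4 */

/* Decide which argument of CMPLX (x, y [, kind]), if any, triggers the
   precision warning.  Returns 0 for no warning, 1 for X, 2 for Y.
   Y may be NULL when CMPLX is called with a single argument.  */
static int
sketch_cmplx_warning (const struct sketch_arg *x, const struct sketch_arg *y,
                      bool has_kind, bool warn_conversion)
{
    if (has_kind || !warn_conversion)
        return 0;
    if (x->type == SKETCH_REAL && x->kind > SKETCH_DEFAULT_REAL_KIND)
        return 1;
    if (y != NULL && y->type == SKETCH_REAL
        && y->kind > SKETCH_DEFAULT_REAL_KIND)
        return 2;
    return 0;
}
```

This matches the four calls in the test file: no warning for cmplx(2.3,0.1), and exactly one warning for each of the three calls that involve a _dp argument.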
Re: [Patch, Fortran] PR40881 - Add two F95 obsolescence warnings
On 08/09/2012 02:13 PM, Mikael Morin wrote:
> On 08/08/2012 19:12, Tobias Burnus wrote:
>> With this patch, I think the only unimplemented obsolescence warning is for
>> (8) Fixed form source -- see B.2.7.
>>
>> For the latter, I would like to see a possibility to silence that warning,
>> given that there is substantial code around, which is in fixed form but
>> otherwise a completely valid and obsolescent-free code.
> We could silence it with explicit -ffixed-form.

That won't work. The driver (gfortran) automatically adds the flag when compiling .f files. Thus, from within the compiler (f951) those are indistinguishable. Besides, many Makefiles have the same compiler flags for fixed and free form, as (most) compilers automatically choose the right source form based on the file extension.

> Regarding the general design, I'm not sure it makes sense to distinguish
> between ST_LABEL_DO_TARGET and ST_LABEL_ENDDO_TARGET.

I concur. I changed it and also added a comment to gfortran.h.

>> @@ -3825,8 +3828,11 @@ parse_executable (gfc_statement st)
>>  	case ST_NONE:
>>  	  unexpected_eof ();
>>
>> -	case ST_FORMAT:
>>  	case ST_DATA:
>> +	  gfc_notify_std (GFC_STD_F95_OBS, "DATA statement at %C after the "
>> +			  "first executable statement");
>> +	  /* Fall through.  */
>> +	case ST_FORMAT:
>>  	case ST_ENTRY:
>>  	case_executable:
>>  	  accept_statement (st);
> This diagnostic is more appropriate in verify_st_order (which needs to
> be called then).

I disagree. Initially, I thought that verify_st_order was the right place, and then discovered that it doesn't get called after the first executable statement. Thus, I added it to parse_executable.

Given that DATA is the only statement which can also occur in the execution section, and that its validity depends on the compile flags, it would also need special handling in verify_st_order. Calling verify_st_order from parse_executable only for ST_DATA is kind of pointless, while calling it always leads to quite some overhead and requires keeping track of the previous state (which is required by verify_st_order but otherwise not needed in the execution section). Thus, I really prefer the current solution.

>>      case ST_LABEL_TARGET:
>> +    case ST_LABEL_ENDDO_TARGET:
>>        if (lp->referenced == ST_LABEL_FORMAT)
>>  	gfc_error ("Label %d at %C already referenced as a format label",
>>  		   labelno);
>>        else
>>  	lp->defined = ST_LABEL_TARGET;
> I think it should be `lp->defined = type;' here.

I think the current code is okay due to the required ordering, e.g. the termination label for a DO block has to come after the DO block. But I concur that using "= type" is cleaner.

Thus, I removed ST_LABEL_ENDDO_TARGET, use "= type" and added a comment, but I didn't do the verify_st_order change.

Built and regtested on x86-64-linux.
OK for the trunk?

Tobias

2012-08-14  Tobias Burnus  bur...@net-b.de

	PR fortran/40881
	* error.c (gfc_notify_std): Reset cur_error_buffer->flag flag
	when the error/warning has been printed.
	* gfortran.h (gfc_sl_type): Add ST_LABEL_DO_TARGET.
	* match.c (gfc_match_do): Use ST_LABEL_DO_TARGET.
	* parse.c (check_statement_label): Use ST_LABEL_DO_TARGET.
	(parse_executable): Add obsolescence check for DATA.
	* resolve.c (resolve_branch): Handle ST_LABEL_DO_TARGET.
	* symbol.c (gfc_define_st_label, gfc_reference_st_label):
	Add obsolescence diagnostics.
	* trans-stmt.c (gfc_trans_label_assign): Handle ST_LABEL_DO_TARGET.

2012-08-14  Tobias Burnus  bur...@net-b.de

	PR fortran/40881
	* gfortran.dg/data_constraints_3.f90: New.
	* gfortran.dg/data_constraints_1.f90: Update dg-warning.
	* gfortran.dg/pr37243.f: Ditto.
	* gfortran.dg/g77/19990826-3.f: Ditto.
	* gfortran.dg/g77/20020307-1.f : Ditto.
	* gfortran.dg/g77/980310-3.f: Ditto.

diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 7e968db..dde6a0f 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -875,6 +875,7 @@ gfc_notify_std (int std, const char *gmsgid, ...)
 	warnings++;
       else
 	gfc_increment_error_count();
+      cur_error_buffer->flag = 0;
     }
 
   return (warning && !warnings_are_errors) ? SUCCESS : FAILURE;
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index b6e2975..0e2130f 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -144,9 +144,11 @@ typedef enum
 { AR_FULL = 1, AR_ELEMENT, AR_SECTION, AR_UNKNOWN }
 ar_type;
 
-/* Statement label types.  */
+/* Statement label types. ST_LABEL_DO_TARGET is used for obsolescent warnings
+   related to shared DO terminations and DO targets which are neither END DO
+   nor CONTINUE; otherwise it is identical to ST_LABEL_TARGET.  */
 typedef enum
-{ ST_LABEL_UNKNOWN = 1, ST_LABEL_TARGET,
+{ ST_LABEL_UNKNOWN = 1, ST_LABEL_TARGET, ST_LABEL_DO_TARGET,
   ST_LABEL_BAD_TARGET, ST_LABEL_FORMAT }
 gfc_sl_type;
 
diff --git a/gcc/fortran/match.c b/gcc/fortran/match.c
index 737d6a3..5ab07e5 100644
---
Re: [PATCH, ARM] Tuning for Cortex-M processors
I'm sorry, the conversation about this patch accidentally went private. Resending the key point.

On Tue, Jul 24, 2012 at 8:40 PM, Julian Brown jul...@codesourcery.com wrote:
> On Mon, 23 Jul 2012 13:48:22 +0800
> Ye Joey joey.ye...@gmail.com wrote:
>
>> Since v7m and v6m are very different. It is high desired to have
>> separate tuning for them. Namely: arm_cortex_v6m_tune and
>> arm_cortex_v7m_tune. When created, they can share the same tuning
>> though.
>
> I understand the point, but I'm not convinced it buys anything -- and
> furthermore it suggests that effort has been spent on tuning for v6m
> and v7m devices separately, which isn't really true (at least as far
> as I and/or CodeSourcery/Mentor are concerned). The proper time to
> split the tunings is when it is discovered that they need to diverge
> from each other, IMO.

OK for now, as I don't have a specific case in hand.
Re: [patch] Reduce memory overhead for large functions
On Mon, Aug 13, 2012 at 10:49 AM, Richard Guenther richard.guent...@gmail.com wrote:
> On Sun, Aug 12, 2012 at 11:49 PM, Steven Bosscher stevenb@gmail.com wrote:
>> Hello,
>>
>> This patch tried to use non-clearing memory allocation where possible.
>> This is especially important for very large functions, when arrays of
>> size in the order of n_basic_blocks or num_ssa_names are allocated to
>> hold sparse data sets. For such cases the overhead of memset becomes
>> measurable (and even dominant for the time spent in a pass in some
>> cases, such as the one I recently fixed in ifcvt.c).
>>
>> This cuts off ~20% of the compile time for the test case of PR54146 at
>> -O1. Not bad for a patch that basically only removes a bunch of
>> memsets.
>>
>> I got another 5% for the changes in tree-ssa-loop-manip.c. A loop over
>> an array with num_ssa_names there is expensive and unnecessary, and it
>> helps to stuff all bitmaps together on a single obstack if you intend
>> to blow them all away at the end (this could be done in a number of
>> other places in the compiler). Clearing livein at the end of
>> add_exit_phis_var also reduces peak memory with ~250MB at that point
>> in the passes pipeline (only to blow up from ~1.5GB peak memory in the
>> GIMPLE optimizers to ~3.6 GB in expand, and to ~8.6GB in IRA, but hey,
>> who's counting? :-)
>>
>> Actually, the worst cases are not fixed with this patch. That'd be IRA
>> (which consumes ~5GB on the test case, out of 8GB total), and
>> tree-PRE.
>>
>> The IRA case looks like it may be hard to fix: Allocating multiple
>> arrays of size O(max_regno) for every loop in init_loop_tree_node.
>>
>> The tree-PRE case is one where the avail arrays are allocated and
>> cleared for every PRE candidate. This looks like a place where a
>> pointer_map should be used instead. I'll tackle that later, when I've
>> addressed more pressing problems in the compilation of the PR54146
>> test case.
>
> Hmm, or easier, use a vector of size (num_bb_preds) and index it by
> edge index.

I have a patch for this.

Richard.

>> This patch was bootstrapped&tested on powerpc64-unknown-linux-gnu. OK for trunk?
>
> Ok with adjusting the PRE comments according to the above.
>
> Thanks,
> Richard.
>
>> Kudos to the compile farm people, without them I couldn't even hope to
>> get any of this work done!
>>
>> Ciao!
>> Steven
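As a generic illustration of when a non-clearing allocation is safe (this is not code from the patch, and the function name is hypothetical): if every slot of an array is written before it is read, the clearing pass of calloc or memset is pure overhead and plain malloc is enough.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return a newly allocated array POS with pos[order[i]] == i.  ORDER must
   be a permutation of 0 .. n-1, so each slot of POS is written exactly
   once below; that is why malloc is used instead of calloc.  Returns NULL
   if the allocation fails.  The caller frees the result.  */
static int *
sketch_invert_order (const int *order, size_t n)
{
    int *pos = malloc (n * sizeof *pos);   /* no clearing pass */
    if (pos == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        pos[order[i]] = (int) i;
    return pos;
}
```

The same reasoning does not hold for sparse data, where only some slots are ever written; there one needs either the clearing pass or a different data structure, which is the trade-off discussed in the message above.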
Re: [Patch, Fortran] PR40881 - Add two F95 obsolescence warnings
On 14/08/2012 11:33, Tobias Burnus wrote:
> Thus, I removed ST_LABEL_ENDDO_TARGET, use =type and added a comment,
> but I didn't do the verify_st_order change.
>
> Build and regtested on x86-64-linux.
> OK for the trunk?

OK, apart from:

> * gfortran.dg/data_constraints_1.f90: Update dg-warning.

I don't see the need for the change, the ChangeLog doesn't match the patch, and it is different from the initial version. A forgotten local edit?

Thanks,
Mikael
[PATCH] Speed up PRE insertion
This removes the overhead of clearing a vector of n_basic_blocks elements per anti expression during insertion by making it a vector indexed by pred edge index and allocating it once for each basic block instead. This can make a significant difference for large functions. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2012-08-14 Richard Guenther rguent...@suse.de * tree-ssa-pre.c (do_regular_insertion): Use a VEC indexed by pred edge index for avail. (do_partial_partial_insertion): Likewise. (insert_into_preds_of_block): Adjust. Index: gcc/tree-ssa-pre.c === *** gcc/tree-ssa-pre.c (revision 190376) --- gcc/tree-ssa-pre.c (working copy) *** inhibit_phi_insertion (basic_block bb, p *** 3188,3194 static bool insert_into_preds_of_block (basic_block block, unsigned int exprnum, ! pre_expr *avail) { pre_expr expr = expression_for_id (exprnum); pre_expr newphi; --- 3188,3194 static bool insert_into_preds_of_block (basic_block block, unsigned int exprnum, ! VEC(pre_expr, heap) *avail) { pre_expr expr = expression_for_id (exprnum); pre_expr newphi; *** insert_into_preds_of_block (basic_block *** 3229,3235 gimple_seq stmts = NULL; tree builtexpr; bprime = pred-src; ! eprime = avail[bprime-index]; if (eprime-kind != NAME eprime-kind != CONSTANT) { --- 3229,3235 gimple_seq stmts = NULL; tree builtexpr; bprime = pred-src; ! eprime = VEC_index (pre_expr, avail, pred-dest_idx); if (eprime-kind != NAME eprime-kind != CONSTANT) { *** insert_into_preds_of_block (basic_block *** 3239,3252 type); gcc_assert (!(pred-flags EDGE_ABNORMAL)); gsi_insert_seq_on_edge (pred, stmts); ! avail[bprime-index] = get_or_alloc_expr_for_name (builtexpr); insertions = true; } else if (eprime-kind == CONSTANT) { /* Constants may not have the right type, fold_convert !should give us back a constant with the right type. ! 
*/ tree constant = PRE_EXPR_CONSTANT (eprime); if (!useless_type_conversion_p (type, TREE_TYPE (constant))) { --- 3239,3252 type); gcc_assert (!(pred-flags EDGE_ABNORMAL)); gsi_insert_seq_on_edge (pred, stmts); ! VEC_replace (pre_expr, avail, pred-dest_idx, ! get_or_alloc_expr_for_name (builtexpr)); insertions = true; } else if (eprime-kind == CONSTANT) { /* Constants may not have the right type, fold_convert !should give us back a constant with the right type. */ tree constant = PRE_EXPR_CONSTANT (eprime); if (!useless_type_conversion_p (type, TREE_TYPE (constant))) { *** insert_into_preds_of_block (basic_block *** 3278,3288 } gsi_insert_seq_on_edge (pred, stmts); } ! avail[bprime-index] = get_or_alloc_expr_for_name (forcedexpr); } } else ! avail[bprime-index] = get_or_alloc_expr_for_constant (builtexpr); } } else if (eprime-kind == NAME) --- 3278,3290 } gsi_insert_seq_on_edge (pred, stmts); } ! VEC_replace (pre_expr, avail, pred-dest_idx, ! get_or_alloc_expr_for_name (forcedexpr)); } } else ! VEC_replace (pre_expr, avail, pred-dest_idx, !get_or_alloc_expr_for_constant (builtexpr)); } } else if (eprime-kind == NAME) *** insert_into_preds_of_block (basic_block *** 3321,3327 } gsi_insert_seq_on_edge (pred, stmts); } ! avail[bprime-index] = get_or_alloc_expr_for_name (forcedexpr); } } } --- 3323,3330 } gsi_insert_seq_on_edge (pred, stmts); } ! VEC_replace (pre_expr, avail, pred-dest_idx, ! get_or_alloc_expr_for_name (forcedexpr)); } } } *** insert_into_preds_of_block (basic_block *** 3344,3357 bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (gimple_phi_result (phi))); FOR_EACH_EDGE (pred, ei, block-preds) { ! pre_expr ae = avail[pred-src-index]; gcc_assert (get_expr_type
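The indexing change described in this patch can be illustrated outside GCC. The following is only a toy sketch, not code from tree-ssa-pre.c: `toy_block`, `toy_edge`, `fill_avail` and the two cost helpers are hypothetical names, and only the notion of `dest_idx` (an edge's position in its destination block's predecessor list) is taken from the patch. It shows why a per-edge array only needs as many slots as the block has predecessors, whereas a per-block array needs one slot for every basic block in the function.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's CFG types; every name here is hypothetical.  */
struct toy_block;
struct toy_edge
{
  struct toy_block *src;   /* predecessor block */
  unsigned dest_idx;       /* position of this edge in dest's pred list */
};
struct toy_block
{
  int index;               /* function-wide block number */
  unsigned n_preds;
  struct toy_edge *preds;
};

/* Slots that must be cleared when AVAIL has one entry per basic block
   and is reset once per expression.  */
static size_t
cost_indexed_by_block (size_t n_basic_blocks, size_t n_exprs)
{
  return n_basic_blocks * n_exprs;
}

/* Slots written when AVAIL has one entry per predecessor edge of BB.  */
static size_t
cost_indexed_by_edge (const struct toy_block *bb, size_t n_exprs)
{
  return (size_t) bb->n_preds * n_exprs;
}

/* Store one value per predecessor edge of BB, addressed by the edge's own
   index.  AVAIL is allocated once by the caller with bb->n_preds slots and
   can be reused for every expression without clearing.  */
static void
fill_avail (const struct toy_block *bb, int *avail, int expr_id)
{
  for (unsigned i = 0; i < bb->n_preds; i++)
    {
      const struct toy_edge *pred = &bb->preds[i];
      assert (pred->dest_idx < bb->n_preds);
      avail[pred->dest_idx] = expr_id * 1000 + pred->src->index;
    }
}
```

In this model every slot is overwritten on each call, so no clearing pass is needed between expressions; whether that holds for every path in the real pass is something only the full patch can confirm.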
[patch] Use gcc_checking_assert in dominance.c
Hello, Checking overhead in dominance.c gives measurable compile time increases on a set of cc1-i files. Most of the checking should be done only with non-release checks enabled. Bootstrapped and tested on powerpc64-unknown-linux-gnu. OK for trunk? Ciao! Steven dom_checking_assert.diff Description: Binary data
Re: [Patch, Fortran] PR54234 - Add -Wconversion warning for CMPLX(dp,dp)
On 14/08/2012 11:33, Tobias Burnus wrote: This patch adds a -Wconversion warning (enabled also by -Wall) for CMPLX(real, real) if the real arguments have a higher kind number/precision than the default kind of complex/real. I think most of the time, this precision loss is unintended; it can be silenced by using a kind= parameter (or -Wno-conversion). However, if you believe that the warning is not suitable for -Wall, we can also hide it by only enabling it with the talkative -Wconversion-extra flag. Built and regtested on x86-64-linux. OK for the trunk? Tobias Yes, thanks Mikael
Re: [patch] Use gcc_checking_assert in dominance.c
On Tue, Aug 14, 2012 at 12:03 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, Checking overhead in dominance.c gives measurable compile time increases on a set of cc1-i files. Most of the checking should be done only with non-release checks enabled. Bootstrapped and tested on powerpc64-unknown-linux-gnu. OK for trunk? Ok with combining asserts into one here + gcc_checking_assert (dir == CDI_DOMINATORS); + gcc_checking_assert (dom_computed[dir_index]); and here + gcc_checking_assert (dom_computed[dir_index]); + gcc_checking_assert (!bb->dom[dir_index]); Thanks, Richard.
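What "combining asserts into one" might look like can be sketched as follows. This is an illustration under stated assumptions, not the committed change: `toy_checking_assert` is only modeled on `gcc_checking_assert` (active when `ENABLE_CHECKING` is defined, otherwise parsed but never evaluated), and `toy_direction`, `toy_dom_computed` and `toy_dominators_usable_p` are hypothetical stand-ins for the state in dominance.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Modeled on gcc_checking_assert: active only when ENABLE_CHECKING is
   defined; otherwise the expression is type-checked but never evaluated.  */
#ifdef ENABLE_CHECKING
#define toy_checking_assert(EXPR) assert (EXPR)
#else
#define toy_checking_assert(EXPR) ((void) (0 && (EXPR)))
#endif

/* Hypothetical stand-ins for the dominance bookkeeping.  */
enum toy_direction { TOY_DOMINATORS = 0, TOY_POST_DOMINATORS = 1 };
static bool toy_dom_computed[2];

static bool
toy_dominators_usable_p (enum toy_direction dir)
{
  unsigned dir_index = (unsigned) dir;

  /* One combined check instead of two back-to-back asserts.  */
  toy_checking_assert (dir == TOY_DOMINATORS && toy_dom_computed[dir_index]);
  return toy_dom_computed[dir_index];
}
```

The trade-off is that a failing combined assert no longer says which of the two conditions was violated; in exchange there is a single check in a release-checking build and none at all otherwise.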
[patch] Verify loop fathers
Hello, Verifying loop fathers now passes on powerpc64-unknown-linux-gnu. Also speed up fix_loop_structure by requiring DOM_OK so that fast DOM queries are available. Bootstrapped and tested on powerpc64-unknown-linux-gnu. OK for trunk? Ciao! Steven cfg_loop_father.diff Description: Binary data
Re: [PATCH] Remove basic_block->loop_depth
Richard Guenther wrote: Accessing loop_depth (bb->loop_father) isn't very expensive. The following removes the duplicate info in basic-blocks which is not properly kept up-to-date at the moment. Looks like this broke the SPU build, since spu_machine_dependent_reorg accesses ->loop_depth. According to comments in the code, this was done because of concerns that loop_father may no longer be set up this late in compilation, so I'm wondering whether just replacing this by loop_depth (bb->loop_father) would work here ... /* If this branch is a loop exit then propagate to previous fallthru block. This catches the cases when it is a simple loop or when there is an initial branch into the loop. */ if (prev && (loop_exit || simple_loop) && prev->loop_depth <= bb->loop_depth) prop = prev; /* If there is only one adjacent predecessor. Don't propagate outside this loop. This loop_depth test isn't perfect, but I'm not sure the loop_father member is valid at this point. */ else if (prev && single_pred_p (bb) && prev->loop_depth == bb->loop_depth) prop = prev; Any suggestions? Thanks, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: [Patch, Fortran] PR40881 - Add two F95 obsolescence warnings
On 08/14/2012 11:57 AM, Mikael Morin wrote: On 14/08/2012 11:33, Tobias Burnus wrote: Thus, I removed ST_LABEL_ENDDO_TARGET, use =type and added a comment, but I didn't do the verify_st_order change. Built and regtested on x86-64-linux. OK for the trunk? OK, apart from: * gfortran.dg/data_constraints_1.f90: Update dg-warning. I don't see the need for the change, the ChangeLog doesn't match the patch, and it is different from the initial version. A forgotten local edit? No, just the wrong (?) solution to a real issue. The -pedantic flag causes the obsolescent warning. To avoid it, one can either remove the allocate or use ! { dg-options } I think the latter is cleaner. I will commit the patch with the latter and a fixed ChangeLog. Tobias
Re: [patch] Verify loop fathers
On Tue, Aug 14, 2012 at 12:07 PM, Steven Bosscher stevenb@gmail.com wrote: Hello, Verifying loop fathers now passes on powerpc64-unknown-linux-gnu. Also speed up fix_loop_structure by requiring DOM_OK so that fast DOM queries are available. Bootstrapped and tested on powerpc64-unknown-linux-gnu. OK for trunk? Ok. Thanks, Richard. Ciao! Steven
Re: [PATCH] Remove basic_block->loop_depth
On Tue, 14 Aug 2012, Ulrich Weigand wrote: Richard Guenther wrote: Accessing loop_depth (bb->loop_father) isn't very expensive. The following removes the duplicate info in basic-blocks which is not properly kept up-to-date at the moment. Looks like this broke the SPU build, since spu_machine_dependent_reorg accesses ->loop_depth. According to comments in the code, this was done because of concerns that loop_father may no longer be set up this late in compilation, so I'm wondering whether just replacing this by loop_depth (bb->loop_father) would work here ... Well, if loops are no longer set up (thus ->loop_father is NULL) then the loop_depth information was stale and possibly wrong. /* If this branch is a loop exit then propagate to previous fallthru block. This catches the cases when it is a simple loop or when there is an initial branch into the loop. */ if (prev && (loop_exit || simple_loop) && prev->loop_depth <= bb->loop_depth) prop = prev; /* If there is only one adjacent predecessor. Don't propagate outside this loop. This loop_depth test isn't perfect, but I'm not sure the loop_father member is valid at this point. */ else if (prev && single_pred_p (bb) && prev->loop_depth == bb->loop_depth) prop = prev; Any suggestions? If SPU md reorg would like to look at loop structures it should compute them. Simply call flow_loops_find, which hopefully works in CFG RTL mode (which I think is the mode available from md reorg?). I was simply throwing away loops after RTL loop optimizers not only because IRA for some weird reason decides to re-compute them in non-standard ways and because loop verification fails between ira / reload passes. So the other way would be to preserve loops for a longer period. Richard.
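The replacement being discussed, deriving the depth from the loop a block belongs to instead of caching it in the block, can be sketched with toy types. Everything below is hypothetical (`toy_loop`, `toy_bb`, `toy_loop_depth`, `toy_bb_loop_depth`, `same_depth_p`); in particular the NULL guard is an assumption added for this sketch to model the "loops are no longer set up" case, and is not something GCC's own loop_depth is claimed to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy loop tree: the root (function body) has outer == NULL.  */
struct toy_loop
{
  struct toy_loop *outer;
};

/* Toy basic block: no cached depth, only a pointer to its loop.  */
struct toy_bb
{
  struct toy_loop *loop_father;   /* NULL once loop info is discarded */
};

/* Depth is the number of enclosing loops below the root.  A NULL loop
   is treated as depth 0; that guard is an assumption of this sketch.  */
static unsigned
toy_loop_depth (const struct toy_loop *loop)
{
  unsigned depth = 0;
  for (; loop != NULL && loop->outer != NULL; loop = loop->outer)
    depth++;
  return depth;
}

static unsigned
toy_bb_loop_depth (const struct toy_bb *bb)
{
  assert (bb != NULL);
  return toy_loop_depth (bb->loop_father);
}

/* The shape of the second SPU test, prev->loop_depth == bb->loop_depth,
   rewritten to go through loop_father.  */
static bool
same_depth_p (const struct toy_bb *prev, const struct toy_bb *bb)
{
  return toy_bb_loop_depth (prev) == toy_bb_loop_depth (bb);
}
```

Note what the guard hides: once loop information has been thrown away, every block reports depth 0 and the comparison is trivially true, which is the stale-information problem raised in the reply above. Recomputing loops before querying them avoids that.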
Re: [PATCH] Remove basic_block->loop_depth
On Tue, Aug 14, 2012 at 12:48 PM, Richard Guenther rguent...@suse.de wrote: If SPU md reorg would like to look at loop structures it should compute them. Simply call flow_loops_find, which hopefully works in CFG RTL mode (which I think is the mode available from md reorg?). No, the CFG is destroyed just before MD reorg, because most MD reorgs are not CFG aware. But some back-ends re-surrect it (bfin, ia64, c6x, etc...), and do compute loops (c6x at least). Ciao! Steven
Re: [PATCH] Remove basic_block->loop_depth
On Tue, 14 Aug 2012, Steven Bosscher wrote: On Tue, Aug 14, 2012 at 12:48 PM, Richard Guenther rguent...@suse.de wrote: If SPU md reorg would like to look at loop structures it should compute them. Simply call flow_loops_find, which hopefully works in CFG RTL mode (which I think is the mode available from md reorg?). No, the CFG is destroyed just before MD reorg, because most MD reorgs are not CFG aware. But some back-ends re-surrect it (bfin, ia64, c6x, etc...), and do compute loops (c6x at least). I suppose we could push down freeing the CFG to the individual MD reorgs and make the default MD reorg simply free the CFG ... Richard.
Re: [PATCH] Add update-ssa verification code
On Tue, 14 Aug 2012, Richard Guenther wrote: This adds verification code that we do not try to rewrite a symbol into SSA form that is already partly in SSA form. That would lead to silent wrong-code generation. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Ick, I've tested a different patch. Fix, bootstrapped and tested, installed. Richard. 2012-08-14 Richard Guenther rguent...@suse.de * tree-into-ssa.c: Include diagnostic-core.h. * Makefile.in (tree-into-ssa.o): Adjust. Index: gcc/tree-into-ssa.c === --- gcc/tree-into-ssa.c (revision 190379) +++ gcc/tree-into-ssa.c (working copy) @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. #include "domwalk.h" #include "params.h" #include "vecprim.h" +#include "diagnostic-core.h" /* This file builds the SSA form for a function as described in: Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 190379) +++ gcc/Makefile.in (working copy) @@ -2248,7 +2248,7 @@ tree-ssa.o : tree-ssa.c $(TREE_FLOW_H) $ $(GIMPLE_H) $(TREE_INLINE_H) $(TARGET_H) \ $(GIMPLE_PRETTY_PRINT_H) $(CFGLOOP_H) tree-into-ssa.o : tree-into-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ - $(TREE_H) $(TM_P_H) $(DIAGNOSTIC_H) \ + $(TREE_H) $(TM_P_H) $(DIAGNOSTIC_CORE_H) \ $(FUNCTION_H) $(TM_H) coretypes.h \ langhooks.h domwalk.h $(TREE_PASS_H) $(PARAMS_H) $(BASIC_BLOCK_H) \ $(BITMAP_H) $(CFGLOOP_H) $(FLAGS_H) $(HASHTAB_H) \
[PATCH][build] Fix PR54138, make --without-cloog work
This makes --without-cloog and --without-isl disable GRAPHITE support as intended. Tested up to building stage2 with --without-isl, verified ISL was not used or checked for, tested up to building stage2 without --without-isl, verified system ISL was picked up. Ok for trunk? Thanks, Richard. 2012-08-14 Richard Guenther rguent...@suse.de PR bootstrap/54138 * configure.ac: Re-organize ISL / CLOOG checks to allow disabling with either --without-isl or --without-cloog. * configure: Regenerated. * config/cloog.m4: Adjust. * config/isl.m4: Adjust. Index: configure.ac === *** configure.ac(revision 190376) --- configure.ac(working copy) *** AC_ARG_WITH(boot-ldflags, *** 1520,1563 fi]) AC_SUBST(poststage1_ldflags) ! # Check for ISL ! dnl Provide configure switches and initialize islinc isllibs ! dnl with user input. ! ISL_INIT_FLAGS ! if test x$with_isl != xno; then dnl The minimal version of ISL required for Graphite. ISL_CHECK_VERSION(0,10) - dnl Only execute fail-action, if ISL has been requested. ISL_IF_FAILED([ AC_MSG_ERROR([Unable to find a usable ISL. See config.log for details.])]) - fi ! # Check for CLOOG ! dnl Provide configure switches and initialize clooginc clooglibs ! dnl with user input. ! CLOOG_INIT_FLAGS ! if test x$isllibs = x test x$islinc = x; then ! clooglibs= ! clooginc= ! elif test x$with_cloog != xno; then ! dnl The minimal version of CLooG required for Graphite. ! dnl ! dnl If we use CLooG-Legacy, the provided version information is ! dnl ignored. ! CLOOG_CHECK_VERSION(0,17,0) ! ! dnl Only execute fail-action, if CLooG has been requested. ! CLOOG_IF_FAILED([ ! AC_MSG_ERROR([Unable to find a usable CLooG. See config.log for details.])]) fi # If either the ISL or the CLooG check failed, disable builds of in-tree # variants of both ! if test x$clooglibs = x test x$clooginc = x; then noconfigdirs=$noconfigdirs cloog isl fi # Check for LTO support. 
AC_ARG_ENABLE(lto, [AS_HELP_STRING([--enable-lto], [enable link time optimization support])], --- 1520,1590 fi]) AC_SUBST(poststage1_ldflags) ! # GCC GRAPHITE dependences, ISL and CLOOG which in turn requires ISL. ! # Basic setup is inlined here, actual checks are in config/cloog.m4 and ! # config/isl.m4 ! ! AC_ARG_WITH(cloog, ! [AS_HELP_STRING( ! [--with-cloog=PATH], ! [Specify prefix directory for the installed CLooG-ISL package. ! Equivalent to --with-cloog-include=PATH/include ! plus --with-cloog-lib=PATH/lib])]) ! AC_ARG_WITH(isl, ! [AS_HELP_STRING( ![--with-isl=PATH], ![Specify prefix directory for the installed ISL package. ! Equivalent to --with-isl-include=PATH/include ! plus --with-isl-lib=PATH/lib])]) ! ! # Treat either --without-cloog or --without-isl as a request to disable ! # GRAPHITE support and skip all following checks. ! if test x$with_isl != xno !test x$with_cloog != xno; then ! # Check for ISL ! dnl Provide configure switches and initialize islinc isllibs ! dnl with user input. ! ISL_INIT_FLAGS dnl The minimal version of ISL required for Graphite. ISL_CHECK_VERSION(0,10) dnl Only execute fail-action, if ISL has been requested. ISL_IF_FAILED([ AC_MSG_ERROR([Unable to find a usable ISL. See config.log for details.])]) ! if test x$gcc_cv_isl != xno; then ! # Check for CLOOG ! dnl Provide configure switches and initialize clooginc clooglibs ! dnl with user input. ! CLOOG_INIT_FLAGS ! dnl The minimal version of CLooG required for Graphite. ! dnl ! dnl If we use CLooG-Legacy, the provided version information is ! dnl ignored. ! CLOOG_CHECK_VERSION(0,17,0) ! ! dnl Only execute fail-action, if CLooG has been requested. ! CLOOG_IF_FAILED([ ! AC_MSG_ERROR([Unable to find a usable CLooG. See config.log for details.])]) ! fi fi # If either the ISL or the CLooG check failed, disable builds of in-tree # variants of both ! 
if test x$with_isl == xno || !test x$with_cloog == xno || !test x$gcc_cv_cloog = xno || !test x$gcc_cv_isl = xno; then noconfigdirs=$noconfigdirs cloog isl + islinc= + clooginc= + clooglibs= fi + AC_SUBST(islinc) + AC_SUBST(clooglibs) + AC_SUBST(clooginc) + + # Check for LTO support. AC_ARG_ENABLE(lto, [AS_HELP_STRING([--enable-lto], [enable link time optimization support])], Index: config/isl.m4 === *** config/isl.m4 (revision 190376) --- config/isl.m4 (working copy) *** *** 23,34 # Initialize isllibs/islinc according to the user input. AC_DEFUN([ISL_INIT_FLAGS], [ - AC_ARG_WITH(isl, - [AS_HELP_STRING( - [--with-isl=PATH], - [Specify prefix directory for the installed ISL package.
Re: [SH] PR 50751 - Add support for SH2A movu.b and movu.w insns
Oleg Endo oleg.e...@t-online.de wrote: This adds support for the SH2A instructions movu.b and movu.w for zero-extending mem loads with displacement addressing. Tested on rev 190332 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. OK? OK. Regards, kaz
Re: [SH] PR 52933 - Use div0s insn for integer sign comparisons
Oleg Endo oleg.e...@t-online.de wrote: This patch adds basic support for utilizing the SH div0s instruction to simplify some integer sign comparisons such as '(a < 0) == (b < 0)'. Tested on rev 190332 with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} and no new failures. OK? OK. Regards, kaz
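Why a single sign-combining instruction can stand in for a comparison like `(a < 0) == (b < 0)` is easy to check in plain C. The sketch below is not taken from the patch: the function names are hypothetical, and `div0s_t_bit` only models the T bit as the XOR of the two operands' most significant bits, which is how div0s is documented to set it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The comparison as written in source code.  */
static bool
same_sign_compare (int32_t a, int32_t b)
{
  return (a < 0) == (b < 0);
}

/* Equivalent form: the sign bit of a ^ b is clear exactly when the
   sign bits of a and b agree.  */
static bool
same_sign_xor (int32_t a, int32_t b)
{
  return (a ^ b) >= 0;
}

/* Model of the T bit after div0s: MSB (a) XOR MSB (b).  */
static unsigned
div0s_t_bit (int32_t a, int32_t b)
{
  return (unsigned) (((uint32_t) a >> 31) ^ ((uint32_t) b >> 31));
}

/* Assert that all three formulations agree for one pair of inputs.  */
static void
check_pair (int32_t a, int32_t b)
{
  assert (same_sign_compare (a, b) == same_sign_xor (a, b));
  assert (div0s_t_bit (a, b) == (same_sign_compare (a, b) ? 0u : 1u));
}
```

So the T bit is 1 when the signs differ and 0 when they match, which means the "signs equal" form needs the inverted T bit; how the patch itself handles that inversion is not shown in this message.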
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
Hi Richard, These changes try to resolve the known problem with the first instruction scheduler for x86 platform. The main issue is the existence of hardware registers that are used for unloading of function arguments passing in HW registers and for passing function arguments in HW registers. 1. Unloading HW function argument registers. To prevent hoisting of instructions with virtual registers having bigger priority before unloading function arguments that may lead to a register spill problem in Register Allocation phase (all HW registers are busy) we set up the max priority all moves from function argument HW registers through ix86_adjust_priority hook. It means that all such instructions will be scheduled at the beginning of function. 2. Passing function arguments in HW registers. The main problem here is that backward copy propagation phase (aka combine instructions) can propagate HW argument registers to instructions evaluating argument values (e.g. issue#46829). To resolve this problem I decided to preserve an order of instructions writing to HW function argument registers through additional output dependencies between two adjacent instructions (ix86_dependencies_evaluation_hook). Hope my short explanation will help you to review my changes. Best regards. Yuri. 2012/8/14 Richard Guenther richard.guent...@gmail.com: On Mon, Aug 13, 2012 at 9:39 PM, Igor Zamyatin izamya...@gmail.com wrote: Hi all! Patch aims to fix instability introduced by first scheduler on x86. In particular it targets following list: [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46843 [2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46829 [3] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36680 [4] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42295 Main idea of this activity is mostly to provide user a possibility to safely turn on first scheduler for his codes. In some cases this could positively affect performance, especially for in-order Atom. 
It would be great to hear some feedback from the community about the change. Maybe you can elaborate on this change? It's hard to reverse engineer what you try to do from the patch alone. Richard. Thanks in advance, Igor
Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
On 08/14/12 08:30:59, Jakub Jelinek wrote: On Mon, Aug 13, 2012 at 09:20:32PM -0700, Gary Funck wrote: --- gcc/testsuite/gcc.dg/pr20020-1.c(revision 0) +++ gcc/testsuite/gcc.dg/pr20020-1.c(revision 0) @@ -0,0 +1,25 @@ +/* Target is restricted to x86_64 type architectures, + to check that 128-bit struct's are represented + as TImode values. */ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile { target { x86_64-*-* } } } */ Given this all the testcases should go into gcc/testsuite/gcc.target/i386/ OK. Note: It might be possible to leave only dg-require-effective-target int128 and use this as a regression test for other targets, such as PPC64, S390, and IA64. However, I was uncertain if the RTL would be similar enough, and know that in at least one case the RTL scan would have to be adjusted. Also, I don't have access to a S390. If there is interest in generalizing the test, let me know. Otherwise, I'll move the tests to gcc.target/i386. - Gary
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 1:51 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: 2. Passing function arguments in HW registers. The main problem here is that backward copy propagation phase (aka combine instructions) can propagate HW argument registers to instructions evaluating argument values (e.g. issue#46829). To resolve this problem I decided to preserve an order of instructions writing to HW function argument registers through additional output dependencies between two adjacent instructions (ix86_dependencies_evaluation_hook). Looking a bit deeper into PR46829 problem, it is actually the output of the divide instruction that gets combined with _called_ function argument (r8), before the call to bar function. So, this is similar, but separate issue from the propagation of function arguments into insn inputs. In any case, short of disabling propagation of hard registers, recog_for_combine should somehow check if the insn that combines hard regs is still valid. Uros.
Re: LEA-splitting improvement patch.
Hi Uros, Thanks a lot for your comments. I prepared a new patch and ChangeLog. Testing of x32 is in progress. Is it OK for trunk? 2012-08-14 Yuri Rumyantsev ysrum...@gmail.com * config/i386/i386-protos.h (ix86_split_lea_for_addr) : Add additional argument. * config/i386/i386.md (ix86_split_lea_for_addr) : Add additional argument curr_insn. * config/i386/i386.c (ix86_split_lea_for_addr): Do instructions reordering to get opportunities for better scheduling. (ix86_lea_outperforms): Do more aggressive lea splitting. (find_nearest_reg_def): New function. Find nearest register definition used in address. 2012/8/14 Uros Bizjak ubiz...@gmail.com: Hello! It is known that LEA splitting is one of the most critical problems for Atom processors and changes try to improve it through: 1. More aggressive LEA splitting – do not perform splitting if only split cost exceeds AGU stall. 2. Reordering splitting instructions to get better scheduling – use the farthest defined register for SET instruction, then add constant offset if any and finally generate add instruction. This gives +0.5% speedup in geomean for eembc2.0 suite on Atom. All required testing was done – bootstraps for Atom Core2, make check. Note that this fix affects only Atom processors. IMO, you should test LEA handling changes on x32 atom, too. With recent changes, you will get lots of zero_extended addresses through these functions, so I think it is worth benchmarking on the x32 target. ChangeLog: 2012-08-08 Yuri Rumyantsev yuri.s.rumyant...@intel.com * config/i386/i386-protos.h (ix86_split_lea_for_addr) : Add additional argument. * config/i386/i386.md (ix86_split_lea_for_addr) : Add additional argument curr_insn. * config/i386/i386.c (find_nearest_reg_def): New function. Find nearest register definition used in address. (find_nearest_reg_def) (ix86_split_lea_for_addr) : Do more aggressive lea splitting and instructions reordering to get opportunities for better scheduling.
Please merge entries for ix86_split_lea_for_addr. @@ -16954,7 +16954,7 @@ ix86_lea_outperforms (rtx insn, unsigned int regno0, unsigned int regno1, /* If there is no use in memory addess then we just check that split cost exceeds AGU stall. */ if (dist_use 0) -return dist_define = LEA_MAX_STALL; +return dist_define LEA_MAX_STALL; You didn't described this change in ChangeLog. Does this affect also affect benchmark speedup? +/* Return 0 if regno1 def is nearest to insn and 1 otherwise. */ Watch comment formatting and vertical spaces! +static int +find_nearest_reg_def (rtx insn, int regno1, int regno2) IMO, you can return 0, 1 and 2; with 0 when no definitions are found in the BB, 1 and 2 when either of two regnos are found. Otherwise, please use bool for function type. + if (insn_defines_reg (regno1, regno1, prev)) + return 0; + else if (insn_defines_reg (regno2, regno2, prev)) Please use INVALID_REGNUM as the second argument in the call to insn_defines_reg when looking for only one regno definition. { - emit_insn (gen_rtx_SET (VOIDmode, target, parts.base)); - tmp = parts.index; + rtx tmp1; + /* Try to give more opportunities to scheduler - + choose operand for move instruction with longer + distance from its definition to insn. */ (Hm, I don't think you mean gcc insn scheduler here.) + if (find_nearest_reg_def (insn, regno1, regno2) == 0) +{ + tmp = parts.index; /* choose index for move. */ + tmp1 = parts.base; +} + else + { + tmp = parts.base; + tmp1 = parts.index; + } + emit_insn (gen_rtx_SET (VOIDmode, target, tmp)); + if (parts.disp parts.disp != const0_rtx) +ix86_emit_binop (PLUS, mode, target, parts.disp); + ix86_emit_binop (PLUS, mode, target, tmp1); + return; } (Please use tabs instead of spaces in the added code.) 
However, this whole new part can be written simply as following (untested): { rtx tmp1; if (find_nearest_reg_def (insn, regno1, regno2) == 0) tmp1 = parts.base, tmp = parts.index; else tmp1 = parts.index, tmp = parts.base; emit_insn (gen_rtx_SET (VOIDmode, target, tmp1); } Please see how tmp is handled further down in the function. - ix86_emit_binop (PLUS, mode, target, tmp); + ix86_emit_binop (PLUS, mode, target, tmp); Please watch accidental whitespace changes. Uros. lea_split_improve.diff Description: Binary data
Re: [Patch, fortran] PR 47586 Missing deep copy when assigning from a function returning a pointer.
On 08/13/2012 04:32 PM, Mikael Morin wrote: here is a fix for PR47586: missing deep copy for the case: dt_w_alloc = ptr_func(arg) The patch set looks okay. I am not 100% sure how compatible your changes are with regards to finalization and coarray components, but I have the impression they don't make things worse. Regarding the comment: A data-pointer-returning function should be considered as a variable too. That's actually true according to the standard, which since F2008 allows: f() = 7 where f() returns a pointer. Well, gfortran doesn't support this yet and there are also some issues if the LHS is an operator expression (see current interpretation request discussions), but I thought I'd mention it for completeness. Thanks for the patch! - And for the thorough patch reviews! Tobias
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 2:02 PM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Aug 14, 2012 at 1:51 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: 2. Passing function arguments in HW registers. The main problem here is that backward copy propagation phase (aka combine instructions) can propagate HW argument registers to instructions evaluating argument values (e.g. issue#46829). To resolve this problem I decided to preserve an order of instructions writing to HW function argument registers through additional output dependencies between two adjacent instructions (ix86_dependencies_evaluation_hook). Looking a bit deeper into PR46829 problem, it is actually the output of the divide instruction that gets combined with _called_ function argument (r8), before the call to bar function. So, this is similar, but separate issue from the propagation of function arguments into insn inputs. In any case, short of disabling propagation of hard registers, recog_for_combine should somehow check if the insn that combines hard regs is still valid. Yes, that sounds reasonable. Richard. Uros.
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 02:40:42PM +0200, Richard Guenther wrote: On Tue, Aug 14, 2012 at 2:02 PM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Aug 14, 2012 at 1:51 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: 2. Passing function arguments in HW registers. The main problem here is that backward copy propagation phase (aka combine instructions) can propagate HW argument registers to instructions evaluating argument values (e.g. issue#46829). To resolve this problem I decided to preserve an order of instructions writing to HW function argument registers through additional output dependencies between two adjacent instructions (ix86_dependencies_evaluation_hook). Looking a bit deeper into PR46829 problem, it is actually the output of the divide instruction that gets combined with _called_ function argument (r8), before the call to bar function. So, this is similar, but separate issue from the propagation of function arguments into insn inputs. In any case, short of disabling propagation of hard registers, recog_for_combine should somehow check if the insn that combines hard regs is still valid. Yes, that sounds reasonable. What kind of checks do you have in mind though? Combiner already tries to recog the insn, which involves testing all the predicates on the insn. What else it could do? You mean that for hard registers it should check constraints too? Jakub
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 2:45 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Aug 14, 2012 at 02:40:42PM +0200, Richard Guenther wrote: On Tue, Aug 14, 2012 at 2:02 PM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Aug 14, 2012 at 1:51 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: 2. Passing function arguments in HW registers. The main problem here is that backward copy propagation phase (aka combine instructions) can propagate HW argument registers to instructions evaluating argument values (e.g. issue#46829). To resolve this problem I decided to preserve an order of instructions writing to HW function argument registers through additional output dependencies between two adjacent instructions (ix86_dependencies_evaluation_hook). Looking a bit deeper into PR46829 problem, it is actually the output of the divide instruction that gets combined with _called_ function argument (r8), before the call to bar function. So, this is similar, but separate issue from the propagation of function arguments into insn inputs. In any case, short of disabling propagation of hard registers, recog_for_combine should somehow check if the insn that combines hard regs is still valid. Yes, that sounds reasonable. What kind of checks do you have in mind though? Combiner already tries to recog the insn, which involves testing all the predicates on the insn. What else it could do? You mean that for hard registers it should check constraints too? Yes. Do the same checks reload would do. Richard. Jakub
Re: [PATCH][RFC] Fixing instability of -fschedule-insns for x86
On Tue, Aug 14, 2012 at 2:45 PM, Jakub Jelinek ja...@redhat.com wrote: 2. Passing function arguments in HW registers. The main problem here is that backward copy propagation phase (aka combine instructions) can propagate HW argument registers to instructions evaluating argument values (e.g. issue#46829). To resolve this problem I decided to preserve an order of instructions writing to HW function argument registers through additional output dependencies between two adjacent instructions (ix86_dependencies_evaluation_hook). Looking a bit deeper into PR46829 problem, it is actually the output of the divide instruction that gets combined with _called_ function argument (r8), before the call to bar function. So, this is similar, but separate issue from the propagation of function arguments into insn inputs. In any case, short of disabling propagation of hard registers, recog_for_combine should somehow check if the insn that combines hard regs is still valid. Yes, that sounds reasonable. What kind of checks do you have in mind though? Combiner already tries to recog the insn, which involves testing all the predicates on the insn. What else it could do? You mean that for hard registers it should check constraints too? Yes, with extract_insn perhaps. Uros.
Re: Merge C++ conversion into trunk (3/6 - gengtype C++ support)
On 12-08-14 01:39 , Laurynas Biveinis wrote: (walk_type): Set D-IN_PTR_FILED when walking a TYPE_POINTER. FIELD Done. +fields is completely handled by user-provided routines. Section +@ref{User GC} for details on what functions need to be provided. See Section ... ? Done. +code is generated. For these types, the user is required to provide +three functions: one to act as a marker for garbage collection, and +two functions to act as marker and pointer walking for pre-compiled +headers. s/walking/walker ? Done. +In general, each marker @code{M} should call @code{M} for every +pointer field in the structure. Fields that are not allocated in GC +or are not pointers can be ignored. must be ignored Done. +create_user_defined_type (const char *type_name, struct fileloc *pos) ... + template by preteding that each type is a field of TY. This is needed to pretending Done. @@ -548,20 +603,30 @@ resolve_typedef (const char *s, struct fileloc *pos) for (p = typedefs; p != NULL; p = p-next) if (strcmp (p-name, s) == 0) return p-type; - error_at_line (pos, unidentified type `%s', s); - return scalar_nonchar; /* treat as int */ + + /* If we did not find a typedef registered, assume this is a name + for a user-defined type which will need to provide its own + marking functions. + + FIXME cxx-conversion. Emit an error once explicit annotations + for marking user types are implemented. */ + return create_user_defined_type (s, pos); Are explicit annotations for marking user types referring to GTY((user))? Actually, the comment is wrong. When we don't recognize the type, we simply consider this type as implicitly user-defined. +static const char * +filter_type_name (const char *type_name) +{ Maybe this function should return const-less char *? The casts to cast the const away for freeing it look a bit awkward. Done. + +/* User-callable entry point for marking string X. */ points Done. + +/* User-callable entry point for marking string X. */ points Done. 
--- a/gcc/stringpool.c +++ b/gcc/stringpool.c @@ -49,7 +49,7 @@ static const char digit_vector[] = { struct ht *ident_hash; -static hashnode alloc_node (hash_table *); +static hashnode alloc_node (cpp_hash_table *); static int mark_ident (struct cpp_reader *, hashnode, const void *); static void * @@ -70,7 +70,7 @@ init_stringpool (void) /* Allocate a hash node. */ static hashnode -alloc_node (hash_table *table ATTRIBUTE_UNUSED) +alloc_node (cpp_hash_table *table ATTRIBUTE_UNUSED) { return GCC_IDENT_TO_HT_IDENT (make_node (IDENTIFIER_NODE)); } These changes are not in the ChangeLog, are they intentional? Sorry. They belong to the hash table patch. Both patches touched the same file and I missed splitting these hunks. diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c +/* Garbage collection support for edge_def. */ + +extern void gt_ggc_mx (tree); +extern void gt_ggc_mx (gimple); +extern void gt_ggc_mx (rtx); +extern void gt_ggc_mx (basic_block); +/* PCH support for edge_def. */ + +extern void gt_pch_nx (tree); +extern void gt_pch_nx (gimple); +extern void gt_pch_nx (rtx); +extern void gt_pch_nx (basic_block); I wonder if these externs can be avoided by including gtype-desc.h. I realize that gtype-desc.h declares a lot of stuff, but if tree-cfg.c already declares GC roots, then it already should be pulling that header in in through gt-tree-cfg.h. Not really. These are never really defined anywhere. They are emitted in gtype-desc.c, but we never emit prototypes for them. Going down that road ends up in turning gengtype upside-down. Diego.
Re: Merge C++ conversion into trunk (3/6 - gengtype C++ support)
On 12-08-14 04:38 , Dodji Seketeli wrote:

Hello Diego, Just some minor comments.

Diego Novillo dnovi...@google.com wrote: [...]

+@section User-provided marking routines for template types
+When a template type @code{TP} is marked with @code{GTY}, all
+instances of that type are considered user-provided types. This means
+that the individual instances of @code{TP} do not need to marked with

s/to marked/to be marked/

Done.

+@code{GTY}. The user needs to provide template functions to mark all
+the fields of the type.
+
+The following code snippets represent all the functions that need to
+be provided. Note that type @code{TP} may reference to more than one
+type. In these snippets, there is only one type @code{T}, but there
+could be more.
+
+@smallexample
+template<typename T>
+void gt_ggc_mx (TP<T> *tp)
+@{

Just for my education, for the marking routines in general, why have the parameter tp be a pointer rather than TP<T> tp?

It would be better, yes. But since gengtype does not generate files that include the proper headers, we can't. We can get away with forward declaring the struct names and passing pointers around, but the minute you add a type reference, all hell breaks loose.

Diego.
Re: LEA-splitting improvement patch.
Uros,

Let me try to explain why I used such code duplication. Here we have the common case of an LEA with three different registers, r0 (target), r1 (base) and r2 (index), and a possible offset. To get better scheduling, we first use find_nearest_reg_def to determine which register, r1 or r2, is preferable for the initial setting. Then we generate the following sequence of instructions:

r0 = r_best;
r0 = $const, r0
r0 = r_worse, r0

This can save 2 cycles on Atom, since the first 2 instructions can be hoisted up. I could not find a better way of coding it.

Below is the modified ChangeLog.

2012-08-14 Yuri Rumyantsev ysrum...@gmail.com

* config/i386/i386-protos.h (ix86_split_lea_for_addr): Add additional argument.
* config/i386/i386.md (ix86_split_lea_for_addr): Add additional argument curr_insn.
* config/i386/i386.c (ix86_split_lea_for_addr): Do instruction reordering to get opportunities for better scheduling.
(ix86_lea_outperforms): Prefer LEA only if split cost exceeds AGU stall.
(find_nearest_reg_def): New function. Find nearest register definition used in address.

2012/8/14 Uros Bizjak ubiz...@gmail.com:

On Tue, Aug 14, 2012 at 2:28 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:

Thanks a lot for your comments. I prepared a new patch and ChangeLog. Testing of x32 is in progress. Is it OK for trunk?

2012-08-14 Yuri Rumyantsev ysrum...@gmail.com

* config/i386/i386-protos.h (ix86_split_lea_for_addr): Add additional argument.
* config/i386/i386.md (ix86_split_lea_for_addr): Add additional argument curr_insn.
* config/i386/i386.c (ix86_split_lea_for_addr): Do instruction reordering to get opportunities for better scheduling.
(ix86_lea_outperforms): Do more aggressive lea splitting.

You are not doing splitting in ix86_lea_outperforms.

(find_nearest_reg-def): New function. Find nearest register definition used in address.

Just say: (find_nearest_reg_def): New function.
+      emit_insn (gen_rtx_SET (VOIDmode, target, tmp));
+      if (parts.disp && parts.disp != const0_rtx)
+        ix86_emit_binop (PLUS, mode, target, parts.disp);
+      ix86_emit_binop (PLUS, mode, target, tmp1);
+      return;

Can you explain why you have to duplicate this code? Here you generate the same sequence as in the code below. Use tmp and tmp1 in a way that fits the existing code.

Uros.
[PATCH][RFC] Fix PR54201, share constant pool entries for CONST_VECTORs
This implements constant pool entry sharing for CONST_VECTORs with the same bit-pattern by canonicalizing them to the same-sized mode with the least number of elements. Ideally we would be able to hash and compare the in-memory representation of a constant (together with its alignment requirement), but I'm not aware of any RTL facility that would be equivalent to native_{interpret,encode}_expr. So this just handles CONST_VECTORs as requested in the bugreport. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Any comments? It would be interesting to pursue a way to share constant parts (beware of endian and odd alignment issues) by means of simply inserting more entries into the hashtable. Not sure if offsetted .LC references are valid as result of force_const_mem though (well, we could try to share the part at offset zero). Thanks, Richard. 2012-08-14 Richard Guenther rguent...@suse.de PR middle-end/54201 * varasm.c (force_const_mem): Canonicalize CONST_VECTORs to the same-sized mode with the least number of elements for the purpose of constant slot sharing. * gcc.target/i386/pr54201.c: New testcase. Index: gcc/varasm.c === *** gcc/varasm.c(revision 190381) --- gcc/varasm.c(working copy) *** force_const_mem (enum machine_mode mode, *** 3482,3489 { struct constant_descriptor_rtx *desc, tmp; struct rtx_constant_pool *pool; char label[256]; ! rtx def, symbol; hashval_t hash; unsigned int align; void **slot; --- 3482,3490 { struct constant_descriptor_rtx *desc, tmp; struct rtx_constant_pool *pool; + enum machine_mode orig_mode = mode; char label[256]; ! rtx def, symbol, res; hashval_t hash; unsigned int align; void **slot; *** force_const_mem (enum machine_mode mode, *** 3500,3505 --- 3501,3518 ? shared_constant_pool : crtl-varasm.pool); + /* Canonicalize CONST_VECTORs to the mode with the least number of + elements assuming that alignment requirements are not worse + for the original mode. 
*/ + if (GET_CODE (x) == CONST_VECTOR) + { + while (GET_MODE_SIZE (mode) +== GET_MODE_SIZE (GET_MODE_WIDER_MODE (mode))) + mode = GET_MODE_WIDER_MODE (mode); + x = simplify_subreg (mode, x, orig_mode, 0); + gcc_assert (x != NULL_RTX); + } + /* Lookup the value in the hashtable. */ tmp.constant = x; tmp.mode = mode; *** force_const_mem (enum machine_mode mode, *** 3509,3515 /* If the constant was already present, return its memory. */ if (desc) ! return copy_rtx (desc-mem); /* Otherwise, create a new descriptor. */ desc = ggc_alloc_constant_descriptor_rtx (); --- 3522,3532 /* If the constant was already present, return its memory. */ if (desc) ! { ! res = copy_rtx (desc-mem); ! PUT_MODE (res, orig_mode); ! return res; ! } /* Otherwise, create a new descriptor. */ desc = ggc_alloc_constant_descriptor_rtx (); *** force_const_mem (enum machine_mode mode, *** 3573,3579 if (GET_CODE (x) == LABEL_REF) LABEL_PRESERVE_P (XEXP (x, 0)) = 1; ! return copy_rtx (def); } /* Given a constant pool SYMBOL_REF, return the corresponding constant. */ --- 3590,3598 if (GET_CODE (x) == LABEL_REF) LABEL_PRESERVE_P (XEXP (x, 0)) = 1; ! res = copy_rtx (def); ! PUT_MODE (res, orig_mode); ! return res; } /* Given a constant pool SYMBOL_REF, return the corresponding constant. */ Index: gcc/testsuite/gcc.target/i386/pr54201.c === *** gcc/testsuite/gcc.target/i386/pr54201.c (revision 0) --- gcc/testsuite/gcc.target/i386/pr54201.c (working copy) *** *** 0 --- 1,15 + /* { dg-do compile } */ + /* { dg-options -O -msse2 } */ + + #include emmintrin.h + + __m128i test(__m128i value) + { + __m128i mask = _mm_set1_epi8(1); + return _mm_cmpeq_epi8(_mm_and_si128(value, mask), mask); + } + + /* We should share constant slots for V16QI { 1, ... 1 } and its V2DI +representation. */ + + /* { dg-final { scan-assembler-not LC1 } } */
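To make the premise of the patch concrete, here is a minimal standalone sketch of why the V16QI { 1, ... 1 } constant and its V2DI representation can share one pool slot: their in-memory images are byte-for-byte identical. This is plain C and not GCC internals; the helper names splat_bytes, same_pool_image and check_splat_of_ones are made up for illustration and appear nowhere in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill a 16-byte vector image with sixteen copies of BYTE, the way a
   V16QI splat constant would be laid out in memory.  (Hypothetical
   helper, not part of GCC.)  */
void
splat_bytes (uint8_t image[16], uint8_t byte)
{
  memset (image, byte, 16);
}

/* Return 1 if the V16QI image and the V2DI image occupy memory with
   identical bytes, so that one constant pool slot could serve both.  */
int
same_pool_image (const uint8_t v16qi[16], const uint64_t v2di[2])
{
  return memcmp (v16qi, v2di, 16) == 0;
}

/* Self-check for the case from the testcase: V16QI { 1, ... 1 }.  */
void
check_splat_of_ones (void)
{
  uint8_t v16qi[16];
  /* Every byte is 0x01, so this value does not depend on endianness.  */
  const uint64_t v2di[2] = { 0x0101010101010101ULL, 0x0101010101010101ULL };

  splat_bytes (v16qi, 1);
  assert (same_pool_image (v16qi, v2di));
}
```

Note that the comparison only says the two constants could live in the same slot; it says nothing about whether later passes can reuse a load done in a different mode.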
[wwwdocs] Update Fortran section in 4.8/changes.html
Attached is the first 4.8 merge of the Fortran related changes from wiki/Gfortran#news into the 4.8 release notes. I have committed the patch as obvious, however, I am happy for any comments. Possibly easier to read: http://gcc.gnu.org/gcc-4.8/changes.html (all in the Fortran section) Tobias PS: Still to do: Update the manual's status section and interop section for TYPE(*)/DIMENSION(..). Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.10 diff -u -p -r1.10 changes.html --- changes.html 10 Aug 2012 16:25:46 - 1.10 +++ changes.html 14 Aug 2012 13:52:34 - @@ -43,9 +43,7 @@ by this change./p /ul -!-- h2New Languages and Language specific improvements/h2 --- !-- h3Ada/h3 @@ -66,9 +64,62 @@ by this change./p h3C++/h3 -- -!-- h3 id=fortranFortran/h3 --- + ul +liThe codea +href=http://gcc.gnu.org/onlinedocs/gfortran/Error-and-Warning-Options.html; +-Wc-binding-type/a/code warning flag has been added (by default +disabled), which warns if the a variable might not be C interoperable. In +particular, if the variable has been declared using an intrinsic type with +default kind instead of using a kind parameter defined for C +interoperability in the intrinsic codeISO_C_Binding/code module. Before, +the warning was always printed./li + +liThe a +href=http://gcc.gnu.org/onlinedocs/gfortran/Error-and-Warning-Options.html; +code-Wrealloc-lhs/code/a and code-Wrealloc-lhs-all/code warning +flags have been added, which diagnose when code to is inserted for automatic +(re)allocation of a variable during assignment. The flag can be used to +decide whether it is safe to use codea +href=http://gcc.gnu.org/onlinedocs/gfortran/Code-Gen-Options.html; +-fno-realloc-lhs/a/code. Additionally, it can be used to find automatic +(re)allocation in hot loops. 
(For arrays, replacing qcodevar=/code/q +by qcodevar(:)=/code/q disables the automatic reallocation.)li + +liReading floating point numbers which use qcodeq/code/q for the +exponential (such as code4.0q0/code) is now supported as vendor +extension for better compatibility with old data files. It is strongly +recommended to use for I/O the equivalent but standard conforming +qcodee/code/q (such as code4.0e0/code). [For the Fortran +source code, consider replacing the qcodeq/code/q in +floating-point literals by a kind parameter (e.g. code4.0e0_qp/code +with a suitable codeqp/code). Note that ndash; in the Fortran +source code ndash; replacing qcodeq/code/q by a simple +qcodee/code/q is emnot/em equivalent.]/li + +liThe codeGFORTRAN_TMPDIR/code environment variable, for specifying +a non-default directory for files opened with codeSTATUS=SCRATCH/code, +is not used anymore. Instead gfortran checks the POSIX/GNU standard +codeTMPDIR/code environment variable. If codeTMPDIR/code is not +defined, gfortran falls back to other methods to determine the directory +for temporary files as documented in the +a href=http://gcc.gnu.org/onlinedocs/gfortran/TMPDIR.html;user +manual/a./li + +lia href=http://gcc.gnu.org/wiki/TS29113Status;TS 29113/a: +ul + liAssumed types (codeTYPE(*)/code) are now supported./li + + liExperimental support for assumed-rank arrays + (codedimension(..)/code) has been added. 
Note that currently + gfortran's own array descriptor is used, which is different from the + one defined in TS29113, see a + href=http://gcc.gnu.org/viewcvs/trunk/libgfortran/libgfortran.h?content-type=text%2Fplainview=co; + gfortran's header file/a or use the a + href=http://chasm-interop.sourceforge.net/;Chasm Language + Interoperability Tools/a./li +/ul/li + /ul !-- h3Java (GCJ)/h3 Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.11 diff -u -p -r1.11 changes.html --- changes.html 14 Aug 2012 13:57:59 - 1.11 +++ changes.html 14 Aug 2012 13:59:46 - @@ -84,7 +84,7 @@ by this change./p href=http://gcc.gnu.org/onlinedocs/gfortran/Code-Gen-Options.html; -fno-realloc-lhs/a/code. Additionally, it can be used to find automatic (re)allocation in hot loops. (For arrays, replacing qcodevar=/code/q -by qcodevar(:)=/code/q disables the automatic reallocation.)li +by qcodevar(:)=/code/q disables the automatic reallocation.)/li liReading floating point numbers which use qcodeq/code/q for the exponential (such as code4.0q0/code) is now supported as vendor @@ -114,7 +114,7 @@ by this change./p (codedimension(..)/code) has been added. Note that currently gfortran's own array descriptor is used, which is different from the one defined in TS29113, see a -
Re: [PATCH, Android] Runtime stack protector enabling for Android target
OK, provided that the patches in the above threads apply without conflicts. If there are conflicts, please repost for review. Committed to 4.7 branch: http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00360.html Thanks, K
[google] Update contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail (issue6454147)
Update contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail. Tested with build followed by validate_failures.py. Okay for all applicable branches? 2012-08-14 Simon Baldwin sim...@google.com * testsuite-management/powerpc-grtev3-linux-gnu.xfail: Add new entries for soft-float. Index: contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail === --- contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail (revision 190382) +++ contrib/testsuite-management/powerpc-grtev3-linux-gnu.xfail (working copy) @@ -1,6 +1,9 @@ # Temporarily ignore gcc pr54127. expire=20121031 | FAIL: gcc.dg/torture/pr53589.c -O3 -g (test for excess errors) expire=20121031 | FAIL: gcc.dg/torture/pr53589.c -O3 -g (internal compiler error) +# Temporarily ignore Google ref b/6983319. +expire=20121031 | FAIL: gcc.target/powerpc/regnames-1.c (test for excess errors) +expire=20121031 | FAIL: gcc.target/powerpc/regnames-1.c (internal compiler error) FAIL: gfortran.dg/bessel_6.f90 -O0 execution test FAIL: gfortran.dg/bessel_6.f90 -O1 execution test @@ -171,6 +174,43 @@ FAIL: gcc.target/powerpc/pr46728-4.c sca FAIL: gcc.target/powerpc/pr46728-7.c scan-assembler-not pow FAIL: gcc.target/powerpc/pr46728-8.c scan-assembler-not pow +# Entries due to soft-float. 
+FAIL: g++.dg/cdce3.C -std=gnu++98 execution test +FAIL: g++.dg/cdce3.C -std=gnu++11 execution test +FAIL: g++.dg/tree-prof/mversn15.C execution,-fprofile-generate +UNRESOLVED: g++.dg/tree-prof/mversn15.C execution,-fprofile-use +UNRESOLVED: g++.dg/tree-prof/mversn15.C compilation, -fprofile-use +FAIL: g++.dg/tree-prof/mversn15a.C execution,-fprofile-generate +UNRESOLVED: g++.dg/tree-prof/mversn15a.C execution,-fprofile-use +UNRESOLVED: g++.dg/tree-prof/mversn15a.C compilation, -fprofile-use +FAIL: gcc.dg/torture/fp-int-convert-long-double.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test +FAIL: gcc.target/powerpc/lhs-1.c scan-assembler-times nop 3 +FAIL: gcc.target/powerpc/lhs-2.c scan-assembler ori 1,1,0 +FAIL: gcc.target/powerpc/lhs-3.c scan-assembler ori 2,2,0 +FAIL: gcc.target/powerpc/loop_align.c scan-assembler .p2align 5,,31 +FAIL: gcc.target/powerpc/pr46728-1.c scan-assembler-times fsqrt 2 +FAIL: gcc.target/powerpc/pr46728-16.c scan-assembler fmadd +FAIL: gcc.target/powerpc/pr46728-2.c scan-assembler-times fsqrt 4 +FAIL: gcc.target/powerpc/pr46728-3.c scan-assembler-times sqrt 4 +FAIL: gcc.target/powerpc/pr46728-5.c scan-assembler-times cbrt 2 +FAIL: gcc.target/powerpc/pr52775.c scan-assembler-times fcfid 2 +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O3 -fomit-frame-pointer -funroll-loops execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O3 -g execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O2 execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O3 -fomit-frame-pointer execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O1 execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -Os execution test +FAIL: gfortran.dg/actual_array_constructor_3.f90 -O0 execution test +FAIL: gfortran.dg/norm2_3.f90 -O3 -fomit-frame-pointer -funroll-loops 
execution test +FAIL: gfortran.dg/norm2_3.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test +FAIL: gfortran.dg/norm2_3.f90 -O0 execution test +FAIL: gfortran.dg/norm2_3.f90 -Os execution test +FAIL: gfortran.dg/norm2_3.f90 -O2 execution test +FAIL: gfortran.dg/norm2_3.f90 -O3 -g execution test +FAIL: gfortran.dg/norm2_3.f90 -O3 -fomit-frame-pointer execution test +FAIL: gfortran.dg/norm2_3.f90 -O1 execution test + # See http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00141.html. Revert once # that is resolved. UNRESOLVED: 23_containers/map/element_access/2.cc compilation failed to produce executable -- This patch is available for review at http://codereview.appspot.com/6454147
[Patch ARM] Fix PR54212 - Remove predicable attribute from Advanced SIMD patterns in the ARM backend.
Hi,

This fixes PR target/54212. The problem here was that we were marking a number of patterns in neon.md as predicable. Advanced SIMD instructions are not predicable in ARM state; they are, however, allowed in IT blocks in Thumb2 (though this feature is deprecated, as documented in the latest ARM ARM Issue C.B). This patch therefore removes the predicable attribute from all such instructions in the Neon backend.

This is currently undergoing regression testing on an armv7-a cross. I will apply it if there are no regressions, and I intend to backport it at least to the 4.7 branch, since this issue has been in the backend since the original days of Advanced SIMD support and the bug was reported against the 4.7 branch, where it is reproducible. I would like to take this back to the 4.6 branch as well, but only after it has lived for a while on trunk / 4.7.

regards,
Ramana

2012-08-14 Ramana Radhakrishnan ramana.radhakrish...@linaro.org

PR target/54212
* config/arm/neon.md (vec_setmode_internal VD,VQ): Do not mark as predicable. Adjust asm template.
(vec_setv2di_internal): Likewise.
(vec_extractmode VD, VQ): Likewise.
(vec_extractv2di): Likewise.
(neon_vget_lanemode_sext_internal VD, VQ): Likewise.
(neon_vset_lanemode_sext_internal VD, VQ): Likewise.
(neon_vdup_nmode VX, V32): Likewise.
(neon_vdup_nv2di): Likewise.
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 7142c98..12c7934 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -434,10 +434,9 @@ elt = GET_MODE_NUNITS (MODEmode) - 1 - elt; operands[2] = GEN_INT (elt); - return vmov%?.V_sz_elem\t%P0[%c2], %1; + return vmov.V_sz_elem\t%P0[%c2], %1; } - [(set_attr predicable yes) - (set_attr neon_type neon_mcr)]) + [(set_attr neon_type neon_mcr)]) (define_insn vec_setmode_internal [(set (match_operand:VQ 0 s_register_operand =w) @@ -460,10 +459,9 @@ operands[0] = gen_rtx_REG (V_HALFmode, regno + hi); operands[2] = GEN_INT (elt); - return vmov%?.V_sz_elem\t%P0[%c2], %1; + return vmov.V_sz_elem\t%P0[%c2], %1; } - [(set_attr predicable yes) - (set_attr neon_type neon_mcr)] + [(set_attr neon_type neon_mcr)] ) (define_insn vec_setv2di_internal @@ -480,10 +478,9 @@ operands[0] = gen_rtx_REG (DImode, regno); - return vmov%?\t%P0, %Q1, %R1; + return vmov\t%P0, %Q1, %R1; } - [(set_attr predicable yes) - (set_attr neon_type neon_mcr_2_mcrr)] + [(set_attr neon_type neon_mcr_2_mcrr)] ) (define_expand vec_setmode @@ -511,10 +508,9 @@ elt = GET_MODE_NUNITS (MODEmode) - 1 - elt; operands[2] = GEN_INT (elt); } - return vmov%?.V_uf_sclr\t%0, %P1[%c2]; + return vmov.V_uf_sclr\t%0, %P1[%c2]; } - [(set_attr predicable yes) - (set_attr neon_type neon_bp_simple)] + [(set_attr neon_type neon_bp_simple)] ) (define_insn vec_extractmode @@ -535,10 +531,9 @@ operands[1] = gen_rtx_REG (V_HALFmode, regno + hi); operands[2] = GEN_INT (elt); - return vmov%?.V_uf_sclr\t%0, %P1[%c2]; + return vmov.V_uf_sclr\t%0, %P1[%c2]; } - [(set_attr predicable yes) - (set_attr neon_type neon_bp_simple)] + [(set_attr neon_type neon_bp_simple)] ) (define_insn vec_extractv2di @@ -552,10 +547,9 @@ operands[1] = gen_rtx_REG (DImode, regno); - return vmov%?\t%Q0, %R0, %P1 @ v2di; + return vmov\t%Q0, %R0, %P1 @ v2di; } - [(set_attr predicable yes) - (set_attr neon_type neon_int_1)] + [(set_attr neon_type neon_int_1)] ) (define_expand 
vec_initmode @@ -2622,10 +2616,9 @@ elt = GET_MODE_NUNITS (MODEmode) - 1 - elt; operands[2] = GEN_INT (elt); } - return vmov%?.sV_sz_elem\t%0, %P1[%c2]; + return vmov.sV_sz_elem\t%0, %P1[%c2]; } - [(set_attr predicable yes) - (set_attr neon_type neon_bp_simple)] + [(set_attr neon_type neon_bp_simple)] ) (define_insn neon_vget_lanemode_zext_internal @@ -2642,10 +2635,9 @@ elt = GET_MODE_NUNITS (MODEmode) - 1 - elt; operands[2] = GEN_INT (elt); } - return vmov%?.uV_sz_elem\t%0, %P1[%c2]; + return vmov.uV_sz_elem\t%0, %P1[%c2]; } - [(set_attr predicable yes) - (set_attr neon_type neon_bp_simple)] + [(set_attr neon_type neon_bp_simple)] ) (define_insn neon_vget_lanemode_sext_internal @@ -2668,12 +2660,11 @@ ops[0] = operands[0]; ops[1] = gen_rtx_REG (V_HALFmode, regno + 2 * (elt / halfelts)); ops[2] = GEN_INT (elt_adj); - output_asm_insn (vmov%?.sV_sz_elem\t%0, %P1[%c2], ops); + output_asm_insn (vmov.sV_sz_elem\t%0, %P1[%c2], ops); return ; } - [(set_attr predicable yes) - (set_attr neon_type neon_bp_simple)] + [(set_attr neon_type neon_bp_simple)] ) (define_insn neon_vget_lanemode_zext_internal @@ -2696,12 +2687,11 @@ ops[0] = operands[0]; ops[1] = gen_rtx_REG (V_HALFmode, regno + 2 * (elt / halfelts)); ops[2] = GEN_INT (elt_adj); - output_asm_insn (vmov%?.uV_sz_elem\t%0, %P1[%c2], ops); + output_asm_insn (vmov.uV_sz_elem\t%0, %P1[%c2], ops);
Re: [PATCH][RFC] Fix PR54201, share constant pool entries for CONST_VECTORs
On Tue, 14 Aug 2012, Richard Guenther wrote:

This implements constant pool entry sharing for CONST_VECTORs with the same bit-pattern by canonicalizing them to the same-sized mode with the least number of elements. Ideally we would be able to hash and compare the in-memory representation of a constant (together with its alignment requirement), but I'm not aware of any RTL facility that would be equivalent to native_{interpret,encode}_expr. So this just handles CONST_VECTORs as requested in the bugreport. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Any comments?

While it shares constant pool entries, CSE is not smart enough to CSE loads with different modes from the same constant pool entry. So we don't see the looked-for reduction in register pressure, and thus IMHO the patch is not worth the trouble in its current form.

Richard.
Re: LEA-splitting improvement patch.
On Tue, Aug 14, 2012 at 3:35 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:

Uros,

Let me try to explain why I used such code duplication. Here we have the common case of an LEA with three different registers, r0 (target), r1 (base) and r2 (index), and a possible offset. To get better scheduling, we first use find_nearest_reg_def to determine which register, r1 or r2, is preferable for the initial setting. Then we generate the following sequence of instructions:

r0 = r_best;
r0 = $const, r0
r0 = r_worse, r0

This can save 2 cycles on Atom, since the first 2 instructions can be hoisted up. I could not find a better way of coding it.

If it is important to put the addition of the constant before the addition of the register, then you can emit a similar sequence for the other cases, too. Something like the following:

--cut here--
...
{
  if (regno0 == regno1)
    tmp = parts.index;
  else if (regno0 == regno2)
    tmp = parts.base;
  else
    {
      rtx tmp1;

      /* regno1: base, regno2: index */
      if (find_nearest_reg_def (insn, regno1, regno2))
        tmp1 = parts.index, tmp = parts.base;
      else
        tmp1 = parts.base, tmp = parts.index;

      emit_insn (gen_rtx_SET (VOIDmode, target, tmp1));
    }

  if (parts.disp && parts.disp != const0_rtx)
    ix86_emit_binop (PLUS, mode, target, parts.disp);

  ix86_emit_binop (PLUS, mode, target, tmp);
  return;
}
--cut here--

I prepared new patch and ChangeLog. Testing of x32 is in progress.

You didn't fix the vertical space and tab issues in the new patch.

Uros.
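For readers following the ordering argument, the idea can be modelled outside of GCC with a toy sketch. It assumes that the register defined closest to the LEA should be added last, so that the copy and the constant addition can be hoisted above that definition. The names nearest_def_is_reg1, lea_split and plan_lea_split are hypothetical, and the backward scan is only a stand-in for whatever find_nearest_reg_def really does in the patch; its exact return convention is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for find_nearest_reg_def.  DEFS[i] is the
   register written by the i-th instruction before the LEA, closest
   first.  Return 1 if REG1 is written closer to the LEA than REG2,
   0 if REG2 is closer or neither is written in the window.  */
int
nearest_def_is_reg1 (const int *defs, size_t n, int reg1, int reg2)
{
  for (size_t i = 0; i < n; i++)
    {
      if (defs[i] == reg1)
        return 1;
      if (defs[i] == reg2)
        return 0;
    }
  return 0;
}

/* Order of the steps that replace "lea disp(base,index), target".  */
struct lea_split
{
  int first;  /* Register copied into the target first.  */
  int last;   /* Register added last, after the displacement.  */
};

/* Pick the order: the most recently defined register goes last.  */
struct lea_split
plan_lea_split (const int *defs, size_t n, int base, int index)
{
  struct lea_split plan;

  assert (base != index);
  if (nearest_def_is_reg1 (defs, n, base, index))
    {
      plan.first = index;
      plan.last = base;
    }
  else
    {
      plan.first = base;
      plan.last = index;
    }
  return plan;
}
```

With a single planning step like this, the emission code that follows (copy, add displacement, add the remaining register) only needs to exist once, which is the point of the restructuring suggested above.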
Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
On Tue, 14 Aug 2012, Jakub Jelinek wrote: On Mon, Aug 13, 2012 at 09:20:32PM -0700, Gary Funck wrote: --- gcc/testsuite/gcc.dg/pr20020-1.c(revision 0) +++ gcc/testsuite/gcc.dg/pr20020-1.c(revision 0) @@ -0,0 +1,25 @@ +/* Target is restricted to x86_64 type architectures, + to check that 128-bit struct's are represented + as TImode values. */ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile { target { x86_64-*-* } } } */ Given this all the testcases should go into gcc/testsuite/gcc.target/i386/ And restricting the target to x86_64-*-* is wrong anyway, since any such test should also be run for i?86-*-* -m64. Use { target { ! { ia32 } } } instead if you want to disable just -m32 testing, { target lp64 } if you only want -m64 testing but not -m32 or -mx32. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH,i386] fma,fma4 and xop flags
On Mon, Aug 13, 2012 at 9:50 PM, Richard Henderson r...@redhat.com wrote:

On 08/13/2012 12:33 PM, Uros Bizjak wrote:

AFAIU fma3 is better than fma4 for bdver2 (the only CPU that implements both FMA sets). The current description of bdver2 doesn't even enable fma4 in processor_alias_table due to this fact. The change you are referring to adds a preference for the fma3 insn set for generic code (not FMA4 builtins!), even when fma4 is enabled. So, no matter which combination and sequence of -mfma, -mfma4 or -mxop the user passes to the compiler, only fma3 instructions will be generated.

This rationale needs to appear as a comment above

+ (eq_attr "isa" "fma4")
+ (symbol_ref "TARGET_FMA4 && !TARGET_FMA")

I plan to commit the following patch:

--cut here--
Index: i386.md
===
--- i386.md (revision 190362)
+++ i386.md (working copy)
@@ -659,6 +659,9 @@
 (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2")
 (eq_attr "isa" "bmi2") (symbol_ref "TARGET_BMI2")
 (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
+;; Disable generation of FMA4 instructions for generic code
+;; since FMA3 is preferred for targets that implement both
+;; instruction sets.
 (eq_attr "isa" "fma4") (symbol_ref "TARGET_FMA4 && !TARGET_FMA")
 ]
--cut here--

Longer term we may well require some sort of (TARGET_FMA4 && !(TARGET_FMA && TARGET_PREFER_FMA3)) with an appropriate entry in ix86_tune_features to match.

This won't work, since we have to prefer FMA3 also in the case when only -mfma -mfma4 is used, without -mtune=XX. We can add TARGET_FMA_BOTH though, but I doubt there will ever be a target that implements both insn sets without a preference.

Uros.
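The effect of the two enable conditions discussed in this message can be checked with a small truth-table sketch. It assumes the conditions read TARGET_FMA for the "fma" alternatives and TARGET_FMA4 && !TARGET_FMA for the "fma4" alternatives (the operators are not legible in the archived text). The function names are hypothetical and simply model the two flags as booleans; they are not GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the "fma" enable condition: TARGET_FMA.  */
bool
fma3_alternatives_enabled (bool target_fma)
{
  return target_fma;
}

/* Model of the "fma4" enable condition, assumed to be
   TARGET_FMA4 && !TARGET_FMA.  */
bool
fma4_alternatives_enabled (bool target_fma, bool target_fma4)
{
  return target_fma4 && !target_fma;
}

/* Walk all four flag combinations and verify that the two sets of
   alternatives are never enabled together.  */
void
check_never_both (void)
{
  for (int fma = 0; fma <= 1; fma++)
    for (int fma4 = 0; fma4 <= 1; fma4++)
      assert (!(fma3_alternatives_enabled (fma)
                && fma4_alternatives_enabled (fma, fma4)));
}
```

Under this model, FMA4 alternatives are selected only when FMA4 is available and FMA3 is not, which matches the stated intent of preferring FMA3 whenever both are enabled.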
[wwwdocs] Announce switch to C++
Gerald, I have this queued up in my local tree, waiting for the final merge of the cxx-conversion branch. OK to install after the merge? Thanks. Diego. Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v retrieving revision 1.858 diff -u -d -u -p -r1.858 index.html --- index.html 2 Jul 2012 12:10:37 - 1.858 +++ index.html 14 Aug 2012 16:02:21 - @@ -53,6 +53,17 @@ mission statement/a./p dl class=news +dtspanGCC now uses C++ as its implementation language/span +span class=date[2012-02-16]/span/dt +ddThe a href=http://gcc.gnu.org/wiki/cxx-conversion;cxx-conversion/a +branch has been merged into trunk. This switches GCC's implementation +language from C to a href=codingconventions.html#Cxx_ConventionsC++/a. +Additionally, some data structures have been re-implemented in C++ +(more details in the a +href=http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00711.html;merge +announcement/a). This work was contributed by Lawrence Crowl and +Diego Novillo of Google./dd + dtspana href=gcc-4.5/GCC 4.5.4/a released/span span class=date[2012-07-02]/span/dt dd/dd Index: gcc-4.8/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.12 diff -u -d -u -p -r1.12 changes.html --- gcc-4.8/changes.html14 Aug 2012 13:59:54 - 1.12 +++ gcc-4.8/changes.html14 Aug 2012 16:02:21 - @@ -13,6 +13,12 @@ h2Caveats/h2 +pGCC now uses C++ as its implementation language. This means that +to build GCC from sources, you will need a C++ compiler that +understands C++ 2003. For more details on the rationale and specific +changes, please refer to the a +href=http://gcc.gnu.org/wiki/cxx-conversion;C++ conversion/a +page./p pTo enable the Graphite framework for loop optimizations you now need CLooG version 0.17.0 and ISL version 0.10. Both can be obtained
Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
On 08/14/12 15:33:10, Joseph S. Myers wrote: On Tue, 14 Aug 2012, Jakub Jelinek wrote: On Mon, Aug 13, 2012 at 09:20:32PM -0700, Gary Funck wrote: --- gcc/testsuite/gcc.dg/pr20020-1.c (revision 0) +++ gcc/testsuite/gcc.dg/pr20020-1.c (revision 0) @@ -0,0 +1,25 @@ +/* Target is restricted to x86_64 type architectures, + to check that 128-bit struct's are represented + as TImode values. */ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile { target { x86_64-*-* } } } */ Given this all the testcases should go into gcc/testsuite/gcc.target/i386/ And restricting the target to x86_64-*-* is wrong anyway, since any such test should also be run for i?86-*-* -m64. Use { target { ! { ia32 } } } instead if you want to disable just -m32 testing, { target lp64 } if you only want -m64 testing but not -m32 or -mx32. How about: 1. Move the test to gcc/testsuite/gcc.target/i386/ 2. The comment is amended to read: /* Check that 128-bit struct's are represented as TImode values. */ 3. This test is retained: /* { dg-require-effective-target int128 } */ 4. This target test is removed: /* { dg-do compile } */ It is possible that dg-require-effective-target int128 is too restrictive (in the sense that some x86 target might in theory support TImode, but not __int128_t), but at least some reasonable test coverage is guaranteed. - Gary
Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
On Tue, Aug 14, 2012 at 9:12 AM, Gary Funck g...@intrepid.com wrote: On 08/14/12 15:33:10, Joseph S. Myers wrote: On Tue, 14 Aug 2012, Jakub Jelinek wrote: On Mon, Aug 13, 2012 at 09:20:32PM -0700, Gary Funck wrote: --- gcc/testsuite/gcc.dg/pr20020-1.c (revision 0) +++ gcc/testsuite/gcc.dg/pr20020-1.c (revision 0) @@ -0,0 +1,25 @@ +/* Target is restricted to x86_64 type architectures, + to check that 128-bit struct's are represented + as TImode values. */ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile { target { x86_64-*-* } } } */ Given this all the testcases should go into gcc/testsuite/gcc.target/i386/ And restricting the target to x86_64-*-* is wrong anyway, since any such test should also be run for i?86-*-* -m64. Use { target { ! { ia32 } } } instead if you want to disable just -m32 testing, { target lp64 } if you only want -m64 testing but not -m32 or -mx32. How about: 1. Move the test to gcc/testsuite/gcc.target/i386/ 2. The comment is amended to read: /* Check that 128-bit struct's are represented as TImode values. */ 3. This test is retained: /* { dg-require-effective-target int128 } */ 4. This target test is removed: /* { dg-do compile } */ It is possible that dg-require-effective-target int128 is too restrictive (in the sense that some x86 target might in theory support TImode, but not __int128_t), but at least some reasonable test coverage is guaranteed. I believe int128 requirement is correct. -- H.J.
Re: [PATCH] Combine location with block using block_locations
Hi, Dodji, Thanks for the review. I've fixed all the addressed issues. I'm attaching the related changes: Thanks, Dehao libcpp/ChangeLog: 2012-08-01 Dehao Chen de...@google.com * include/line-map.h (MAX_SOURCE_LOCATION): New value. (location_adhoc_data_init): New. (location_adhoc_data_fini): New. (get_combined_adhoc_loc): New. (get_data_from_adhoc_loc): New. (get_location_from_adhoc_loc): New. (COMBINE_LOCATION_DATA): New. (IS_ADHOC_LOC): New. (expanded_location): New field. * line-map.c (location_adhoc_data): New. (location_adhoc_data_htab): New. (curr_adhoc_loc): New. (location_adhoc_data): New. (allocated_location_adhoc_data): New. (location_adhoc_data_hash): New. (location_adhoc_data_eq): New. (location_adhoc_data_update): New. (get_combined_adhoc_loc): New. (get_data_from_adhoc_loc): New. (get_location_from_adhoc_loc): New. (location_adhoc_data_init): New. (location_adhoc_data_fini): New. (linemap_lookup): Change to use new location. (linemap_ordinary_map_lookup): Likewise. (linemap_macro_map_lookup): Likewise. (linemap_macro_map_loc_to_def_point): Likewise. (linemap_macro_map_loc_unwind_toward_spel): Likewise. (linemap_get_expansion_line): Likewise. (linemap_get_expansion_filename): Likewise. (linemap_location_in_system_header_p): Likewise. (linemap_location_from_macro_expansion_p): Likewise. (linemap_macro_loc_to_spelling_point): Likewise. (linemap_macro_loc_to_def_point): Likewise. (linemap_macro_loc_to_exp_point): Likewise. (linemap_resolve_location): Likewise. (linemap_unwind_toward_expansion): Likewise. (linemap_unwind_to_first_non_reserved_loc): Likewise. (linemap_expand_location): Likewise. (linemap_dump_location): Likewise. 
Index: libcpp/line-map.c === --- libcpp/line-map.c (revision 190209) +++ libcpp/line-map.c (working copy) @@ -25,6 +25,7 @@ #include line-map.h #include cpplib.h #include internal.h +#include hashtab.h static void trace_include (const struct line_maps *, const struct line_map *); static const struct line_map * linemap_ordinary_map_lookup (struct line_maps *, @@ -50,6 +51,135 @@ extern unsigned num_expanded_macros_counter; extern unsigned num_macro_tokens_counter; +/* Data structure to associate an arbitrary data to a source location. */ +struct location_adhoc_data { + source_location locus; + void *data; +}; + +/* The following data structure encodes a location with some adhoc data + and maps it to a new unsigned integer (called an adhoc location) + that replaces the original location to represent the mapping. + + The new adhoc_loc uses the highest bit as the enabling bit, i.e. if the + highest bit is 1, then the number is adhoc_loc. Otherwise, it serves as + the original location. Once identified as the adhoc_loc, the lower 31 + bits of the integer is used to index the location_adhoc_data array, + in which the locus and associated data is stored. */ + +static htab_t location_adhoc_data_htab; +static source_location curr_adhoc_loc; +static struct location_adhoc_data *location_adhoc_data; +static unsigned int allocated_location_adhoc_data; + +/* Hash function for location_adhoc_data hashtable. */ + +static hashval_t +location_adhoc_data_hash (const void *l) +{ + const struct location_adhoc_data *lb = + (const struct location_adhoc_data *) l; + return (hashval_t) lb-locus + (size_t) lb-data; +} + +/* Compare function for location_adhoc_data hashtable. 
*/ + +static int +location_adhoc_data_eq (const void *l1, const void *l2) +{ + const struct location_adhoc_data *lb1 = + (const struct location_adhoc_data *) l1; + const struct location_adhoc_data *lb2 = + (const struct location_adhoc_data *) l2; + return lb1-locus == lb2-locus lb1-data == lb2-data; +} + +/* Update the hashtable when location_adhoc_data is reallocated. */ + +static int +location_adhoc_data_update (void **slot, void *data) +{ + *((char **) slot) += ((char *) location_adhoc_data - (char *) data); + return 1; +} + +/* Combine LOCUS and DATA to a combined adhoc loc. */ + +source_location +get_combined_adhoc_loc (source_location locus, void *data) +{ + struct location_adhoc_data lb; + struct location_adhoc_data **slot; + + linemap_assert (data); + + if (IS_ADHOC_LOC (locus)) +locus = location_adhoc_data[locus MAX_SOURCE_LOCATION].locus; + if (locus == 0 data == NULL) +return 0; + lb.locus = locus; + lb.data = data; + slot = (struct location_adhoc_data **) + htab_find_slot (location_adhoc_data_htab, lb, INSERT); + if (*slot == NULL) +{ + *slot = location_adhoc_data + curr_adhoc_loc; + location_adhoc_data[curr_adhoc_loc] = lb; + if
Re: [wwwdocs] Document Runtime CPU detection builtins
Hi Gerald, Is this release note alright? Thanks, -Sri. On Fri, Aug 10, 2012 at 7:20 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, I have added a release note for x86 builtins __builtin_cpu_is and __builtin_cpu_supports. They were checked in to trunk in rev. 186789. Is this ok to submit? Thanks, -Sri.
Re: [wwwdocs] Document Runtime CPU detection builtins
+ger...@pfiefer.com On Tue, Aug 14, 2012 at 10:51 AM, Sriraman Tallam tmsri...@google.com wrote: Hi Gerald, Is this release note alright? Thanks, -Sri. On Fri, Aug 10, 2012 at 7:20 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, I have added a release note for x86 builtins __builtin_cpu_is and __builtin_cpu_supports. They were checked in to trunk in rev. 186789. Is this ok to submit? Thanks, -Sri.
[patch] timevar TLC
Hello, Many unused timevars, many timevars that measure completely different passes, passes with the wrong timevar, etc. Time for a bit of maintenance / janitorial. Bootstrapped & tested on powerpc64-unknown-linux-gnu. OK for trunk? Ciao! Steven timevar_tlc.diff Description: Binary data
Re: [patch] timevar TLC
On 12-08-14 14:26 , Steven Bosscher wrote:

Hello, Many unused timevars, many timevars that measure completely different passes, passes with the wrong timevar, etc. Time for a bit of maintenance / janitorial. Bootstrapped & tested on powerpc64-unknown-linux-gnu. OK for trunk? Ciao! Steven

* timevar.def (TV_VARPOOL, TV_WHOPR_WPA_LTRANS_EXEC, TV_LIFE, TV_LIFE_UPDATE, TV_DF_UREC, TV_INLINE_HEURISTICS, TV_TREE_LINEAR_TRANSFORM, TV_TREE_LOOP_INIT, TV_TREE_LOOP_FINI, TV_VPT, TV_LOCAL_ALLOC, TV_GLOBAL_ALLOC, TV_SEQABSTR): Remove. (TV_IPA_INLINING, TV_FLATTEN_INLINING, TV_EARLY_INLINING, TV_INLINE_PARAMETERS, TV_LOOP_INIT, TV_LOOP_FINI): New.
* timevar.c (timevar_print): Make printing width of timevar names more flexible, but enforce maximum length.
* ipa-inline.c (pass_early_inline): Use TV_EARLY_INLINING. (pass_ipa_inline): Use TV_IPA_INLINING.
* ipa-inline-analysis.c (pass_inline_parameters): Use TV_INLINE_HEURISTICS.
* tree-ssa-loop.c (pass_tree_loop_init): No timevar for wrapper pass. (pass_tree_loop_done): Likewise.
* final.c (pass_shorten_branches): Use TV_SHORTEN_BRANCH.
* loop-init.c (loop_optimizer_init): Push/pop TV_LOOP_INIT. (loop_optimizer_finalize): Push/pop TV_LOOP_FINI.

Looks fine, except:

@@ -505,6 +507,16 @@ timevar_print (FILE *fp)
      TIMEVAR.  */
   start_time = now;

+#ifdef ENABLE_CHECKING
+  /* Pester those who add timevars with too long names.  */
+  for (id = 0; id < (unsigned int) TIMEVAR_LAST; ++id)
+    {
+      struct timevar_def *tv = &timevars[(timevar_id_t) id];
+      if ((timevar_id_t) id != TV_TOTAL && tv->used)
+        gcc_assert (strlen (tv->name) <= name_width);
+    }
+#endif

I'm not liking this too much. I would rather do truncation or wrapping. Not ICEing. And we'd do this all the time, not just with checking enabled.

I suppose this works with -ftime-report right? (I'm thinking about the new code that emits phase-level timers).

Thanks. Diego.
Re: Merge C++ conversion into trunk (0/6 - Overview)
On 12-08-14 09:48 , Diego Novillo wrote: This merge touches several files, so I'm thinking that the best time is going to be on Thu 16/Aug around 2:00 GMT. So, the fixes I needed from Lawrence are already in so we can proceed with the merge. I'll commit the merge tonight at ~2:00 GMT. After the merge is in, I will send an announcement and request major branch merges to wait for another 24 hrs to allow testers the chance to pick up this merge. Thanks. Diego.
Re: [patch] timevar TLC
On Tue, Aug 14, 2012 at 8:40 PM, Diego Novillo dnovi...@google.com wrote: On 12-08-14 14:26 , Steven Bosscher wrote: Hello, Many unused timevars, many timevars that measure completely different passes, passes with the wrong timevar, etc. Time for a bit of maintenance / janitorial. Bootstrappedtested on powerpc64-unknown-linux-gnu. OK for trunk? Ciao! Steven * timevar.def (TV_VARPOOL, TV_WHOPR_WPA_LTRANS_EXEC, TV_LIFE, TV_LIFE_UPDATE, TV_DF_UREC, TV_INLINE_HEURISTICS, TV_TREE_LINEAR_TRANSFORM, TV_TREE_LOOP_INIT, TV_TREE_LOOP_FINI, TV_VPT, TV_LOCAL_ALLOC, TV_GLOBAL_ALLOC, TV_SEQABSTR): Remove. (TV_IPA_INLINING, TV_FLATTEN_INLINING, TV_EARLY_INLINING, TV_INLINE_PARAMETERS, TV_LOOP_INIT, TV_LOOP_FINI): New. * timevar.c (timevar_print): Make printing width of timevar names more flexible, but enforce maximum length. * ipa-inline.c (pass_early_inline): Use TV_EARLY_INLINING. (pass_ipa_inline): Use TV_IPA_INLINING. * ipa-inline-analysis.c (pass_inline_parameters): Use TV_INLINE_HEURISTICS. * tree-ssa-loop.c (pass_tree_loop_init): No timevar for wrapper pass. (pass_tree_loop_done): Likewise. * final.c (pass_shorten_branches): Use TV_SHORTEN_BRANCH. * loop-init.c (loop_optimizer_init): Push/pop TV_LOOP_INIT. (loop_optimizer_finalize): Push/pop TV_LOOP_FINI. Looks fine, except: @@ -505,6 +507,16 @@ timevar_print (FILE *fp) TIMEVAR. */ start_time = now; +#ifdef ENABLE_CHECKING + /* Pester those who add timevars with too long names. */ + for (id = 0; id (unsigned int) TIMEVAR_LAST; ++id) +{ + struct timevar_def *tv = timevars[(timevar_id_t) id]; + if ((timevar_id_t) id != TV_TOTAL tv-used) + gcc_assert (strlen (tv-name) = name_width); +} +#endif I'm not liking this too much. I would rather do truncation or wrapping. Not ICEing. And we'd do this all the time, not just with checking enabled. Wrapping would be bad, it'd break scripts that parse the -ftime-report output. Truncation is a bit silly, too: If the name is always truncated, why have the long name in the first place? 
I chose this gcc_assert solution to make absolutely sure nobody can add a timevar with an overlong name.

I suppose this works with -ftime-report right? (I'm thinking about the new code that emits phase-level timers).

Yes, I've been using -ftime-report a lot for PR54146, and the output format was itching...

Ciao! Steven
Re: [patch] timevar TLC
On 12-08-14 15:06 , Steven Bosscher wrote:
On Tue, Aug 14, 2012 at 8:40 PM, Diego Novillo dnovi...@google.com wrote:
On 12-08-14 14:26 , Steven Bosscher wrote:

@@ -505,6 +507,16 @@ timevar_print (FILE *fp)
      TIMEVAR.  */
   start_time = now;

+#ifdef ENABLE_CHECKING
+  /* Pester those who add timevars with too long names.  */
+  for (id = 0; id < (unsigned int) TIMEVAR_LAST; ++id)
+    {
+      struct timevar_def *tv = &timevars[(timevar_id_t) id];
+      if ((timevar_id_t) id != TV_TOTAL && tv->used)
+        gcc_assert (strlen (tv->name) <= name_width);
+    }
+#endif

I'm not liking this too much. I would rather do truncation or wrapping. Not ICEing. And we'd do this all the time, not just with checking enabled.

Wrapping would be bad, it'd break scripts that parse the -ftime-report output.

Hm, good point.

Truncation is a bit silly, too: If the name is always truncated, why have the long name in the first place?

Yeah, I was more about wrapping it. But you make a good point about scripting.

OK, unless anyone else has better ideas, let's put this in. The other cleanups are good enough, and we can revisit this later. Is 32 the longest we can tolerate?

Thanks. Diego.
Re: [patch] timevar TLC
On Tue, Aug 14, 2012 at 9:17 PM, Diego Novillo dnovi...@google.com wrote: Is 32 the longest we can tolerate? This 32 is just currently the longest name length of all timevars (for straight-line strength reduction), but there are a few more long ones (PCH preprocessor state restore ...). I didn't look at the total length of the lines that -ftime-report produces now. It fits on my screen, that's what mattered to me :-) (FWIW it's been 80 characters for a long time now, since the memory stat dumps were added). Ciao! Steven
Re: [patch] timevar TLC
On 12-08-14 15:20 , Steven Bosscher wrote: On Tue, Aug 14, 2012 at 9:17 PM, Diego Novillo dnovi...@google.com wrote: Is 32 the longest we can tolerate? This 32 is just currently the longest name length of all timevars (for straight-line strength reduction), but there are a few more long ones (PCH preprocessor state restore ...). I didn't look at the total length of the lines that -ftime-report produces now. It fits on my screen, that's what mattered to me :-) Heh, OK. I'd like to make it 80 - len(everything else). Most folks use 80 col windows. (FWIW it's been 80 characters for a long time now, since the memory stat dumps were added). Yeah, it's annoying. Diego.
Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86
On Thu, Aug 9, 2012 at 3:17 PM, Ian Lance Taylor i...@google.com wrote: On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu hongjiu...@intel.com wrote: Bionic C library doesn't provide link.h. Does Bionic provide dl_iterate_phdr? If it does, I'll just note in passing that it would be straightforward to simply incorporate the required types and constants in unwind-dw2-fde-dip.c directly, and avoid the #include. If it doesn't, then of course nothing will make this code work correctly. dl_iterate_phdr is provided in libdl.so, which is always linked with dynamic executables: #define ANDROID_LIB_SPEC \ %{!static: -ldl} This patch fixes Android/x86 build on trunk. OK to install? Thanks. -- H.J. 2012-08-14 H.J. Lu hongjiu...@intel.com PR bootstrap/54209 * unwind-dw2-fde-dip.c (dl_phdr_info): New struct for Bionic C library. (ElfW): New macro for Bionic C library. Don't include link.h for Bionic C library. gcc-pr54157.patch Description: Binary data
Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86
On Tue, Aug 14, 2012 at 12:38 PM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Aug 9, 2012 at 3:17 PM, Ian Lance Taylor i...@google.com wrote: On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu hongjiu...@intel.com wrote: Bionic C library doesn't provide link.h. Does Bionic provide dl_iterate_phdr? If it does, I'll just note in passing that it would be straightforward to simply incorporate the required types and constants in unwind-dw2-fde-dip.c directly, and avoid the #include. If it doesn't, then of course nothing will make this code work correctly. dl_iterate_phdr is provided in libdl.so, which is always linked with dynamic executables: #define ANDROID_LIB_SPEC \ %{!static: -ldl} This patch fixes Android/x86 build on trunk. OK to install? Thanks. -- H.J. 2012-08-14 H.J. Lu hongjiu...@intel.com PR bootstrap/54209 * unwind-dw2-fde-dip.c (dl_phdr_info): New struct for Bionic C library. (ElfW): New macro for Bionic C library. Don't include link.h for Bionic C library. Wrong patch. Here is the right one. -- H.J. gcc-pr54209.patch Description: Binary data
Re: [patch] timevar TLC
On Tue, Aug 14, 2012 at 9:25 PM, Diego Novillo dnovi...@google.com wrote:

This 32 is just currently the longest name length of all timevars (for straight-line strength reduction), but there are a few more long ones (PCH preprocessor state restore ...). I didn't look at the total length of the lines that -ftime-report produces now. It fits on my screen, that's what mattered to me :-)

Heh, OK. I'd like to make it 80 - len(everything else). Most folks use 80 col windows.

I seriously doubt that ;-) Anyway, it's not so simple, this 80-len(everything else). I was looking for a solution like that but it can't be done: there is no fixed "everything else". It depends on the configuration -- more specifically on HAVE_USER_TIME, HAVE_SYS_TIME, and HAVE_WALL_TIME. The format of these three is:

%7.2f (%2.0f%%) usr %7.2f (%2.0f%%) sys %7.2f (%2.0f%%) wall

So that's 7+2+2+6 + 7+2+2+6 + 7+2+2+7 = 52 (if I got the math right) if you have all three. In addition you have the memory stats: %8u kB (%2.0f%%) ggc. That's another 8+5+2+6=21. So you have 80-52-21=7. Take the ":" off and your maximum length for the timevar print names is 6 characters.

So I'd really like to go with the patch as-is, it's certainly not making things any worse than they already are for those who work on good old 80col windows.

Ciao! Steven
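For anyone who wants to re-check the column arithmetic above, here is a minimal sketch that lets snprintf measure the fields instead of counting by hand. The function names are made up for illustration and are not part of timevar.c; the format strings are the ones quoted in the message, and the sketch assumes all three time columns plus the memory column are enabled and, like the message, ignores any separators between columns.

```c
#include <stdio.h>

/* Width of one time column such as "   1.23 (45%) usr".  The widths in
   the format are minimums, so larger values would widen the field.  */
static int
time_field_width (const char *suffix)
{
  return snprintf (NULL, 0, "%7.2f (%2.0f%%) %s", 1.23, 45.0, suffix);
}

/* Width of the memory column such as "    1234 kB ( 5%) ggc".  */
static int
memory_field_width (void)
{
  return snprintf (NULL, 0, "%8u kB (%2.0f%%) ggc", 1234u, 5.0);
}

/* Columns left over for the timevar name (including its trailing ':')
   on a line of COLUMNS characters.  */
int
name_budget (int columns)
{
  return columns
         - time_field_width ("usr")
         - time_field_width ("sys")
         - time_field_width ("wall")
         - memory_field_width ();
}
```

Calling name_budget (80) reproduces the 80-52-21 figure from the message; a configuration without one of the HAVE_*_TIME macros would simply drop the corresponding term.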
Re: [patch] timevar TLC
On 12-08-14 16:39 , Steven Bosscher wrote: I seriously doubt that ;-) Anyway, it's not so simple, this 80-len(everything else). I was looking for a solution like that but it can't be done: there is no everything else. It depends on the configuration -- more specifically on HAVE_USER_TIME, HAVE_SYS_TIME, and HAVE_WALL_TIME. The format of these three is: %7.2f (%2.0f%%) usr %7.2f (%2.0f%%) sys %7.2f (%2.0f%%) wall Sorry, I didn't mean to have it computed automatically. Apologies for the confusion. I really wanted to know if we couldn't do something like 40 or so. Chatting with Lawrence offline, he says that we may be able to do a static check instead of dynamic. But, again, this can be a follow-up patch. So I'd really like to go with the patch as-is, it's certainly not making things any worse than they already are for those who work on good old 80col windows. Absolutely. The patch is fine. Diego.
Re: [patch] timevar TLC
On 8/14/12, Steven Bosscher stevenb@gmail.com wrote: On Aug 14, 2012 Diego Novillo dnovi...@google.com wrote: On 12-08-14 14:26 , Steven Bosscher wrote: Many unused timevars, many timevars that measure completely different passes, passes with the wrong timevar, etc. Time for a bit of maintenance / janitorial. Bootstrappedtested on powerpc64-unknown-linux-gnu. OK for trunk? * timevar.def (TV_VARPOOL, TV_WHOPR_WPA_LTRANS_EXEC, TV_LIFE, TV_LIFE_UPDATE, TV_DF_UREC, TV_INLINE_HEURISTICS, TV_TREE_LINEAR_TRANSFORM, TV_TREE_LOOP_INIT, TV_TREE_LOOP_FINI, TV_VPT, TV_LOCAL_ALLOC, TV_GLOBAL_ALLOC, TV_SEQABSTR): Remove. (TV_IPA_INLINING, TV_FLATTEN_INLINING, TV_EARLY_INLINING, TV_INLINE_PARAMETERS, TV_LOOP_INIT, TV_LOOP_FINI): New. * timevar.c (timevar_print): Make printing width of timevar names more flexible, but enforce maximum length. * ipa-inline.c (pass_early_inline): Use TV_EARLY_INLINING. (pass_ipa_inline): Use TV_IPA_INLINING. * ipa-inline-analysis.c (pass_inline_parameters): Use TV_INLINE_HEURISTICS. * tree-ssa-loop.c (pass_tree_loop_init): No timevar for wrapper pass. (pass_tree_loop_done): Likewise. * final.c (pass_shorten_branches): Use TV_SHORTEN_BRANCH. * loop-init.c (loop_optimizer_init): Push/pop TV_LOOP_INIT. (loop_optimizer_finalize): Push/pop TV_LOOP_FINI. Looks fine, except: @@ -505,6 +507,16 @@ timevar_print (FILE *fp) TIMEVAR. */ start_time = now; +#ifdef ENABLE_CHECKING + /* Pester those who add timevars with too long names. */ + for (id = 0; id (unsigned int) TIMEVAR_LAST; ++id) +{ + struct timevar_def *tv = timevars[(timevar_id_t) id]; + if ((timevar_id_t) id != TV_TOTAL tv-used) + gcc_assert (strlen (tv-name) = name_width); +} +#endif I'm not liking this too much. I would rather do truncation or wrapping. Not ICEing. And we'd do this all the time, not just with checking enabled. Wrapping would be bad, it'd break scripts that parse the -ftime-report output. 
Truncation is a bit silly, too: If the name is always truncated, why have the long name in the first place? I chose this gcc_assert solution to make absolutely sure nobody can add a timevar with an overlong name.

I suppose this works with -ftime-report right? (I'm thinking about the new code that emits phase-level timers).

Yes, I've been using -ftime-report a lot for PR54146, and the output format was itching...

You can check the error statically. Something like

% cat limitstring.c
#define LIMIT 32
struct def { int x; char name[LIMIT+1]; };
struct def var[] = {
  { 3, "hello" },
  { 4, "name is much too too long for a reasonable name" },
};
% gcc -c limitstring.c -Werror
cc1: warnings being treated as errors
limitstring.c:10: error: initializer-string for array of chars is too long
limitstring.c:10: error: (near initialization for 'timevars[1].name')

But of course the variable definition would look more like

#define DEFTIMEVAR(identifier__, name__) \
  { , name__, ... },
struct def var[] = {
#include "timevar.def"
};

-- Lawrence Crowl
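A variation on the static check suggested above, as a rough sketch only: with a C11 compiler the limit can also be enforced with _Static_assert on the size of each string literal, which leaves the name field a plain pointer. Everything here (EXAMPLE_TIMEVARS, the TV_EXAMPLE_* identifiers, NAME_LIMIT) is hypothetical and stands in for the real timevar.def contents; it is not what the patch does.

```c
#include <stddef.h>
#include <string.h>

#define NAME_LIMIT 32   /* assumed maximum printable name width */

/* Hypothetical stand-in for the DEFTIMEVAR entries in timevar.def.  */
#define EXAMPLE_TIMEVARS(X)                                  \
  X (TV_EXAMPLE_SHORT, "hello")                              \
  X (TV_EXAMPLE_LONG, "straight-line strength reduction")

/* sizeof on a string literal counts the terminating NUL, hence "- 1".
   An overlong name stops the build instead of ICEing at run time.  */
#define CHECK_NAME(id, name)                                 \
  _Static_assert (sizeof (name) - 1 <= NAME_LIMIT,           \
                  "timevar name too long: " name);
EXAMPLE_TIMEVARS (CHECK_NAME)

struct example_timevar { const char *name; };

#define TABLE_ENTRY(id, name) { name },
static const struct example_timevar example_timevars[] = {
  EXAMPLE_TIMEVARS (TABLE_ENTRY)
};

/* Run-time cross-check: length of the longest name in the table.  */
size_t
longest_name_length (void)
{
  size_t longest = 0;
  size_t count = sizeof example_timevars / sizeof example_timevars[0];
  for (size_t i = 0; i < count; i++)
    {
      size_t len = strlen (example_timevars[i].name);
      if (len > longest)
        longest = len;
    }
  return longest;
}
```

Adding an entry longer than NAME_LIMIT to the list makes compilation fail at the corresponding _Static_assert, which is the same effect as the char-array trick but with a message that names the offending string.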
[PATCH] Fix PR54240
Replace the once vacuously true, and now vacuously false, test for existence of a conditional move instruction for a given mode, with one that actually checks what it's supposed to. Add a test case so we don't miss such things in future.

The test is powerpc-specific. It would be good to have an i386 version of the test as well, if someone can help with that.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk?

Thanks, Bill

gcc:

2012-08-13 Bill Schmidt wschm...@linux.vnet.ibm.com

PR tree-optimization/54240
* tree-ssa-phiopt.c (hoist_adjacent_loads): Correct test for existence of conditional move with given mode.

gcc/testsuite:

2012-08-13 Bill Schmidt wschm...@linux.vnet.ibm.com

PR tree-optimization/54240
* gcc.target/powerpc/pr54240.c: New test.

Index: gcc/testsuite/gcc.target/powerpc/pr54240.c
===
--- gcc/testsuite/gcc.target/powerpc/pr54240.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr54240.c (revision 0)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -misel -fdump-tree-phiopt-details" } */
+
+typedef struct s {
+  int v;
+  int b;
+  struct s *l;
+  struct s *r;
+} S;
+
+
+int foo(S *s)
+{
+  S *this;
+  S *next;
+
+  this = s;
+  if (this->b)
+    next = this->l;
+  else
+    next = this->r;
+
+  return next->v;
+}
+
+/* { dg-final { scan-tree-dump "Hoisting adjacent loads" "phiopt1" } } */
+/* { dg-final { cleanup-tree-dump "phiopt1" } } */
Index: gcc/tree-ssa-phiopt.c
===
--- gcc/tree-ssa-phiopt.c (revision 190305)
+++ gcc/tree-ssa-phiopt.c (working copy)
@@ -1843,7 +1843,8 @@ hoist_adjacent_loads (basic_block bb0, basic_block
       /* Check the mode of the arguments to be sure a conditional move
          can be generated for it.  */
-      if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1))))
+      if (optab_handler (movcc_optab, TYPE_MODE (TREE_TYPE (arg1)))
+          == CODE_FOR_nothing)
         continue;

       /* Both statements must be assignments whose RHS is a COMPONENT_REF.  */
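To illustrate why the patch compares against the "no pattern" sentinel rather than negating the handler, here is a toy model. It is only a sketch: the toy_* names, the two modes and the enum values are invented for the example and are not GCC's real definitions, and it does not model the cmov_optab to movcc_optab change. The point is just that negation only means "no handler" if the sentinel happens to be zero.

```c
#include <stdbool.h>

/* Toy instruction codes.  The sentinel is deliberately NOT zero here.  */
enum toy_insn_code { TOY_CODE_FOR_movsicc, TOY_CODE_FOR_nothing };

enum toy_mode { TOY_SImode, TOY_TImode, TOY_NUM_MODES };

/* Toy handler table: a conditional move exists for SImode only.  */
static const enum toy_insn_code toy_movcc_table[TOY_NUM_MODES] = {
  [TOY_SImode] = TOY_CODE_FOR_movsicc,
  [TOY_TImode] = TOY_CODE_FOR_nothing,
};

static enum toy_insn_code
toy_optab_handler (enum toy_mode mode)
{
  return toy_movcc_table[mode];
}

/* Fragile: correct only when the sentinel has the value zero.  */
bool
skip_by_negation (enum toy_mode mode)
{
  return !toy_optab_handler (mode);
}

/* Robust: asks the question that was actually meant.  */
bool
skip_by_compare (enum toy_mode mode)
{
  return toy_optab_handler (mode) == TOY_CODE_FOR_nothing;
}
```

In this toy setup skip_by_negation gets both modes backwards (it skips SImode, which has a pattern, and keeps TImode, which has none), while skip_by_compare gives the intended answer regardless of how the enum is numbered.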
Re: [PATCH] Fix PR54240
On Tue, Aug 14, 2012 at 2:11 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote:

Replace the once vacuously true, and now vacuously false, test for existence of a conditional move instruction for a given mode, with one that actually checks what it's supposed to. Add a test case so we don't miss such things in future. The test is powerpc-specific. It would be good to have an i386 version of the test as well, if someone can help with that. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk?

Here is one which can go into gcc.target/mips :

/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-phiopt-details" } */

typedef struct s {
  int v;
  int b;
  struct s *l;
  struct s *r;
} S;

int foo(S *s)
{
  S *this;
  S *next;

  this = s;
  if (this->b)
    next = this->l;
  else
    next = this->r;

  return next->v;
}

/* { dg-final { scan-tree-dump "Hoisting adjacent loads" "phiopt1" } } */
/* { dg-final { cleanup-tree-dump "phiopt1" } } */
Re: [PATCH] Fix PR54240
On Tue, Aug 14, 2012 at 2:15 PM, Andrew Pinski pins...@gmail.com wrote: On Tue, Aug 14, 2012 at 2:11 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Replace the once vacuously true, and now vacuously false, test for existence of a conditional move instruction for a given mode, with one that actually checks what it's supposed to. Add a test case so we don't miss such things in future. The test is powerpc-specific. It would be good to have an i386 version of the test as well, if someone can help with that. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk? Here is one which can go into gcc.target/mips : /* { dg-do compile } */ /* { dg-options -O2 -fdump-tree-phiopt-details } */ Sorry the dg-options should be: /* { dg-options -O2 -fdump-tree-phiopt-details isa=4 } */ Thanks, Andrew typedef struct s { int v; int b; struct s *l; struct s *r; } S; int foo(S *s) { S *this; S *next; this = s; if (this-b) next = this-l; else next = this-r; return next-v; } /* { dg-final { scan-tree-dump Hoisting adjacent loads phiopt1 } } */ /* { dg-final { cleanup-tree-dump phiopt1 } } */
Re: PATCH [x86_64] PR20020 - 128 bit structs not targeted to TImode
Attached, is an updated patch (with change logs). The test cases are now in gcc.target/i386 and the target selection is dg-require-effective-target int128 only. Verified that the tests correctly detect the presence/lack of TImode support. - Gary Index: gcc/config/i386/i386.h === --- gcc/config/i386/i386.h (revision 190398) +++ gcc/config/i386/i386.h (working copy) @@ -1816,6 +1816,10 @@ do { \ #define BRANCH_COST(speed_p, predictable_p) \ (!(speed_p) ? 2 : (predictable_p) ? 0 : ix86_branch_cost) +/* An integer expression for the size in bits of the largest integer machine + mode that should actually be used. We allow pairs of registers. */ +#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode) + /* Define this macro as a C expression which is nonzero if accessing less than a word of memory (i.e. a `char' or a `short') is no faster than accessing a word of memory, i.e., if such access Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 190398) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,10 @@ +2012-08-14 Gary Funck g...@intrepid.com + + PR target/20020 + * config/i386/i386.h (MAX_FIXED_MODE_SIZE): Allow use of TImode + for use with appropriately sized structures and unions + on 64-bit (x86) targets. + 2012-08-14 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (enabled): Add comment with explanation Index: gcc/testsuite/gcc.target/i386/pr20020-1.c === --- gcc/testsuite/gcc.target/i386/pr20020-1.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr20020-1.c (revision 0) @@ -0,0 +1,23 @@ +/* Check that 128-bit struct's are represented as TImode values. 
*/ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-expand } */ + +struct shared_ptr_struct +{ + unsigned long long phase:48; + unsigned short thread:16; + void *addr; +}; +typedef struct shared_ptr_struct sptr_t; + +sptr_t S; + +sptr_t +sptr_result (void) +{ + return S; +} +/* { dg-final { scan-rtl-dump \\\(set \\\(reg:TI \[0-9\]* \\\[ retval \\\]\\\) expand } } */ +/* { dg-final { scan-rtl-dump \\\(set \\\(reg/i:TI 0 ax\\\) expand } } */ +/* { dg-final { cleanup-rtl-dump expand } } */ Index: gcc/testsuite/gcc.target/i386/pr20020-2.c === --- gcc/testsuite/gcc.target/i386/pr20020-2.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr20020-2.c (revision 0) @@ -0,0 +1,21 @@ +/* Check that 128-bit struct's are represented as TImode values. */ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-expand } */ + +struct shared_ptr_struct +{ + unsigned long long phase:48; + unsigned short thread:16; + void *addr; +}; +typedef struct shared_ptr_struct sptr_t; + +void +copy_sptr (sptr_t *dest, sptr_t src) +{ + *dest = src; +} + +/* { dg-final { scan-rtl-dump \\\(set \\\(reg:TI \[0-9\]* expand } } */ +/* { dg-final { cleanup-rtl-dump expand } } */ Index: gcc/testsuite/gcc.target/i386/pr20020-3.c === --- gcc/testsuite/gcc.target/i386/pr20020-3.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr20020-3.c (revision 0) @@ -0,0 +1,24 @@ +/* Check that 128-bit struct's are represented as TImode values. 
*/ +/* { dg-require-effective-target int128 } */ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-rtl-expand } */ + +struct shared_ptr_struct +{ + unsigned long long phase:48; + unsigned short thread:16; + void *addr; +}; +typedef struct shared_ptr_struct sptr_t; + +sptr_t sptr_1, sptr_2; + +void +copy_sptr (void) +{ + sptr_1 = sptr_2; +} + +/* { dg-final { scan-rtl-dump \\\(set \\\(reg:TI \[0-9\]* expand } } */ +/* { dg-final { scan-rtl-dump \\\(set \\\(mem/c:TI expand } } */ +/* { dg-final { cleanup-rtl-dump expand } } */ Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 190398) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,3 +1,10 @@ +2012-08-14 Gary Funck g...@intrepid.com + + PR target/20020 + * gcc.target/i386/pr20020-1.c: New. + * gcc.target/i386/pr20020-2.c: New. + * gcc.target/i386/pr20020-3.c: New. + 2012-08-14 Oleg Endo olege...@gcc.gnu.org PR target/52933
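As a small companion to the test cases in the patch above, the following sketch checks at compile time that the struct used in those tests really is 128 bits wide, and shows its bytes round-tripping through a 128-bit integer. It assumes an LP64 target with GCC's bit-field layout and the __int128 extension (roughly what dg-require-effective-target int128 guards); the helper names sptr_to_bits and sptr_from_bits are made up for the example. It says nothing about which machine mode the compiler picks, which is what the RTL dump scans in the real tests verify.

```c
#include <string.h>

/* Same layout as in the pr20020 tests: 48 + 16 + 64 bits.  */
struct shared_ptr_struct
{
  unsigned long long phase:48;
  unsigned short thread:16;
  void *addr;
};
typedef struct shared_ptr_struct sptr_t;

/* Build fails on targets where the assumption does not hold.  */
_Static_assert (sizeof (sptr_t) == 16,
                "sptr_t is expected to occupy exactly 128 bits");

/* Copy the struct's bytes into a 128-bit integer.  */
unsigned __int128
sptr_to_bits (sptr_t s)
{
  unsigned __int128 bits;
  memcpy (&bits, &s, sizeof bits);
  return bits;
}

/* Copy a 128-bit integer's bytes back into the struct.  */
sptr_t
sptr_from_bits (unsigned __int128 bits)
{
  sptr_t s;
  memcpy (&s, &bits, sizeof s);
  return s;
}
```

Because the struct has no padding under these assumptions, every field survives the round trip, which is the property that makes a single 128-bit register pair a natural home for such values.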
Re: [wwwdocs] Announce switch to C++
On Tue, 14 Aug 2012, Diego Novillo wrote: OK to install after the merge? Yep, this looks good. Thanks! Gerald
Re: [PATCH] Fix PR54240
Thanks, Andrew! Bill On Tue, 2012-08-14 at 14:17 -0700, Andrew Pinski wrote: On Tue, Aug 14, 2012 at 2:15 PM, Andrew Pinski pins...@gmail.com wrote: On Tue, Aug 14, 2012 at 2:11 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: Replace the once vacuously true, and now vacuously false, test for existence of a conditional move instruction for a given mode, with one that actually checks what it's supposed to. Add a test case so we don't miss such things in future. The test is powerpc-specific. It would be good to have an i386 version of the test as well, if someone can help with that. Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new regressions. Ok for trunk? Here is one which can go into gcc.target/mips : /* { dg-do compile } */ /* { dg-options -O2 -fdump-tree-phiopt-details } */ Sorry the dg-options should be: /* { dg-options -O2 -fdump-tree-phiopt-details isa=4 } */ Thanks, Andrew typedef struct s { int v; int b; struct s *l; struct s *r; } S; int foo(S *s) { S *this; S *next; this = s; if (this-b) next = this-l; else next = this-r; return next-v; } /* { dg-final { scan-tree-dump Hoisting adjacent loads phiopt1 } } */ /* { dg-final { cleanup-tree-dump phiopt1 } } */
[Patch, Fortran] PR50269 - C_LOC fixes
The main purpose of this patch is to allow elements of assumed-shape arrays (which are scalars) and assumed-rank arrays with C_LOC. There are several other issues with the current C_LOC handling (and with C_F_POINTER), but I want to fix the most important reject-valid issues first as they block a project I am interested in. Build and regtested on x86-64-linux. OK for the trunk? Tobias 2012-08-14 Tobias Burnus bur...@net-b.de PR fortran/50269 * interface.c (gfc_procedure_use): Alloc assumed-rank arrays as argument to C_LOC. * resolve.c (gfc_iso_c_func_interface): Allow elements of assumed-shape/deferred-shape arrays with C_LOC. 2012-08-14 Tobias Burnus bur...@net-b.de PR fortran/50269 * gfortran.dg/c_loc_tests_17.f90: New. diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c index 482c294..4097ecc 100644 --- a/gcc/fortran/interface.c +++ b/gcc/fortran/interface.c @@ -3151,6 +3151,7 @@ gfc_procedure_use (gfc_symbol *sym, gfc_actual_arglist **ap, locus *where) /* TS 29113, C407b. */ if (a-expr a-expr-expr_type == EXPR_VARIABLE + sym-intmod_sym_id != ISOCBINDING_LOC symbol_rank (a-expr-symtree-n.sym) == -1) { gfc_error (Assumed-rank argument requires an explicit interface diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index c706b89..8aa8de8 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -2874,6 +2874,7 @@ gfc_iso_c_func_interface (gfc_symbol *sym, gfc_actual_arglist *args, { gfc_ref *ref; bool seen_section; + gfc_array_spec *as = args-expr-symtree-n.sym-as; /* Make sure we have either the target or pointer attribute. 
*/ if (!arg_attr.target !arg_attr.pointer) @@ -2901,6 +2902,8 @@ gfc_iso_c_func_interface (gfc_symbol *sym, gfc_actual_arglist *args, { if (ref-type == REF_ARRAY) { + as = ref-u.ar.as; + if (ref-u.ar.type == AR_SECTION) seen_section = true; @@ -2953,9 +2956,9 @@ gfc_iso_c_func_interface (gfc_symbol *sym, gfc_actual_arglist *args, /* A non-allocatable target variable with C interoperable type and type parameters must be interoperable. */ - if (args_sym args_sym-attr.dimension) + if (args_sym args-expr-rank != 0) { - if (args_sym-as-type == AS_ASSUMED_SHAPE) + if (as-type == AS_ASSUMED_SHAPE) { gfc_error (Assumed-shape array '%s' at %L cannot be an argument to the @@ -2965,7 +2968,7 @@ gfc_iso_c_func_interface (gfc_symbol *sym, gfc_actual_arglist *args, (args-expr-where), sym-name); retval = FAILURE; } - else if (args_sym-as-type == AS_DEFERRED) + else if (as-type == AS_DEFERRED) { gfc_error (Deferred-shape array '%s' at %L cannot be an argument to the --- /dev/null 2012-08-08 07:41:43.631684108 +0200 +++ gcc/gcc/testsuite/gfortran.dg/c_loc_tests_17.f90 2012-08-14 23:11:37.0 +0200 @@ -0,0 +1,35 @@ +! { dg-do run } +! +! Check that C_LOC (assumed-rank) works (valid TS29113) +! and that ! taking an element of an assumed-shape/deferred-shape +! array works (valid since Fortran 2003) +! + +integer, target :: a(5) +a = [34, 7383, 378, 393, -3] +call foo ([11,22,33], [-3,-5,-8,-33], a, [11,22,33]) +contains +subroutine foo(x, y, z, val) + use iso_c_binding + integer :: val(:) + type(*), target :: x(..) + integer, target :: y(:) + integer, pointer, intent(in) :: z(:) + type(c_ptr) :: p + p = c_loc (x) + call check (p, val) + p = c_loc (y(1)) + call check (p, y) + p = c_loc (z(2)) + call check (p, z(2:)) +end subroutine foo + +subroutine check (p, val) + use iso_c_binding + type(c_ptr) :: p + integer :: val(:) + integer, pointer :: iptr(:) + call c_f_pointer (p, iptr, shape=shape(val)) + if (any (iptr /= val)) call abort () +end subroutine check +end
[PATCH] Fix PR54245
Currently we can insert an initializer that performs a multiply in too
small of a type for correctness.  For now, detect the problem and avoid
the optimization when this would happen.  Eventually I will fix this up to
cause the multiply to be performed in a sufficiently wide type.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions.  Ok for trunk?

Thanks,
Bill

gcc:

2012-08-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

	PR tree-optimization/54245
	* gimple-ssa-strength-reduction.c (legal_cast_p_1): New function.
	(legal_cast_p): Split out logic to legal_cast_p_1.
	(analyze_increments): Avoid introducing multiplies in smaller types.

gcc/testsuite:

2012-08-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

	PR tree-optimization/54245
	* gcc.dg/tree-ssa/pr54245.c: New test.

Index: gcc/testsuite/gcc.dg/tree-ssa/pr54245.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr54245.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr54245.c	(revision 0)
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-slsr-details" } */
+
+#include <stdio.h>
+
+#define W1  22725
+#define W2  21407
+#define W3  19266
+#define W6  8867
+
+void idct_row(short *row, int *dst)
+{
+    int a0, a1, b0, b1;
+
+    a0 = W1 * row[0];
+    a1 = a0;
+
+    a0 += W2 * row[2];
+    a1 += W6 * row[2];
+
+    b0 = W1 * row[1];
+    b1 = W3 * row[1];
+
+    dst[0] = a0 + b0;
+    dst[1] = a0 - b0;
+    dst[2] = a1 + b1;
+    dst[3] = a1 - b1;
+}
+
+static short block[8] = { 1, 2, 3, 4 };
+
+int main(void)
+{
+    int out[4];
+    int i;
+
+    idct_row(block, out);
+
+    for (i = 0; i < 4; i++)
+        printf("%d\n", out[i]);
+
+    return !(out[2] == 87858 && out[3] == 10794);
+}
+
+/* For now, disable inserting an initializer when the multiplication will
+   take place in a smaller type than originally.  This test may be deleted
+   in future when this case is handled more precisely.  */
+/* { dg-final { scan-tree-dump-times "Inserting initializer" 0 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */
Index: gcc/gimple-ssa-strength-reduction.c
===================================================================
--- gcc/gimple-ssa-strength-reduction.c	(revision 190305)
+++ gcc/gimple-ssa-strength-reduction.c	(working copy)
@@ -1089,6 +1089,32 @@ slsr_process_neg (gimple gs, tree rhs1, bool speed
   add_cand_for_stmt (gs, c);
 }
 
+/* Help function for legal_cast_p, operating on two trees.  Checks
+   whether it's allowable to cast from RHS to LHS.  See legal_cast_p
+   for more details.  */
+
+static bool
+legal_cast_p_1 (tree lhs, tree rhs)
+{
+  tree lhs_type, rhs_type;
+  unsigned lhs_size, rhs_size;
+  bool lhs_wraps, rhs_wraps;
+
+  lhs_type = TREE_TYPE (lhs);
+  rhs_type = TREE_TYPE (rhs);
+  lhs_size = TYPE_PRECISION (lhs_type);
+  rhs_size = TYPE_PRECISION (rhs_type);
+  lhs_wraps = TYPE_OVERFLOW_WRAPS (lhs_type);
+  rhs_wraps = TYPE_OVERFLOW_WRAPS (rhs_type);
+
+  if (lhs_size > rhs_size
+      || (rhs_wraps && !lhs_wraps)
+      || (rhs_wraps && lhs_wraps && rhs_size != lhs_size))
+    return false;
+
+  return true;
+}
+
 /* Return TRUE if GS is a statement that defines an SSA name from
    a conversion and is legal for us to combine with an add and multiply
    in the candidate table.  For example, suppose we have:
@@ -1129,28 +1155,11 @@ slsr_process_neg (gimple gs, tree rhs1, bool speed
 static bool
 legal_cast_p (gimple gs, tree rhs)
 {
-  tree lhs, lhs_type, rhs_type;
-  unsigned lhs_size, rhs_size;
-  bool lhs_wraps, rhs_wraps;
-
   if (!is_gimple_assign (gs)
       || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (gs)))
     return false;
 
-  lhs = gimple_assign_lhs (gs);
-  lhs_type = TREE_TYPE (lhs);
-  rhs_type = TREE_TYPE (rhs);
-  lhs_size = TYPE_PRECISION (lhs_type);
-  rhs_size = TYPE_PRECISION (rhs_type);
-  lhs_wraps = TYPE_OVERFLOW_WRAPS (lhs_type);
-  rhs_wraps = TYPE_OVERFLOW_WRAPS (rhs_type);
-
-  if (lhs_size > rhs_size
-      || (rhs_wraps && !lhs_wraps)
-      || (rhs_wraps && lhs_wraps && rhs_size != lhs_size))
-    return false;
-
-  return true;
+  return legal_cast_p_1 (gimple_assign_lhs (gs), rhs);
 }
 
 /* Given GS which is a cast to a scalar integer type, determine whether
@@ -1996,6 +2005,31 @@ analyze_increments (slsr_cand_t first_dep, enum ma
                        != POINTER_PLUS_EXPR)))
         incr_vec[i].cost = COST_NEUTRAL;
 
+      /* FORNOW: If we need to add an initializer, give up if a cast from
+         the candidate's type to its stride's type can lose precision.
+         This could eventually be handled better by expressly retaining the
+         result of a cast to a wider type in the stride.  Example:
+
+           short int _1;
+           _2 = (int) _1;
+           _3 = _2 * 10;
+           _4 = x + _3;    ADD: x + (10 * _1) : int
+           _5 = _2 * 15;
+           _6 = x + _3;    ADD: x + (15 * _1) : int
+
+         Right now replacing _6
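
[Editorial sketch, not part of the patch.] The precision problem described
above can be reproduced in plain C.  The following is a minimal illustration
under stated assumptions: mul_wide and mul_narrow are hypothetical names, and
the narrow variant only models what happens when a product is reduced to 16
bits before being widened; it is not code from
gimple-ssa-strength-reduction.c.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply in the wide type: the 16-bit operand is converted to 32 bits
   first, as in `_2 = (int) _1; _3 = _2 * 10;` from the comment above.  */
static int32_t
mul_wide (int16_t x, int32_t k)
{
  return (int32_t) x * k;
}

/* Multiply modelled in the narrow type: the product is reduced modulo
   2^16 and reinterpreted as a signed 16-bit value before widening.
   Unsigned arithmetic is used so the wrap-around is well defined.  */
static int32_t
mul_narrow (int16_t x, int32_t k)
{
  uint32_t prod = (uint32_t) (uint16_t) x * (uint32_t) (uint16_t) k;
  uint32_t low = prod & 0xffffu;

  /* Sign-extend the low 16 bits by hand to avoid implementation-defined
     conversions.  */
  return low >= 0x8000u ? (int32_t) low - 0x10000 : (int32_t) low;
}
```

With the constants from the test case, W1 * row[1] is 22725 * 2: mul_wide
gives 45450, while mul_narrow gives -20086 because 45450 does not fit in a
signed 16-bit value.  That difference is the kind of result the patch avoids
by refusing to insert the initializer.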
Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86
On 15/08/2012, at 7:39 AM, H.J. Lu wrote:

> On Tue, Aug 14, 2012 at 12:38 PM, H.J. Lu hjl.to...@gmail.com wrote:
>> On Thu, Aug 9, 2012 at 3:17 PM, Ian Lance Taylor i...@google.com wrote:
>>> On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu hongjiu...@intel.com wrote:
>>>> Bionic C library doesn't provide link.h.
>>>
>>> Does Bionic provide dl_iterate_phdr?  If it does, I'll just note in
>>> passing that it would be straightforward to simply incorporate the
>>> required types and constants in unwind-dw2-fde-dip.c directly, and
>>> avoid the #include.  If it doesn't, then of course nothing will make
>>> this code work correctly.
>>
>> dl_iterate_phdr is provided in libdl.so, which is always linked with
>> dynamic executables:
>>
>> #define ANDROID_LIB_SPEC \
>>   "%{!static: -ldl}"
>>
>> This patch fixes Android/x86 build on trunk.  OK to install?

[Adding David Turner to CC as the main Bionic expert.  Also reattaching
HJ's current patch so that David can easily look at it.  Link to the bug
report: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54209]

I think this patch will break MIPS Android build due to mismatch of
ElfW(type) when _MIPS_SZPTR == 64.

I think the right way to fix this is to make Bionic export link.h or
already-existing linker.h, but I defer to Ian for final judgement.

FWIW, I'm OK with using hard-coded definitions if link.h is absent, and
using definitions from link.h if it is there.  I.e.,

#ifdef HAVE_LINK_H
# include <link.h>
#else
YOUR PATCH
#endif

This would allow Bionic to eventually catch up and provide link.h
uniformly across all targets.

I've looked into latest Android NDK distribution, and the situation with
link.h is not uniform across targets.  ARM and x86 don't have link.h,
while MIPS does:

---
/*
   For building unwind-dw2-fde-glibc.c for MIPS frame unwinding,
   we need to have link.h that defines struct dl_phdr_info,
   ELFW(type), and dl_iterate_phdr().
*/

#include <sys/types.h>
#include <elf.h>

struct dl_phdr_info
{
    Elf32_Addr dlpi_addr;
    const char *dlpi_name;
    const Elf32_Phdr *dlpi_phdr;
    Elf32_Half dlpi_phnum;
};

#if _MIPS_SZPTR == 32
#define ElfW(type)	Elf32_##type
#elif _MIPS_SZPTR == 64
#define ElfW(type)	Elf64_##type
#endif

int dl_iterate_phdr(int (*cb)(struct dl_phdr_info *info, size_t size, void *data),
                    void *data);
---

I'm not 100% sure where the above link.h comes from for MIPS, but since
it's not present in Bionic sources, my guess is kernel's
arch/mips/include directory.  Checking ...  No, not from the kernel
sources.  Hm...

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics

>> 2012-08-14  H.J. Lu  hongjiu...@intel.com
>>
>> 	PR bootstrap/54209
>> 	* unwind-dw2-fde-dip.c (dl_phdr_info): New struct for Bionic C
>> 	library.
>> 	(ElfW): New macro for Bionic C library.
>> 	Don't include link.h for Bionic C library.
>
> Wrong patch.  Here is the right one.
>
> --
> H.J.

gcc-pr54209.patch
Description: Binary data
Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86
On Tue, Aug 14, 2012 at 3:47 PM, Maxim Kuvyrkov ma...@codesourcery.com wrote:

> I think this patch will break MIPS Android build due to mismatch of
> ElfW(type) when _MIPS_SZPTR == 64.
>
> I think the right way to fix this is to make Bionic export link.h or
> already-existing linker.h, but I defer to Ian for final judgement.

I think it would be better to export link.h.  I don't know how feasible
that is or how long it would take to become available.

> FWIW, I'm OK with using hard-coded definitions if link.h is absent, and
> using definitions from link.h if it is there.  I.e.,
>
> #ifdef HAVE_LINK_H
> # include <link.h>
> #else
> YOUR PATCH
> #endif

This is conceptually fine as long as we are clear that we are testing for
the presence of link.h on the target, not the host.  It can be hard for
libgcc to reliably test for the presence of target-specific header files.

Ian
Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86
On Tue, Aug 14, 2012 at 3:47 PM, Maxim Kuvyrkov ma...@codesourcery.com wrote:

> On 15/08/2012, at 7:39 AM, H.J. Lu wrote:
>
>> On Tue, Aug 14, 2012 at 12:38 PM, H.J. Lu hjl.to...@gmail.com wrote:
>>> On Thu, Aug 9, 2012 at 3:17 PM, Ian Lance Taylor i...@google.com wrote:
>>>> On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu hongjiu...@intel.com wrote:
>>>>> Bionic C library doesn't provide link.h.
>>>>
>>>> Does Bionic provide dl_iterate_phdr?  If it does, I'll just note in
>>>> passing that it would be straightforward to simply incorporate the
>>>> required types and constants in unwind-dw2-fde-dip.c directly, and
>>>> avoid the #include.  If it doesn't, then of course nothing will make
>>>> this code work correctly.
>>>
>>> dl_iterate_phdr is provided in libdl.so, which is always linked with
>>> dynamic executables:
>>>
>>> #define ANDROID_LIB_SPEC \
>>>   "%{!static: -ldl}"
>>>
>>> This patch fixes Android/x86 build on trunk.  OK to install?
>
> [Adding David Turner to CC as the main Bionic expert.  Also reattaching
> HJ's current patch so that David can easily look at it.  Link to the bug
> report: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54209]
>
> I think this patch will break MIPS Android build due to mismatch of
> ElfW(type) when _MIPS_SZPTR == 64.
>
> I think the right way to fix this is to make Bionic export link.h or
> already-existing linker.h, but I defer to Ian for final judgement.
>
> FWIW, I'm OK with using hard-coded definitions if link.h is absent, and
> using definitions from link.h if it is there.  I.e.,
>
> #ifdef HAVE_LINK_H
> # include <link.h>
> #else
> YOUR PATCH
> #endif
>
> This would allow Bionic to eventually catch up and provide link.h
> uniformly across all targets.
>
> I've looked into latest Android NDK distribution, and the situation with
> link.h is not uniform across targets.  ARM and x86 don't have link.h,
> while MIPS does:
>
> ---
> /*
>    For building unwind-dw2-fde-glibc.c for MIPS frame unwinding,
>    we need to have link.h that defines struct dl_phdr_info,
>    ELFW(type), and dl_iterate_phdr().
> */
>
> #include <sys/types.h>
> #include <elf.h>
>
> struct dl_phdr_info
> {
>     Elf32_Addr dlpi_addr;
>     const char *dlpi_name;
>     const Elf32_Phdr *dlpi_phdr;
>     Elf32_Half dlpi_phnum;
> };
>
> #if _MIPS_SZPTR == 32
> #define ElfW(type)	Elf32_##type
> #elif _MIPS_SZPTR == 64
> #define ElfW(type)	Elf64_##type
> #endif
>
> int dl_iterate_phdr(int (*cb)(struct dl_phdr_info *info, size_t size, void *data),
>                     void *data);
> ---
>
> I'm not 100% sure where the above link.h comes from for MIPS, but since
> it's not present in Bionic sources, my guess is kernel's
> arch/mips/include directory.  Checking ...  No, not from the kernel
> sources.  Hm...
>
> --

Bionic is a 32-bit library.  I don't know how _MIPS_SZPTR == 64 works
with Bionic on mips.

--
H.J.
Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86
On Tue, Aug 14, 2012 at 4:27 PM, Ian Lance Taylor i...@google.com wrote:

> On Tue, Aug 14, 2012 at 3:47 PM, Maxim Kuvyrkov ma...@codesourcery.com wrote:
>>
>> I think this patch will break MIPS Android build due to mismatch of
>> ElfW(type) when _MIPS_SZPTR == 64.
>>
>> I think the right way to fix this is to make Bionic export link.h or
>> already-existing linker.h, but I defer to Ian for final judgement.
>
> I think it would be better to export link.h.  I don't know how feasible
> that is or how long it would take to become available.

Pavel, how long does it take to export link.h for Android/x86?

>> FWIW, I'm OK with using hard-coded definitions if link.h is absent, and
>> using definitions from link.h if it is there.  I.e.,
>>
>> #ifdef HAVE_LINK_H
>> # include <link.h>
>> #else
>> YOUR PATCH
>> #endif
>
> This is conceptually fine as long as we are clear that we are testing
> for the presence of link.h on the target, not the host.  It can be hard
> for libgcc to reliably test for the presence of target-specific header
> files.

That is also my concern.

--
H.J.
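
[Editorial sketch, not part of the thread.] The HAVE_LINK_H fallback
discussed in this thread could look roughly like the following.  This is only
a sketch under stated assumptions: the contents of "YOUR PATCH" are not shown
in the thread, so the fallback branch here is hypothetical; it assumes an ELF
target that ships <elf.h>, a compiler that predefines __SIZEOF_POINTER__ (GCC
and Clang do), and that HAVE_LINK_H would be set by a configure check run
against the target's headers.  elfw_addr_matches_pointer is an illustrative
helper, not an existing libgcc function.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_LINK_H
# include <link.h>   /* The target C library supplies dl_phdr_info and ElfW.  */
#else
# include <elf.h>

/* Hypothetical fallback: choose the ELF class from the target pointer
   size instead of hard-coding Elf32_*, so that the structure members and
   ElfW(type) cannot disagree.  */
# if __SIZEOF_POINTER__ == 8
#  define ElfW(type) Elf64_##type
# else
#  define ElfW(type) Elf32_##type
# endif

struct dl_phdr_info
{
  ElfW(Addr) dlpi_addr;
  const char *dlpi_name;
  const ElfW(Phdr) *dlpi_phdr;
  ElfW(Half) dlpi_phnum;
};

extern int dl_iterate_phdr (int (*cb) (struct dl_phdr_info *info,
                                       size_t size, void *data),
                            void *data);
#endif

/* Sanity check usable on either branch: an ELF address should be as wide
   as a target pointer.  */
static int
elfw_addr_matches_pointer (void)
{
  return sizeof (ElfW(Addr)) == sizeof (void *);
}
```

Writing the structure in terms of ElfW(type) is one way to avoid the
Elf32/Elf64 mismatch Maxim points out; whether the header is detected
reliably for the target rather than the host remains the open question Ian
and H.J. raise.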
[contrib] Add .xfail file for x86_64
This patch adds an xfail manifest for trunk for x86_64 builds.  I find
this useful to determine whether my patch has introduced new failures.
The failures in this manifest are always present in trunk and deciding
what to ignore is not very straightforward.

I will keep maintaining this manifest out of clean builds.  They are not
hard to maintain.  Manifest files can be generated by going to the top
of the build directory and typing:

$ cd <top-of-bld-dir>
$ <path-to-src>/contrib/testsuite-management/validate_failures.py --produce_manifest

This will generate a .xfail file with the triple name of the target you
just built.

Once this file exists you can run the validator again on the build
directory with no arguments.  It should produce the output:

$ cd <top-of-bld-dir>
$ <path-to-src>/contrib/testsuite-management/validate_failures.py
Source directory: <path-to-src>
Build target:     x86_64-unknown-linux-gnu
Manifest:         <path-to-src>/contrib/testsuite-management/x86_64-unknown-linux-gnu.xfail
Getting actual results from build
	./x86_64-unknown-linux-gnu/libstdc++-v3/testsuite/libstdc++.sum
	./x86_64-unknown-linux-gnu/libffi/testsuite/libffi.sum
	./x86_64-unknown-linux-gnu/libgomp/testsuite/libgomp.sum
	./x86_64-unknown-linux-gnu/libgo/libgo.sum
	./x86_64-unknown-linux-gnu/boehm-gc/testsuite/boehm-gc.sum
	./x86_64-unknown-linux-gnu/libatomic/testsuite/libatomic.sum
	./x86_64-unknown-linux-gnu/libmudflap/testsuite/libmudflap.sum
	./x86_64-unknown-linux-gnu/libitm/testsuite/libitm.sum
	./x86_64-unknown-linux-gnu/libjava/testsuite/libjava.sum
	./gcc/testsuite/g++/g++.sum
	./gcc/testsuite/gnat/gnat.sum
	./gcc/testsuite/ada/acats/acats.sum
	./gcc/testsuite/gcc/gcc.sum
	./gcc/testsuite/gfortran/gfortran.sum
	./gcc/testsuite/obj-c++/obj-c++.sum
	./gcc/testsuite/go/go.sum
	./gcc/testsuite/objc/objc.sum

SUCCESS: No unexpected failures.

If the output shows new failures, you investigate them.  If they are not
yours, you can add them to the xfail manifest (after reporting them) and
then commit the modified .xfail file.

Long term, I would like to have this script pull manifest files from
postings made to gcc-testresults.  This way, we won't have to maintain
these .xfail files manually.  In branches this is not a big problem, but
in trunk it may be a tad annoying.

Committed to trunk.

2012-08-14  Diego Novillo  dnovi...@google.com

	* testsuite-management/x86_64-unknown-linux-gnu.xfail: New.
	* testsuite-management/validate_failures.py (ExpirationDate): Tidy

Index: testsuite-management/x86_64-unknown-linux-gnu.xfail
===================================================================
--- testsuite-management/x86_64-unknown-linux-gnu.xfail	(revision 0)
+++ testsuite-management/x86_64-unknown-linux-gnu.xfail	(revision 0)
@@ -0,0 +1,78 @@
+FAIL: gcc.dg/attr-weakref-1.c (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -O0 (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -O0 (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -O1 (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -O1 (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -O2 (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -O2 (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -O3 -fomit-frame-pointer (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -O3 -fomit-frame-pointer (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -O3 -g (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -O3 -g (test for excess errors)
+FAIL: gcc.dg/torture/pr51106-2.c -Os (internal compiler error)
+FAIL: gcc.dg/torture/pr51106-2.c -Os (test for excess errors)
+FAIL: gfortran.dg/lto/pr45586 f_lto_pr45586_0.o-f_lto_pr45586_0.o link, -O0 -flto -flto-partition=1to1 -fno-use-linker-plugin (internal compiler error)
+FAIL: gfortran.dg/lto/pr45586 f_lto_pr45586_0.o-f_lto_pr45586_0.o link, -O0 -flto -flto-partition=none -fuse-linker-plugin (internal compiler error)
+FAIL: gfortran.dg/lto/pr45586 f_lto_pr45586_0.o-f_lto_pr45586_0.o link, -O0 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error)
+FAIL: gfortran.dg/lto/pr45586-2 f_lto_pr45586-2_0.o-f_lto_pr45586-2_0.o link, -O0 -flto -flto-partition=1to1 -fno-use-linker-plugin (internal compiler error)
+FAIL: gfortran.dg/lto/pr45586-2 f_lto_pr45586-2_0.o-f_lto_pr45586-2_0.o link, -O0 -flto -flto-partition=none -fuse-linker-plugin (internal compiler error)
+FAIL: gfortran.dg/lto/pr45586-2 f_lto_pr45586-2_0.o-f_lto_pr45586-2_0.o link, -O0 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error)
+FAIL: gnat.dg/array11.adb (test for warnings,
[PATCH/MIPS] Use ins/dins instruction when written manually
Hi,
  Right now we only produce ins when a zero_extract is used on the
right hand side.  We can do better by adding some patterns which combine
for the ins instruction.  This patch adds those patterns and a testcase
which shows a simple example where the code is improved.

OK?  Bootstrapped and tested on mips64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:

Andrew Pinski  apin...@cavium.com
Adam Nemet  ane...@caviumnetworks.com

	* config/mips/mips-protos.h (mips_bitmask, mips_bitmask_p,
	mips_bottom_bitmask_p): Declare them.
	* config/mips/mips.c (mips_bitmask, mips_bitmask_p,
	mips_bottom_bitmask_p): New functions.
	* config/mips/mips.md (*insv<mode>_internal1): New pattern to
	match a bottom insert.
	(*insv<mode>_internal2): Likewise.
	(*insv<mode>_internal3): New pattern to match an insert.
	(*insv<mode>_internal4): Likewise.
	* config/mips/predicates.md (bitmask_operand,
	bottom_bitmask_operand, inverse_bitmask_operand): New predicates.

	* testsuite/gcc.target/mips/ins-4.c: New testcase.

Index: testsuite/gcc.target/mips/ins-4.c
===================================================================
--- testsuite/gcc.target/mips/ins-4.c	(revision 0)
+++ testsuite/gcc.target/mips/ins-4.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-options "-O2 isa_rev>=2 -mgp64" } */
+/* { dg-final { scan-assembler-times "ins\t" 2 } } */
+/* { dg-final { scan-assembler-not "or\t" } } */
+/* { dg-final { scan-assembler-not "cins\t" } } */
+
+#define shift 0
+#define mask (0xfull<<shift)
+
+/* Check that simple ins are produced by manually doing
+   bitfield insertations (no shifts in this case). */
+
+NOMIPS16 int f(int a, int b)
+{
+  a = (a&~mask) | ((b<<shift)&mask);
+  return a;
+}
+
+NOMIPS16 long long fll(long long a, long long b)
+{
+  a = (a&~mask) | ((b<<shift)&mask);
+  return a;
+}
Index: config/mips/predicates.md
===================================================================
--- config/mips/predicates.md	(revision 190403)
+++ config/mips/predicates.md	(working copy)
@@ -105,6 +105,22 @@ (define_predicate "low_bitmask_operand"
        (match_code "const_int")
        (match_test "low_bitmask_len (mode, INTVAL (op)) > 16")))
 
+(define_predicate "bitmask_operand"
+  (and (match_code "const_int")
+       (and (not (match_operand 0 "uns_arith_operand"))
+            (match_test "mips_bitmask_p (INTVAL (op))"))))
+
+(define_predicate "bottom_bitmask_operand"
+  (and (match_code "const_int")
+       (and (match_operand 0 "bitmask_operand")
+            (match_test "mips_bottom_bitmask_p (INTVAL (op))"))))
+
+(define_predicate "inverse_bitmask_operand"
+  (and (match_code "const_int")
+       (and (not (match_operand 0 "uns_arith_operand"))
+            (and (not (match_operand 0 "bottom_bitmask_operand"))
+                 (match_test "mips_bitmask_p (~ INTVAL (op))")))))
+
 (define_predicate "and_reg_operand"
   (ior (match_operand 0 "register_operand")
        (and (not (match_test "TARGET_MIPS16"))
Index: config/mips/mips.md
===================================================================
--- config/mips/mips.md	(revision 190403)
+++ config/mips/mips.md	(working copy)
@@ -3903,6 +3903,112 @@ (define_insn "*cins<mode>"
   [(set_attr "type" "shift")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "*insv<mode>_internal1"
+  [(set (match_operand:GPR 0 "register_operand" "=d")
+        (ior:GPR (and:GPR (match_operand:GPR 1 "register_operand" "0")
+                          (match_operand:GPR 2 "const_int_operand" "i"))
+                 (and:GPR (match_operand:GPR 3 "register_operand" "d")
+                          (match_operand:GPR 4 "const_int_operand" "i"))))]
+  "ISA_HAS_EXT_INS && mips_bottom_bitmask_p (INTVAL (operands[4]))
+   && INTVAL(operands[2]) == ~INTVAL(operands[4])"
+{
+  int len, pos;
+  pos = mips_bitmask (INTVAL (operands[4]), &len, <MODE>mode);
+  operands[2] = GEN_INT (pos);
+  operands[4] = GEN_INT (len);
+  return "<d>ins\t%0,%3,%2,%4";
+}
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*insv<mode>_internal2"
+  [(set (match_operand:GPR 0 "register_operand" "=d")
+        (ior:GPR (and:GPR (match_operand:GPR 1 "register_operand" "d")
+                          (match_operand:GPR 2 "const_int_operand" "i"))
+                 (and:GPR (match_operand:GPR 3 "register_operand" "0")
+                          (match_operand:GPR 4 "const_int_operand" "i"))))]
+  "ISA_HAS_EXT_INS && mips_bottom_bitmask_p (INTVAL (operands[2]))
+   && INTVAL(operands[2]) == ~INTVAL(operands[4])"
+{
+  int len, pos;
+  pos = mips_bitmask (INTVAL (operands[2]), &len, <MODE>mode);
+  operands[2] = GEN_INT (pos);
+  operands[4] = GEN_INT (len);
+  return "<d>ins\t%0,%1,%2,%4";
+}
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*insv<mode>_internal3"
+  [(set (match_operand:GPR 0 "register_operand" "=d")
+        (ior:GPR (and:GPR (match_operand:GPR 1 "register_operand" "0")
+                          (match_operand:GPR 2 "const_int_operand" "i"))
+                 (and:GPR (ashift:GPR
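
[Editorial sketch, not part of the patch.] The idiom the new patterns are
meant to recognize, `(a & ~mask) | ((b << shift) & mask)`, is a bit-field
insert when mask is one contiguous run of ones starting at bit `shift`.  The
C below illustrates that equivalence.  All four function names are
hypothetical; in particular they are not the patch's mips_bitmask,
mips_bitmask_p or mips_bottom_bitmask_p, whose bodies are not shown in this
message.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if MASK is a single contiguous run of 1 bits.  Adding the
   lowest set bit carries through the whole run, so no bit of the sum can
   overlap the original mask.  */
static int
is_contiguous_mask (uint64_t mask)
{
  if (mask == 0)
    return 0;
  uint64_t lowest = mask & (~mask + 1);
  return ((mask + lowest) & mask) == 0;
}

/* Split a contiguous, nonzero MASK into a start position (returned) and
   a length (stored in *LEN).  Callers must check is_contiguous_mask
   first; a zero mask would loop forever.  */
static int
mask_pos_len (uint64_t mask, int *len)
{
  int pos = 0, n = 0;
  while ((mask & 1) == 0)
    {
      mask >>= 1;
      pos++;
    }
  while (mask & 1)
    {
      mask >>= 1;
      n++;
    }
  *len = n;
  return pos;
}

/* The open-coded form written in the test case.  */
static uint64_t
insert_open_coded (uint64_t a, uint64_t b, uint64_t mask, int shift)
{
  return (a & ~mask) | ((b << shift) & mask);
}

/* A plain-C model of an insert: replace LEN bits of A starting at POS
   with the low LEN bits of B.  */
static uint64_t
insert_field (uint64_t a, uint64_t b, int pos, int len)
{
  uint64_t field = len == 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  return (a & ~(field << pos)) | ((b & field) << pos);
}
```

For mask 0xf0 and shift 4, mask_pos_len yields pos 4 and len 4, and both
insert functions map a = 0x1234, b = 0xAB to 0x12B4.  The test case in the
patch uses shift 0, which is the "bottom insert" case.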
Go patch committed: Update for C++
This patch to the Go frontend, from Diego, updates it for the conversion
of GCC to building with C++.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 681a1ae3f72c go/expressions.cc
--- a/go/expressions.cc	Fri Aug 10 21:07:57 2012 -0700
+++ b/go/expressions.cc	Tue Aug 14 19:38:07 2012 -0700
@@ -10,11 +10,6 @@
 
 #include <gmp.h>
 
-#ifndef ENABLE_BUILD_WITH_CXX
-extern "C"
-{
-#endif
-
 #include "toplev.h"
 #include "intl.h"
 #include "tree.h"
@@ -24,10 +19,6 @@
 #include "real.h"
 #include "realmpfr.h"
 
-#ifndef ENABLE_BUILD_WITH_CXX
-}
-#endif
-
 #include "go-c.h"
 #include "gogo.h"
 #include "types.h"
diff -r 681a1ae3f72c go/gogo-tree.cc
--- a/go/gogo-tree.cc	Fri Aug 10 21:07:57 2012 -0700
+++ b/go/gogo-tree.cc	Tue Aug 14 19:38:07 2012 -0700
@@ -8,11 +8,6 @@
 
 #include <gmp.h>
 
-#ifndef ENABLE_BUILD_WITH_CXX
-extern "C"
-{
-#endif
-
 #include "toplev.h"
 #include "tree.h"
 #include "gimple.h"
@@ -22,12 +17,8 @@
 #include "convert.h"
 #include "output.h"
 #include "diagnostic.h"
+#include "go-c.h"
 
-#ifndef ENABLE_BUILD_WITH_CXX
-}
-#endif
-
-#include "go-c.h"
 #include "types.h"
 #include "expressions.h"
 #include "statements.h"
diff -r 681a1ae3f72c go/types.cc
--- a/go/types.cc	Fri Aug 10 21:07:57 2012 -0700
+++ b/go/types.cc	Tue Aug 14 19:38:07 2012 -0700
@@ -8,11 +8,6 @@
 
 #include <gmp.h>
 
-#ifndef ENABLE_BUILD_WITH_CXX
-extern "C"
-{
-#endif
-
 #include "toplev.h"
 #include "intl.h"
 #include "tree.h"
@@ -20,10 +15,6 @@
 #include "real.h"
 #include "convert.h"
 
-#ifndef ENABLE_BUILD_WITH_CXX
-}
-#endif
-
 #include "go-c.h"
 #include "gogo.h"
 #include "operator.h"
Re: [PATCH, MIPS] 74k madd scheduler tweaks
On 14/08/2012, at 7:08 PM, Richard Sandiford wrote:

> OK with those changes, thanks.

Checked in with the noted changes and one bug fix.

It turns out that mips_linked_madd_p is also called via
mips_macc_chains_reorder, which may pass a (use ...) instruction, which
causes get_attr_* to blow up.  Fixed by adding

  if (recog_memoized (in_insn) < 0)
    return false;

at the beginning of mips_linked_madd_p.

Richard, thanks for the review, and thanks to Sandra for testing the patch.

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics