Contents of PO file 'cpplib-4.9-b20140202.ja.po'
Attachment: cpplib-4.9-b20140202.ja.po.gz (binary data)

The Translation Project robot, in the name of your translation coordinator.
<coordina...@translationproject.org>
New Japanese PO file for 'cpplib' (version 4.9-b20140202)
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted by the Japanese team of translators. The file is available at:

    http://translationproject.org/latest/cpplib/ja.po

(This file, 'cpplib-4.9-b20140202.ja.po', has just now been sent to you in a separate email.)

All other PO files for your package are available in:

    http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether official or a pretest.

Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context.

The following HTML page has been updated:

    http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the name of your translation coordinator.
<coordina...@translationproject.org>
Re: [PATCH] RTEMS: select SPARC multilibs
Hello Daniel,

thanks for the patch.

On 06/11/14 16:36, Daniel Hellstrom wrote:
> Recent support for mcpu=leon3v7 and muser-mode were added to GCC. Update the RTEMS multilib for sparc to the following combinations:
>
>   v7                               -> ./
>   leon3 muser-mode                 -> leon3/user-mode/
>   leon3v7 muser-mode               -> leon3v7/user-mode/
>   v8                               -> v8/
>   v7 soft-float                    -> soft/
>   leon3 soft-float muser-mode      -> soft/leon3/user-mode/
>   leon3v7 soft-float muser-mode    -> soft/leon3v7/user-mode/
>   v8 soft-float                    -> soft/v8/
>
> I think this would be good for 4.8, 4.9 and trunk.
>
> 2014-11-06  Daniel Hellstrom  <dan...@gaisler.com>
>
>     * config.gcc (sparc-*-rtems*): Clean away unused t-elf
>     * config/sparc/t-rtems: Add leon3v7 and muser-mode multilibs

I tested this patch with the GCC 4.9 branch and it yields exactly the set of multilibs we need in RTEMS to support the currently available hardware. I compiled the RTEMS testsuite with this compiler and ran it on an NGMP board. The results are all right.

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.
the nvptx port
Hi Bernd, reading the patches, it seems like there is no mention of sm_35, only sm_30. So, I'm wondering what 'sub'targets will initially be supported, and if/how/when various processors will be selected. Thanks, Joost
Re: [PATCH] RTEMS: select SPARC multilibs
> I think this would be good for 4.8, 4.9 and trunk.
>
> 2014-11-06  Daniel Hellstrom  <dan...@gaisler.com>
>
>     * config.gcc (sparc-*-rtems*): Clean away unused t-elf
>     * config/sparc/t-rtems: Add leon3v7 and muser-mode multilibs

OK everywhere as far as I'm concerned but the RTEMS folks have the final say.

-- 
Eric Botcazou
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
I worked on what I suspect is similar stuff.

I ran into this problem (pardon me if my terminology is wrong): PLT thunks for nested functions trashed registers that were in use. My solution was to mark them hidden, or whatever the term is for not replaceable. They were also not exported, but I recall that not replaceable is the more important property.

 - Jay

On Nov 6, 2014, at 11:38 PM, Richard Henderson <r...@redhat.com> wrote:
> On 11/06/2014 06:45 PM, Ian Taylor wrote:
> > On Thu, Nov 6, 2014 at 5:04 AM, Richard Henderson <r...@redhat.com> wrote:
> > > That said, this *may* not actually be a problem. It's not the direct (possibly lazy bound) call into libffi that needs a static chain, it's the indirect call that libffi produces.
> > >
> > > And the indirect calls that Go produces. I'm pretty sure that there are no dynamically linked Go calls that require the static chain. They're used for closures, which are either fully indirect from a different translation unit, or locally bound closures through which the optimizer has seen the construction, and optimized to a direct call.
> > >
> > > Ian, have I missed a case where a closure could wind up with a direct call to a lazy bound function?
> >
> > I think you've covered all the cases. The closure value is only required when calling a nested function. There is no way to refer directly to a nested function defined in a different shared library. The only way you can get such a reference is if some function in that shared library returns it.
>
> Sorry, I wasn't clear. I know nested functions must be local. I'm asking about Go closures, supposing we go ahead with the change to make them use the static chain register.
>
> I'm merely pretty sure that calling a closure is either fully indirect or local direct. Certainly there are cases in the testsuite where -O3 is able to look through the creation of a closure and have a direct call to the function.
>
> Given that closures are custom created for the data at the creation site, it seems unlikely that the optimizer could look through that and come up with a dynamically bound function.
>
> r~
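A minimal C sketch of the "fully indirect" closure call discussed in this thread. It is an illustration only, not gofrontend or libffi code: the names `closure`, `add_captured` and `call_closure` are hypothetical, and the closure pointer is passed as an ordinary first argument where the proposal would pass it in the static chain register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical closure layout: a code pointer followed by captured data. */
struct closure {
    int (*fn)(const struct closure *self, int arg);
    int captured;
};

/* Body of the closure; it reaches its captured value through the context
   pointer, which stands in for the static chain here. */
static int add_captured(const struct closure *self, int arg)
{
    return self->captured + arg;
}

/* Fully indirect call: the callee is reached only through the pointer
   stored in the closure value. */
static int call_closure(const struct closure *c, int arg)
{
    assert(c != NULL && c->fn != NULL);
    return c->fn(c, arg);
}
```

When the optimizer can see the construction of the closure, the call in `call_closure` can be turned into a direct call to `add_captured`, which is the "local direct" case mentioned above.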
Re: [gofrontend-dev] [PATCH 4/4] Gccgo port to s390[x] -- part II
On Thu, Nov 06, 2014 at 09:06:18AM -0800, Ian Taylor wrote:
> On Thu, Nov 6, 2014 at 4:04 AM, Dominik Vogt <v...@linux.vnet.ibm.com> wrote:
> > On Tue, Nov 04, 2014 at 08:16:51PM -0800, Ian Taylor wrote:
> > > The way to do it is not by copying the test. If the test needs to be customized, add additional files that use // +build lines to pick which files are built. Move them into a directory, like method4.go or other tests that use rundir.
> >
> > Currently go-test.exp does not look at the build lines of the files in subdirectories. Before I start adding that to the gcc testsuite, is it certain that the golang testsuite will be able to understand that and compile only the requested files?
>
> Hmmm, that is a good point. The testsuite doesn't use the go command to build the files in subdirectories, so it won't honor the +build lines. I didn't think of that. Sorry for pointing you in the wrong direction.

That's no problem, I can enhance go-test.exp in GCC. The question is whether test cases extended in such a way would run in the master Go repository too. Are the tests there run with the Go tool?

Ciao

Dominik ^_^  ^_^

-- 
Dominik Vogt
IBM Germany
[PATCH][12/n] Merge from match-and-simplify, pointer-plus patterns and forwprop re-org
This interleaves stmt folding and manual simplifications done in forwprop into a single loop over all basic-blocks. It somewhat complicates things as we need to make sure the lattice stays valid when releasing SSA names from old code or when purging dead EH edges (which we now delay). But this ensures we don't regress by exposing dependences between the transforms still in forwprop and those we moved to patterns.

This patch also goes forward and implements the POINTER_PLUS_EXPR patterns from tree-ssa-forwprop.c as patterns.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-07  Richard Biener  <rguent...@suse.de>

    * match.pd: Add patterns for POINTER_PLUS_EXPR association
    and special patterns from tree-ssa-forwprop.c
    * fold-const.c (fold_binary_loc): Remove them here.
    * tree-ssa-forwprop.c (to_purge): New global bitmap.
    (fwprop_set_lattice_val): New function.
    (fwprop_invalidate_lattice): Likewise.
    (remove_prop_source_from_use): Instead of purging dead EH
    edges record blocks to do that in to_purge.
    (tidy_after_forward_propagate_addr): Likewise.
    (forward_propagate_addr_expr): Invalidate the lattice for
    SSA names we release.
    (simplify_conversion_from_bitmask): Likewise.
    (simplify_builtin_call): Likewise.
    (associate_pointerplus_align): Remove.
    (associate_pointerplus_diff): Likewise.
    (associate_pointerplus): Likewise.
    (fold_all_stmts): Merge with ...
    (pass_forwprop::execute): ... the original loop over all
    basic-blocks.  Delay purging dead EH edges and invalidate
    the lattice for SSA names we release.

Index: trunk/gcc/fold-const.c
===================================================================
*** trunk.orig/gcc/fold-const.c 2014-11-06 10:46:21.679593734 +0100
--- trunk/gcc/fold-const.c 2014-11-06 10:49:46.722584761 +0100
*************** fold_binary_loc (location_t loc,
*** 10009,10018 ****
        return NULL_TREE;
  
      case POINTER_PLUS_EXPR:
-       /* 0 +p index -> (type)index */
-       if (integer_zerop (arg0))
-         return non_lvalue_loc (loc, fold_convert_loc (loc, type, arg1));
- 
        /* INT +p INT -> (PTR)(INT + INT).  Stripping types allows for this. */
        if (INTEGRAL_TYPE_P (TREE_TYPE (arg1))
            && INTEGRAL_TYPE_P (TREE_TYPE (arg0)))
--- 10009,10014 ----
*************** fold_binary_loc (location_t loc,
*** 10023,10041 ****
                                    fold_convert_loc (loc, sizetype, arg0)));
  
-       /* (PTR +p B) +p A -> PTR +p (B + A) */
-       if (TREE_CODE (arg0) == POINTER_PLUS_EXPR)
-         {
-           tree inner;
-           tree arg01 = fold_convert_loc (loc, sizetype, TREE_OPERAND (arg0, 1));
-           tree arg00 = TREE_OPERAND (arg0, 0);
-           inner = fold_build2_loc (loc, PLUS_EXPR, sizetype,
-                                    arg01, fold_convert_loc (loc, sizetype, arg1));
-           return fold_convert_loc (loc, type,
-                                    fold_build_pointer_plus_loc (loc,
-                                                                 arg00, inner));
-         }
- 
        /* PTR_CST +p CST -> CST1 */
        if (TREE_CODE (arg0) == INTEGER_CST && TREE_CODE (arg1) == INTEGER_CST)
          return fold_build2_loc (loc, PLUS_EXPR, type, arg0,
--- 10019,10024 ----
Index: trunk/gcc/match.pd
===================================================================
*** trunk.orig/gcc/match.pd 2014-11-06 10:46:25.850593551 +0100
--- trunk/gcc/match.pd 2014-11-07 09:44:16.460975860 +0100
*************** along with GCC; see the file COPYING3.
*** 39,44 ****
--- 39,49 ----
    (op @0 integer_zerop)
    (non_lvalue @0)))
  
+ /* 0 +p index -> (type)index */
+ (simplify
+  (pointer_plus integer_zerop @1)
+  (non_lvalue (convert @1)))
+ 
  /* Simplify x - x.
     This is unsafe for certain floats even in non-IEEE formats.
     In IEEE, it is unsafe because it does wrong for NaNs.
*************** along with GCC; see the file COPYING3.
*** 228,246 ****
         && TYPE_PRECISION (TREE_TYPE (@1)) == 1)
    (le @0 @1)))
  
- /* From tree-ssa-forwprop.c:simplify_not_neg_expr.  */
- 
  /* ~~x -> x */
  (simplify
    (bit_not (bit_not @0))
    @0)
  
- /* The corresponding (negate (negate @0)) -> @0 is in match-plusminus.pd.  */
  (simplify
    (negate (negate @0))
    @0)
  
  /* Simplifications of conversions.  */
  
  /* Basic strip-useless-type-conversions / strip_nops.  */
--- 233,282 ----
         && TYPE_PRECISION (TREE_TYPE (@1)) == 1)
    (le @0 @1)))
  
  /* ~~x -> x */
  (simplify
    (bit_not (bit_not @0))
    @0)
  
  (simplify
    (negate (negate @0))
    @0)
  
+ /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
+ (simplify
+   (pointer_plus (pointer_plus @0 @1) @3)
+   (pointer_plus @0 (plus @1 @3)))
+ 
+ /* Pattern match
+      tem1 = (long) ptr1;
+      tem2 =
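The association pattern can be pictured in plain C. This is only a source-level analogy, assuming `char *` arithmetic stands in for POINTER_PLUS_EXPR with a sizetype offset; `assoc_before` and `assoc_after` are hypothetical names, not GCC functions.

```c
#include <assert.h>
#include <stddef.h>

/* Shape matched by the pattern: (p +p off1) +p off2. */
static char *assoc_before(char *p, size_t off1, size_t off2)
{
    assert(p != NULL);
    return (p + off1) + off2;
}

/* Shape produced by the pattern: p +p (off1 + off2). */
static char *assoc_after(char *p, size_t off1, size_t off2)
{
    assert(p != NULL);
    return p + (off1 + off2);
}
```

For offsets that stay inside one object both functions return the same address; the second form needs a single pointer addition, and the offset sum can be folded when both offsets are constants.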
Re: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target
hi, the ARM bootstrap seems to fail for libgcc2.c on the thumb multilib for libgcc2: muldi3 -mthumb -O2 -g /tmp/ccYrycUw.s: Assembler messages: /tmp/ccYrycUw.s:69: Error: MOV Rd, Rs with two low registers is not permitted on this architecture -- `mov r6,r7' preprocessed attached. Thanks Christian typedef int ptrdiff_t; typedef unsigned int size_t; typedef unsigned int wchar_t; typedef struct { long long __max_align_ll __attribute__((__aligned__(__alignof__(long long; long double __max_align_ld __attribute__((__aligned__(__alignof__(long double; } max_align_t; extern void *malloc (size_t); extern void free (void *); extern int atexit (void (*)(void)); extern void abort (void) __attribute__ ((__noreturn__)); extern size_t strlen (const char *); extern void *memcpy (void *, const void *, size_t); extern void *memset (void *, int, size_t); typedef unsigned int hashval_t; typedef hashval_t (*htab_hash) (const void *); typedef int (*htab_eq) (const void *, const void *); typedef void (*htab_del) (void *); typedef int (*htab_trav) (void **, void *); typedef void *(*htab_alloc) (size_t, size_t); typedef void (*htab_free) (void *); typedef void *(*htab_alloc_with_arg) (void *, size_t, size_t); typedef void (*htab_free_with_arg) (void *, void *); struct htab { htab_hash hash_f; htab_eq eq_f; htab_del del_f; void ** entries; size_t size; size_t n_elements; size_t n_deleted; unsigned int searches; unsigned int collisions; htab_alloc alloc_f; htab_free free_f; void * alloc_arg; htab_alloc_with_arg alloc_with_arg_f; htab_free_with_arg free_with_arg_f; unsigned int size_prime_index; }; typedef struct htab *htab_t; enum insert_option {NO_INSERT, INSERT}; extern htab_t htab_create_alloc (size_t, htab_hash, htab_eq, htab_del, htab_alloc, htab_free); extern htab_t htab_create_alloc_ex (size_t, htab_hash, htab_eq, htab_del, void *, htab_alloc_with_arg, htab_free_with_arg); extern htab_t htab_create_typed_alloc (size_t, htab_hash, htab_eq, htab_del, htab_alloc, htab_alloc, htab_free); 
extern htab_t htab_create (size_t, htab_hash, htab_eq, htab_del); extern htab_t htab_try_create (size_t, htab_hash, htab_eq, htab_del); extern void htab_set_functions_ex (htab_t, htab_hash, htab_eq, htab_del, void *, htab_alloc_with_arg, htab_free_with_arg); extern void htab_delete (htab_t); extern void htab_empty (htab_t); extern void * htab_find (htab_t, const void *); extern void ** htab_find_slot (htab_t, const void *, enum insert_option); extern void * htab_find_with_hash (htab_t, const void *, hashval_t); extern void ** htab_find_slot_with_hash (htab_t, const void *, hashval_t, enum insert_option); extern void htab_clear_slot (htab_t, void **); extern void htab_remove_elt (htab_t, void *); extern void htab_remove_elt_with_hash (htab_t, void *, hashval_t); extern void htab_traverse (htab_t, htab_trav, void *); extern void htab_traverse_noresize (htab_t, htab_trav, void *); extern size_t htab_size (htab_t); extern size_t htab_elements (htab_t); extern double htab_collisions (htab_t); extern htab_hash htab_hash_pointer; extern htab_eq htab_eq_pointer; extern hashval_t htab_hash_string (const void *); extern hashval_t iterative_hash (const void *, size_t, hashval_t); extern int filename_cmp (const char *s1, const char *s2); extern int filename_ncmp (const char *s1, const char *s2, size_t n); extern hashval_t filename_hash (const void *s); extern int filename_eq (const void *s1, const void *s2); struct _dont_use_rtx_here_; struct _dont_use_rtvec_here_; struct _dont_use_rtx_insn_here_; union _dont_use_tree_here_; enum function_class { function_c94, function_c99_misc, function_c99_math_complex, function_sincos, function_c11_misc }; enum memmodel { MEMMODEL_RELAXED = 0, MEMMODEL_CONSUME = 1, MEMMODEL_ACQUIRE = 2, MEMMODEL_RELEASE = 3, MEMMODEL_ACQ_REL = 4, MEMMODEL_SEQ_CST = 5, MEMMODEL_LAST = 6 }; typedef void (*gt_pointer_operator) (void *, void *); typedef unsigned char uchar; enum debug_info_type { NO_DEBUG, DBX_DEBUG, SDB_DEBUG, DWARF2_DEBUG, XCOFF_DEBUG, 
VMS_DEBUG, VMS_AND_DWARF2_DEBUG }; enum debug_info_levels { DINFO_LEVEL_NONE, DINFO_LEVEL_TERSE, DINFO_LEVEL_NORMAL, DINFO_LEVEL_VERBOSE }; enum debug_info_usage { DINFO_USAGE_DFN, DINFO_USAGE_DIR_USE, DINFO_USAGE_IND_USE, DINFO_USAGE_NUM_ENUMS }; enum debug_struct_file { DINFO_STRUCT_FILE_NONE, DINFO_STRUCT_FILE_BASE, DINFO_STRUCT_FILE_SYS, DINFO_STRUCT_FILE_ANY }; enum symbol_visibility { VISIBILITY_DEFAULT, VISIBILITY_PROTECTED, VISIBILITY_HIDDEN, VISIBILITY_INTERNAL }; enum ivar_visibility { IVAR_VISIBILITY_PRIVATE, IVAR_VISIBILITY_PROTECTED,
Re: [patch] Provide a can_compare_and_swap_p target hook.
On 06/11/14 19:05, Andrew MacLeod wrote:
> 1) Given that the compiler *always* provides support via libatomic now (even if it is via locks), does that mean that VMSupportsCS8_builtin() should always return true? Or should we map that to a call to __atomic_always_lock_free()? (That always gets folded to true or false at compile time.) My guess is the latter?

Perhaps so. The problem is that some targets can't do CAS on 64-bit doublewords.

> 2) And in compareAndSwapLong_builtin(), there is a wonky bit:
>
>   /* We don't trust flag_use_atomic_builtins for multi-word compareAndSwap.
>      Some machines such as ARM have atomic libfuncs but not the multi-word
>      versions.  */
>   if (can_compare_and_swap_p (mode,
>                               (flag_use_atomic_builtins
>                                && GET_MODE_SIZE (mode) <= UNITS_PER_WORD)))
>     ..
>     /* generate 8 byte CAS */
>
> I gather we don't need to do anything special here anymore either? As an observation of inconsistency, compareAndSwapObject_builtin doesn't do that check before calling the 8 byte CAS.

I believe that any machine which has 64-bit pointers and can do CAS can do a 64-bit CAS. I'm worried about 32-bit machines trying to do a 64-bit CAS.

> 3) And finally, is flag_use_atomic_builtins supposed to turn them off completely? Right now it is passed in to the second parameter of can_compare_and_swap_p, which really just says "can we compare and swap without calling a libfunc", so currently if the flag is 0, but there is native support, the call is generated anyway. Should that condition really be:
>
>   if (flag_use_atomic_builtins)
>     {
>       ... /* generate atomic call */
>     }

I'm sorry, I really can't remember. I can't think of any reason to want to turn off builtin support. You have to remember that all this was written when our support for atomic builtins was seriously flaky and we would just punt back to the user anything we hadn't written yet.

Andrew.
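A small C sketch of the two GCC builtins this exchange revolves around, written from the user side rather than as libjava code. The helper names `cas64_always_lock_free` and `cas64` are hypothetical. On a 32-bit target without a native 64-bit CAS, the second helper is the case worried about above and would need libatomic at link time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Folded to a constant at compile time: true only if a 64-bit object with
   typical alignment can always be handled without a lock on this target. */
static bool cas64_always_lock_free(void)
{
    return __atomic_always_lock_free(sizeof(uint64_t), 0);
}

/* Strong 64-bit compare-and-swap.  Returns true and stores DESIRED if *P
   held EXPECTED, otherwise returns false and leaves *P unchanged. */
static bool cas64(uint64_t *p, uint64_t expected, uint64_t desired)
{
    assert(p != NULL);
    return __atomic_compare_exchange_n(p, &expected, desired, false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

A caller could use `cas64_always_lock_free()` in the way question 1 suggests, as a compile-time answer to "is an 8-byte CAS natively supported here".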
Re: [PATCH] PR 63721 IPA ICF cause atomic-comp-swap-release-acquire.c ICE
> On 11/05/14 07:09, Jiong Wang wrote:
> > the same ICE will happen on x86-64 if compiled with -O2 -fPIC.
> >
> > The reason is that the following two functions are identical, so the IPA-ICF pass tries to transform the second function to call the first one directly.
> >
> >   int
> >   atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b)
> >   {
> >     return __atomic_compare_exchange (&v, &a, &b, STRONG, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE);
> >   }
> >
> >   int
> >   atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b)
> >   {
> >     return __atomic_compare_exchange_n (&v, &a, b, STRONG, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE);
> >   }
> >
> > During this transformation it looks like something goes wrong with the function argument handling. Take a for example: because there is a later &a, it is marked as addressable. After the transformation, if we turn the second function into
> >
> >   int
> >   atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b)
> >   {
> >     return atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (a, b);
> >   }
> >
> > then argument a is no longer addressable.
> >
> > So, in cgraph_node::release_body, when making the wrapper, besides clearing the function body we should also clear the addressable flag for function args, because they are decided by the function body which is cleared.
> >
> > bootstrap ok on x86-64 and no regression.
> > bootstrap ok on aarch64 juno.
> > ICE gone away on arm & x86-64
> >
> > ok for trunk?
> >
> > gcc/
> >   PR tree-optimization/63721
> >   * cgraph.c (cgraph_node::release_body): Clear addressable flag for function args.
>
> While I understand the need to clear the addressable flag, I think release_body probably isn't the best place to do this. Seems to me that ought to happen when we emit the thunk or otherwise transform the body into something that doesn't take the address of those parameters.

Yep, I would just move it into expand_thunk - the TREE_ADDRESSABLE bits are not really well defined before we build the gimple body.

Honza

> jeff
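A hand-written C analogue of the situation described in this thread, as a sketch rather than the testcase itself. The names `cas_generic`, `cas_n` and `cas_n_wrapper` are hypothetical, and sequentially consistent ordering is assumed for simplicity instead of the release/acquire pair used in the original test.

```c
static int v;

/* Generic form: takes the address of both parameters. */
static int cas_generic(int a, int b)
{
    return __atomic_compare_exchange(&v, &a, &b, 0,
                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* _n form: takes the address of `a` only; same observable behaviour. */
static int cas_n(int a, int b)
{
    return __atomic_compare_exchange_n(&v, &a, b, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* What the second function looks like once it is rewritten as a wrapper:
   `a` and `b` are only forwarded, so neither has its address taken here. */
static int cas_n_wrapper(int a, int b)
{
    return cas_generic(a, b);
}
```

In `cas_n_wrapper` nothing takes the address of a parameter, which is why an addressable flag left over from the old body of `cas_n` no longer describes the function.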
RE: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target
> -----Original Message-----
> From: Christian Bruel [mailto:christian.br...@st.com]
> Sent: Friday, November 07, 2014 5:27 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [2/2][PATCH,ARM]Generate UAL assembly code for Thumb-1 target
>
> hi,
>
> the ARM bootstrap seems to fail for libgcc2.c on the thumb multilib for libgcc2:
>
> muldi3 -mthumb -O2 -g
> /tmp/ccYrycUw.s: Assembler messages:
> /tmp/ccYrycUw.s:69: Error: MOV Rd, Rs with two low registers is not permitted on this architecture -- `mov r6,r7'
>
> preprocessed attached.
>
> Thanks
>
> Christian

Many thanks. I am looking into it now.

BR,
Terry
Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal
Ah I see now! Thank you for explaining that bit, I was a bit puzzled when I saw it, but it makes sense now!

Cheers, Alan

Bill Schmidt wrote:
> On Thu, 2014-11-06 at 16:44 +0000, Alan Lawrence wrote:
> > Hmmm. I am a little surprised by your mention of saturation points as I would not expect any variety of reduc_plus to be a saturating operation???
>
> I wouldn't either, but the underlying vsum4ubs and vsum4sbs instructions used in these patterns do both a reduction and an add to another value. If that other value is large enough this can trigger a saturation event. However, the patterns use vzero for this other value, so it's not possible to approach the saturation cutoff for either instruction since the reductions are being done on byte values. (Each word in the vector result is the sum of the corresponding four byte values in the vector source, added to the other value, which here is zero.)
>
> Thanks,
> Bill
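A scalar C model of one word lane of the unsigned variant, to make the bound concrete. It is a sketch that assumes the semantics described in the quoted explanation (sum of four bytes plus another 32-bit value, saturating at the unsigned 32-bit maximum); `sum4ub_sat` is a hypothetical name, not an intrinsic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One word lane: add four unsigned bytes to ADDEND, clamping the result
   to UINT32_MAX instead of wrapping around. */
static uint32_t sum4ub_sat(const uint8_t bytes[4], uint32_t addend)
{
    assert(bytes != NULL);
    uint64_t sum = (uint64_t)addend + bytes[0] + bytes[1] + bytes[2] + bytes[3];
    return sum > UINT32_MAX ? UINT32_MAX : (uint32_t)sum;
}
```

With a zero addend the largest possible result is 4 * 255 = 1020, far below the cutoff, so the saturating behaviour can only show up when the addend itself is close to the maximum.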
Re: [PATCH, libfortran] PR 47007, 61847 Locale failures in libgfortran
On Thu, Nov 6, 2014 at 12:38 PM, Janne Blomqvist <blomqvist.ja...@gmail.com> wrote:
> On Wed, Nov 5, 2014 at 12:48 PM, Janne Blomqvist <blomqvist.ja...@gmail.com> wrote:
> > Hi,
> >
> > the attached patch fixes a few locale related failures in libgfortran, in the case where the POSIX 2008 extended locale functionality and extensions strto{f,d,ld}_l are present. These failures typically occur when libgfortran is used from a program which has set the locale with setlocale(), and the locale uses a different decimal separator than the C locale. The patch fixes this by creating a C locale which is then used by strto{f,d,ld}_l, and also is installed as the per-thread locale when starting a formatted IO, then reset to the previous value when the IO is done. I have chosen to not fallback to calling setlocale() in case the POSIX 2008 locale stuff isn't available, as that could create nasty hard to debug race conditions in a multi-threaded program.
> >
> > (I think Jerry's proposed patch which checks the locale for the decimal separator is still useful as a fallback in case the POSIX 2008 locale stuff isn't available)
>
> Hi,
>
> updated patch attached. Since the patch sets the per-thread locale with uselocale, using the non-standard strto{f,d,ld}_l functions isn't necessary. When getting rid of this part of the original patch, I noticed a few failures due to the uselocale() calls being in the wrong places. These are fixed in the updated patch. Also Jakub's suggestion has been incorporated.
>
> Further, investigation revealed that some targets (Darwin and Freebsd) have the extended locale functionality in xlocale.h rather than locale.h as POSIX 2008 specifies. So check for that header.
>
> Finally, as we set the per-thread locale to C, we'd lose localized error messages. So the updated patch fixes this by updating the gf_strerror() function as well.

Hi again!

Well, for all my ranting against using setlocale() in a library potentially used by multi-threaded programs, here's a patch that does exactly that, as a fallback in case the POSIX 2008 per-thread locale stuff isn't available. So this can be seen as an alternative to the approach Jerry suggested in the patch attached in PR 61847.

- Jerry's patch (use localeconv() to figure out the decimal separator)
   - Race condition if locale is set between OPEN (where the separator is checked) and READ/WRITE.
   - Potential breakage in weird locales where the decimal separator isn't "." or ","? See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61847#c21
   - Other potential issues with weird locales? E.g. are there locales which use grouping characters, e.g. 1e6 is 1,000,000.0?

- My patch (use setlocale(LC_NUMERIC, "C") when starting formatted I/O, change back when I/O statement is done.)
   - Race condition if locale is set concurrently in another thread while a formatted I/O is in progress.
   - Potential problem if another thread does something dependent on LC_NUMERIC while a formatted I/O is in progress in one thread.
   - Should be robust against weird locales.

In both cases IMHO these approaches should be used only if the POSIX 2008 per-thread locale stuff isn't available. I have no strong opinions which is preferable, comments?

Attached is locale3_top2.diff, which is on top of my previous patch, and locale3.diff which includes the previous patch and applies against trunk.

Ok for trunk?

2014-11-07  Janne Blomqvist  <j...@gcc.gnu.org>

    PR libfortran/47007
    PR libfortran/61847
    * config.h.in: Regenerated.
    * configure: Regenerated.
    * configure.ac (AC_CHECK_HEADERS_ONCE): Check for xlocale.h.
    (AC_CHECK_FUNCS_ONCE): Check for newlocale, freelocale, uselocale,
    strerror_l.
    * io/io.h (locale.h): Include.
    (xlocale.h): Include if present.
    (c_locale): New variable.
    (st_parameter_dt): Add old_locale member.
    * io/transfer.c (data_transfer_init): Set locale to C if doing
    formatted transfer.
    (finalize_transfer): Reset locale to previous.
    * io/unit.c (c_locale): New variable.
    (init_units): Init c_locale.
    (close_units): Free c_locale.
    * runtime/error.c (locale.h): Include.
    (xlocale.h): Include if present.
    (gf_strerror): Use strerror_l if available. Reset locale to
    LC_GLOBAL_LOCALE for strerror_r branch.

2014-11-07  Janne Blomqvist  <j...@gcc.gnu.org>

    PR libfortran/47007
    PR libfortran/61847
    * gfortran.texi: Add note about locale issues to thread-safety
    section.

-- 
Janne Blomqvist

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 41d6559..0d19e7a 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1223,10 +1223,26 @@
 implemented with the @code{system} function, which need not be
 thread-safe. It is the responsibility of the user to ensure that
 @code{system} is not called concurrently.
 
-Finally, for platforms not supporting thread-safe POSIX functions,
-further functionality might not be thread-safe. For details, please
-consult the documentation
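A minimal C sketch of the per-thread locale approach described in this message, using the POSIX 2008 functions named in the ChangeLog. It is not the libgfortran code: `parse_real_in_c_locale` is a hypothetical helper, and it creates and frees the C locale on every call for the sake of being self-contained, whereas the ChangeLog indicates the patch keeps one `c_locale` for the lifetime of the units.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef __APPLE__
#include <xlocale.h>   /* Darwin keeps the extended locale API here */
#endif

/* Parse TEXT as a real number using the C locale's decimal point,
   regardless of the locale the calling program has selected. */
static double parse_real_in_c_locale(const char *text)
{
    assert(text != NULL);

    locale_t c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_locale == (locale_t)0)
        return strtod(text, NULL);   /* no C locale object; parse as-is */

    /* uselocale changes the locale of the calling thread only, so other
       threads are unaffected while the conversion runs. */
    locale_t old_locale = uselocale(c_locale);
    double value = strtod(text, NULL);
    uselocale(old_locale);           /* restore what the caller had */

    freelocale(c_locale);
    return value;
}
```

The fallback discussed above replaces the `uselocale` pair with `setlocale(LC_NUMERIC, "C")` and a later restore, which is process-wide and therefore carries the race conditions listed in the message.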
[PATCH] PR63676, exit tree fold when node be TREE_CLOBBER_P
The problem is caused by constant folding of a node for which TREE_CLOBBER_P is true. According to the RTL expander, the purpose of a clobber is to mark the LHS going out of scope:

  if (TREE_CLOBBER_P (rhs))
    /* This is a clobber to mark the going out of scope for this LHS.  */

For vshuf-v16hi there is such a node:

  <bb 5>:
    r ={v} {CLOBBER};

The fold_all_stmts function newly added in r216728 invokes the generic fold, and that function in fold-const.c has a bug when folding a CONSTRUCTOR. We should not fold if TREE_THIS_VOLATILE (t) is also true for the tree node, otherwise we generate extra insns during expand. For example, the assignment above is transformed into

  r = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

but an OImode immediate move is not supported when -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=neon is specified, which triggers an insn_invalid_p error for this testcase.

Bootstrap OK on x86-64, no regressions. The ICE on arm is gone.

OK for trunk?

gcc/
  PR tree/63676
  * fold-const.c (fold): Do not fold node when TREE_CLOBBER_P is true.

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index efcefa7..006fb70 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -14318,6 +14318,10 @@ fold (tree expr)
   if (kind == tcc_constant)
     return t;
 
+  /* Return right away if a TREE_CLOBBER node.  */
+  if (TREE_CLOBBER_P (t))
+    return t;
+
   /* CALL_EXPR-like objects with variable numbers of operands are
      treated specially.  */
   if (kind == tcc_vl_exp)
[PATCH] Fix for ipa/63595
Hello. Following patch fixes PR/63595, where IPA ICF creates a thunk that passes argument by reference. Patch can bootstrap x86_64-linux and there's no new regression introduced. Patch was preapproved by Honza. Thanks, Martin gcc/testsuite/ChangeLog: 2014-11-07 Martin Liska mli...@suse.cz PR ipa/63595 * g++.dg/ipa/pr63595.C: New test. gcc/ChangeLog: 2014-11-07 Martin Liska mli...@suse.cz PR ipa/63595 * g++.dg/ipa/pr63595.C: New test. * cgraphunit.c (cgraph_node::expand_thunk): DECL_BY_REFERENCE is correctly handled for thunks created by IPA ICF. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 6f61f5c..2e4af6a 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -1546,7 +1546,15 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) if (!VOID_TYPE_P (restype)) { if (DECL_BY_REFERENCE (resdecl)) - restmp = gimple_fold_indirect_ref (resdecl); + { + restmp = gimple_fold_indirect_ref (resdecl); + if (!restmp) + restmp = build2 (MEM_REF, + TREE_TYPE (TREE_TYPE (DECL_RESULT (alias))), + resdecl, + build_int_cst (TREE_TYPE + (DECL_RESULT (alias)), 0)); + } else if (!is_gimple_reg_type (restype)) { restmp = resdecl; @@ -1641,7 +1649,11 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool force_gimple_thunk) gimple_call_set_tail (call, true); /* Build return value. 
*/ - ret = gimple_build_return (restmp); + if (!DECL_BY_REFERENCE (resdecl)) + ret = gimple_build_return (restmp); + else + ret = gimple_build_return (resdecl); + gsi_insert_after (bsi, ret, GSI_NEW_STMT); } else diff --git a/gcc/testsuite/g++.dg/ipa/pr63595.C b/gcc/testsuite/g++.dg/ipa/pr63595.C new file mode 100644 index 000..30e9303 --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr63595.C @@ -0,0 +1,80 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-ipa-icf-details } */ + +template int dim class B; +template int, int dim class TriaObjectAccessor; +template int, typename Accessor class A; +template int dim class TriaDimensionInfo { +public: + typedef A3, TriaObjectAccessor2, 3 raw_quad_iterator; + typedef A3, B3 raw_hex_iterator; + typedef raw_hex_iterator raw_cell_iterator; +}; +template int dim class Triangulation : public TriaDimensionInfo1 { + public: + typedef typename TriaDimensionInfodim::raw_quad_iterator raw_quad_iterator; + TriaDimensionInfo::raw_cell_iterator end() const; + raw_quad_iterator end_quad() const { +return raw_quad_iterator(const_castTriangulation *(this), 0, 0); + } +}; +template int dim class TriaAccessor { +public: + typedef void AccessorData; + TriaAccessor(const Triangulationdim * = 0); + Triangulation1 *tria; + + int a, b, c; +}; +template int dim class TriaObjectAccessor2, dim : public TriaAccessordim { +public: + typedef typename TriaAccessordim::AccessorData AccessorData; + TriaObjectAccessor(const Triangulationdim * = 0); +}; +template int dim class TriaObjectAccessor3, dim : public TriaAccessordim { +public: + typedef typename TriaAccessordim::AccessorData AccessorData; + TriaObjectAccessor(const Triangulationdim * = 0); +}; +template int dim class B : public TriaObjectAccessordim, dim { +public: + typedef typename TriaObjectAccessordim, dim::AccessorData AccessorData; + B(const Triangulationdim * = 0); +}; +template int dim, typename Accessor class A { +public: + A(const A ); + A(const Triangulationdim *, int, int); + 
Accessor accessor; +}; +template class Triangulation3; +template int dim, typename Accessor +Adim, Accessor::A(const Triangulationdim *, int, int) {} +template int dim +TriaAccessordim::TriaAccessor(const Triangulationdim *) +: tria(), a(-1), b(-2), c(-3) {} +template int dim +TriaObjectAccessor2, dim::TriaObjectAccessor(const Triangulationdim *) {} +template int dim +TriaObjectAccessor3, dim::TriaObjectAccessor(const Triangulationdim *) {} +template int dim Bdim::B(const Triangulationdim *) {} +template +TriaDimensionInfo3::raw_cell_iterator Triangulation3::end() const { + return raw_hex_iterator(const_castTriangulation *(this), 0, 0); +} + +#pragma GCC optimize (-O0) +int main() +{ + Triangulation 3 t; + Triangulation3::raw_quad_iterator i1 = t.end_quad(); + TriaDimensionInfo3::raw_cell_iterator i2 = t.end(); + + if(i2.accessor.c != -3) +return 1; + + return 0; +} + +/* { dg-final { scan-ipa-dump Equal symbols: 1 icf } } */ +/* { dg-final { cleanup-ipa-dump icf } } */
Re: [PATCH] PR63676, exit tree fold when node be TREE_CLOBBER_P
On Fri, Nov 7, 2014 at 11:22 AM, Jiong Wang <jiong.w...@arm.com> wrote:
> The problem is caused by constant folding of a node for which TREE_CLOBBER_P is true. According to the RTL expander, the purpose of a clobber is to mark the LHS going out of scope:
>
>   if (TREE_CLOBBER_P (rhs))
>     /* This is a clobber to mark the going out of scope for this LHS.  */
>
> For vshuf-v16hi there is such a node:
>
>   <bb 5>:
>     r ={v} {CLOBBER};
>
> The fold_all_stmts function newly added in r216728 invokes the generic fold, and that function in fold-const.c has a bug when folding a CONSTRUCTOR. We should not fold if TREE_THIS_VOLATILE (t) is also true for the tree node, otherwise we generate extra insns during expand. For example, the assignment above is transformed into
>
>   r = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>
> but an OImode immediate move is not supported when -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=neon is specified, which triggers an insn_invalid_p error for this testcase.
>
> Bootstrap OK on x86-64, no regressions. The ICE on arm is gone.
>
> OK for trunk?

Please guard the GIMPLE_SINGLE_RHS case in fold_gimple_assign instead, like

Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c (revision 217213)
+++ gcc/gimple-fold.c (working copy)
@@ -320,6 +320,9 @@
       {
         tree rhs = gimple_assign_rhs1 (stmt);
 
+        if (TREE_CLOBBER_P (rhs))
+          return NULL_TREE;
+
         if (REFERENCE_CLASS_P (rhs))
           return maybe_fold_reference (rhs, false);

ok with that change. If you like you can guard fold () as well, but please do so inside the case CONSTRUCTOR: only.

Thanks,
Richard.

> gcc/
>   PR tree/63676
>   * fold-const.c (fold): Do not fold node when TREE_CLOBBER_P is true.
Re: [arm][patch] fix arm_neon_ok check on !arm_arch7
If ARMv6 never co-exists with NEON, then personally I think your original patch is better, because TARGET_NEON will generally be used once all options have been processed. Anyway, this needs the gatekeeper's approval.

Ping, Richard.

Andrew
[PATCH] Fix PR63770
The following fixes recursion between a pattern in fold-const.c and one formerly in tree-ssa-forwprop.c but now in match.pd (and thus also fold-const.c). There are two conflicting transforms and the measure against recursion I put in place doesn't help for the testcase (as the conversion is useless). Bootstrapped and tested on x86_64-unknown-linux-gnu. Richard. 2014-11-07 Richard Biener rguent...@suse.de PR middle-end/63770 * match.pd: Guard conflicting GENERIC pattern properly. * gcc.dg/pr63770.c: New testcase. Index: gcc/match.pd === --- gcc/match.pd2014-11-07 09:24:45.943027082 +0100 +++ gcc/match.pd2014-11-07 09:23:06.573031431 +0100 @@ -129,14 +129,15 @@ along with GCC; see the file COPYING3. (bitop (convert @0) (convert? @1)) (if (((TREE_CODE (@1) == INTEGER_CST INTEGRAL_TYPE_P (TREE_TYPE (@0)) - int_fits_type_p (@1, TREE_TYPE (@0)) -/* ??? This transform conflicts with fold-const.c doing - Convert (T)(x c) into (T)x (T)c, if c is an integer - constants (if x has signed type, the sign bit cannot be set - in c). This folds extension into the BIT_AND_EXPR. - Restrict it to GIMPLE to avoid endless recursions. */ - (bitop != BIT_AND_EXPR || GIMPLE)) - || types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))) + int_fits_type_p (@1, TREE_TYPE (@0))) + || (GIMPLE types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))) + || (GENERIC TREE_TYPE (@0) == TREE_TYPE (@1))) + /* ??? This transform conflicts with fold-const.c doing + Convert (T)(x c) into (T)x (T)c, if c is an integer + constants (if x has signed type, the sign bit cannot be set + in c). This folds extension into the BIT_AND_EXPR. + Restrict it to GIMPLE to avoid endless recursions. */ +(bitop != BIT_AND_EXPR || GIMPLE) (/* That's a good idea if the conversion widens the operand, thus after hoisting the conversion the operation will be narrower. 
*/ TYPE_PRECISION (TREE_TYPE (@0)) TYPE_PRECISION (type) Index: gcc/testsuite/gcc.dg/pr63770.c === --- gcc/testsuite/gcc.dg/pr63770.c (revision 0) +++ gcc/testsuite/gcc.dg/pr63770.c (working copy) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ + +char a; + +struct S +{ + int f0:9; +}; + +volatile struct S b; + +int +fn1 () +{ + return (1 b.f0) a; +}
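The recursion described in this message comes from two rewrites that undo each other. The following is a minimal sketch of that situation and of the effect of a guard; it does not use GCC's tree or match.pd machinery, and `expr_form`, `distribute_conversion`, `hoist_conversion` and `simplify` are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Two equivalent shapes of the same expression (illustrative only).  */
enum expr_form {
  CONV_OF_AND,  /* (T)(x & c)   */
  AND_OF_CONVS  /* (T)x & (T)c  */
};

/* Rule A: distribute the conversion over the bitwise AND.  */
static enum expr_form distribute_conversion(enum expr_form e) {
  return e == CONV_OF_AND ? AND_OF_CONVS : e;
}

/* Rule B: hoist the conversion back out of the bitwise AND.
   When GUARDED is true the rule is switched off.  */
static enum expr_form hoist_conversion(enum expr_form e, bool guarded) {
  if (guarded)
    return e;
  return e == AND_OF_CONVS ? CONV_OF_AND : e;
}

/* Apply one rule at a time until nothing changes.  Returns the number
   of rewrites performed, or -1 if no fixed point was reached within
   MAX_STEPS rewrites.  */
int simplify(enum expr_form e, bool guarded, int max_steps) {
  assert(max_steps > 0);
  for (int step = 0; step < max_steps; step++) {
    enum expr_form next = distribute_conversion(e);
    if (next == e)
      next = hoist_conversion(e, guarded);
    if (next == e)
      return step;  /* fixed point */
    e = next;
  }
  return -1;  /* the two rules kept undoing each other */
}
```

With the guard off, `simplify(CONV_OF_AND, false, 100)` never settles and reports -1; with the guard on, the same input settles after a single rewrite.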
Re: [PATCH] PR63676, exit tree fold when node be TREE_CLOBBER_P
On 07/11/14 10:35, Richard Biener wrote: On Fri, Nov 7, 2014 at 11:22 AM, Jiong Wang jiong.w...@arm.com wrote: the problem is caused by constant fold of node with TREE_CLOBBER_P be true. according to rtl expander, the purpose of clobber is to mark the going out of scope. if (TREE_CLOBBER_P (rhs)) /* This is a clobber to mark the going out of scope for this LHS. */ for vshuf-v16hi, there will be such node bb 5: r ={v} {CLOBBER}; while the new added fold_all_stmts since r216728 will invoke generic fold and that function in fold-const.c has a bug when folding CONSTRUCTOR. we should not do fold if the tree node is also with TREE_THIS_VOLATILE (t) be true, otherwise we will generate extra insn during expand. for example, above assignment will be transformed into r = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; while OImode immediate move is not supported when -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=neon specified, thus trigger insn_invalid_p error for this testcase. bootstrap ok on x86-64, no regression. ICE on arm gone away. ok to trunk? Please instead guard the GIMPLE_SINGLE_RHS case in fold_gimple_assign instead, like Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 217213) +++ gcc/gimple-fold.c (working copy) @@ -320,6 +320,9 @@ { tree rhs = gimple_assign_rhs1 (stmt); + if (TREE_CLOBBER_P (rhs)) + return NULL_TREE; + if (REFERENCE_CLASS_P (rhs)) return maybe_fold_reference (rhs, false); ok with that change. If you like you can guard fold () as well, but please inside the case CONSTRUCTOR: case only. But TREE_CLOBBER_P () checks for CONSTRUCTOR anyway. Does it have a chance of being folded between the start of fold () and the case CONSTRUCTOR: given its a CLOBBER? What am I missing? Thanks, Tejas.
[wwwdocs] Remove cgi-bin/cvsweb.conf
In September cvsweb was removed due to a security issue; this now also removes its config file. Gerald Index: cgi-bin/cvsweb.conf === RCS file: cgi-bin/cvsweb.conf diff -N cgi-bin/cvsweb.conf --- cgi-bin/cvsweb.conf 9 Jul 2014 14:53:10 - 1.8 +++ /dev/null 1 Jan 1970 00:00:00 - @@ -1,306 +0,0 @@ -# -*-perl-*- -# Configuration of cvsweb.cgi, the -# CGI interface to CVS Repositories. -# -# (c) 1998-1999 H. Zellerzel...@think.de -# 1999 H. Nordstr? h...@hem.passagen.se -# based on work by Bill Fenner fen...@freebsd.org -# $Id: cvsweb.conf,v 1.8 2014/07/09 14:53:10 gerald Exp $ -# -### - -## -# CVS Root -## -# CVSweb can handle several CVS-Repositories -# at once. Enter a short symbolic names and the -# full path of these repositories here. -# NOTE that the symbolic names may not contain -# whitespaces. -# Note, that cvsweb.cgi currently needs to have physical access -# to the CVS repository so :pserver:some...@xyz.com:/data/cvsroot -# won't work! - -# 'symbolic_name' 'path_to_the_actual_repository' -%CVSROOT = ( -'gcc' = '/cvs/gcc', - ); - -# This tree is enabled by default when -# you enter the page -$cvstreedefault = 'gcc'; - -## -# Defaults for UserSettings -## -%DEFAULTVALUE = ( - # sortby: File sort order - # file Sort by filename - # revSort by revision number - # date Sort by commit date - # author Sort by author - # logSort by log message - - sortby = file, - - # hideattic: Hide or show files in Attic - # 1 Hide files in Attic - # 0 Show files in Attic - - hideattic = 1, - - # logsort: Sort order for CVS logs - # date Sort revisions by date - # revSort revision by revision number - # cvsDon't sort them. Same order as CVS/RCS shows them. - - logsort = date, - - # f: Default diff format - # h Human readable - # u Unified diff - # c Context diff - # s Side by side - f = u, -); - -## -# some layout stuff -## - -# color settings in the body-tag -$body_tag = 'body text=#00 bgcolor=#ff'; - -# Wanna have a logo on the page ? 
-$logo = ''; - -# The title of the Page on startup -$defaulttitle = GCC CVS Repository; - -# The address is shown on the footer -$address = gcc\@gcc.gnu.org; - -# Default page background color for the diffs -# and annotations -$backcolor = #EE; - -# color of navigation Header for -# diffs and annotations -$navigationHeaderColor='#EE'; - -$long_intro = -This is a WWW interface to CVS repositories on the -a href=\https://gcc.gnu.org/\;codegcc.gnu.org/code/a -web site. -p -If you would like to use this CGI script on your own web server and -CVS tree, see Zellers's -A HREF=\http://linux.fh-heilbronn.de/~zeller/cgi/cvsweb.cgi\; -CVSweb distribution site/A. Bill's original script can be found -A HREF=\http://www.freebsd.org/~fenner/cvsweb/\;here/a. -p -Please send any suggestions, comments, etc. to -A HREF=\mailto:fenner\@freebsd.org\;Bill Fenner/A or, regarding the -modifications, to -A HREF=\mailto:zeller\@think.de\;Henner Zeller/A or -A HREF=\mailto:hno\@hem.passagen.se\;Henrik Nordstr?/A -; - -$short_instruction = -Click on a directory to enter that directory. Click on a file to display -its revision history and to get a chance to display diffs between revisions. -; - -# used icons; if icon-url is empty, the text representation is used; if -# you do not want to have an ugly tooltip for the icon, remove the -# text-representation. -# The width and height of the icon allow the browser to correcly display -# the table while still loading the icons. -# These default icons are coming with apache. 
-# If these icons are too large, check out the miniicons in the -# icons/ directory; they have a width/height of 16/16 -# format: TEXT ICON-URL width height -%ICONS = ( - back = [ ([BACK], /icons/back.gif, 20, 22) ], - dir = [ ([DIR], /icons/dir.gif, 20, 22) ], - file = [ ([TXT], /icons/text.gif, 20, 22) ], - ); - -# the length to which the last logentry should -# be truncated when shown in the directory view -$shortLogLen = 80; - -# Show author of last change -$show_author=1; - -## -# table view for directories -## - -# Show directory as table -# this is much more readable but has one -# drawback: the whole table has to be loaded -# before common browsers display it which may -# be annoying if you have a slow link - and a -# large directory .. -$dirtable=1; - -# show different colors for even/odd rows -@tabcolors=('#EE', '#FF'); -$tablepadding=2; - -# Color of Header -$columnHeaderColorDefault='#CC'; -$columnHeaderColorSorted ='#88FF88'; - -# -# If you want to have colored
[COMMITTED][PATCH] PR63676, exit tree fold when node be TREE_CLOBBER_P
On 07/11/14 10:57, Tejas Belagod wrote:

On 07/11/14 10:35, Richard Biener wrote:

On Fri, Nov 7, 2014 at 11:22 AM, Jiong Wang jiong.w...@arm.com wrote:

ok to trunk?

Please instead guard the GIMPLE_SINGLE_RHS case in fold_gimple_assign instead, like

Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c (revision 217213)
+++ gcc/gimple-fold.c (working copy)
@@ -320,6 +320,9 @@
       {
         tree rhs = gimple_assign_rhs1 (stmt);

+        if (TREE_CLOBBER_P (rhs))
+          return NULL_TREE;
+
         if (REFERENCE_CLASS_P (rhs))
           return maybe_fold_reference (rhs, false);

ok with that change. If you like you can guard fold () as well, but please inside the case CONSTRUCTOR: case only.

But TREE_CLOBBER_P () checks for CONSTRUCTOR anyway. Does it have a chance of being folded between the start of fold () and the case CONSTRUCTOR: given its a CLOBBER? What am I missing?

Richard, Tejas,

thanks for your comments. I committed below patch after re-bootstrap OK and no regression on x86-64. Have not touch fold ().

I was putting the check at the start of fold () with the thoughts that we exit the fold as early as possible to be more efficient. as this bug do trigger from fold_gimple_assign, it will be more efficient to fix here.

Regards,
Jiong

Index: gcc/gimple-fold.c
===================================================================
--- gcc/gimple-fold.c (revision 217214)
+++ gcc/gimple-fold.c (working copy)
@@ -320,6 +320,9 @@
       {
         tree rhs = gimple_assign_rhs1 (stmt);

+        if (TREE_CLOBBER_P (rhs))
+          return NULL_TREE;
+
         if (REFERENCE_CLASS_P (rhs))
           return maybe_fold_reference (rhs, false);
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 217214)
+++ gcc/ChangeLog (working copy)
@@ -1,5 +1,12 @@
+2014-11-07  Jiong Wang  <jiong.w...@arm.com>
 2014-11-07  Richard Biener  <rguent...@suse.de>

+       PR tree-optimization/63676
+       * gimple-fold.c (fold_gimple_assign): Do not fold node when
+       TREE_CLOBBER_P be true.
+
+2014-11-07  Richard Biener  <rguent...@suse.de>
+
        PR middle-end/63770
        * match.pd: Guard conflicting GENERIC pattern properly.

Thanks,
Tejas.
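The shape of the committed guard can be shown with a small standalone model. This is only a sketch: `struct rhs`, `is_clobber` and `fold_single_rhs` are hypothetical stand-ins, not GCC's `tree` API, and the "fold" here is a toy that expands an empty initializer into 16 explicit zeros, as in the thread's example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the right-hand side of an assignment.  */
struct rhs {
  bool is_constructor;  /* an aggregate initializer such as `{}` */
  bool is_volatile;     /* set on end-of-scope clobber markers */
  size_t n_elts;        /* number of explicit initializer elements */
};

/* Same idea as the check discussed in the thread: a clobber is a
   constructor that also carries the volatile flag.  */
static bool is_clobber(const struct rhs *r) {
  return r->is_constructor && r->is_volatile;
}

/* What the toy fold produces for an empty, non-clobber initializer.  */
static const struct rhs zero_vector = { true, false, 16 };

/* Example inputs.  */
static const struct rhs clobber_marker = { true, true, 0 };
static const struct rhs empty_init = { true, false, 0 };

/* Returns the folded value, or NULL when the operand must be left
   alone.  The clobber test comes first so that a scope marker is
   never turned into a real store of zeros.  */
const struct rhs *fold_single_rhs(const struct rhs *r) {
  assert(r != NULL);
  if (is_clobber(r))
    return NULL;
  if (r->is_constructor && r->n_elts == 0)
    return &zero_vector;
  return NULL;
}
```

Without the early return, `clobber_marker` would take the same path as `empty_init` and come back as a 16-element zero initializer, which is the extra store the original report ran into.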
[PATCH][RFC] Report pass we are ICEing in
The following patch is a lazy attempt at reporting current_pass when ICEing, as for example the backtraces are not exactly helpful in locating the problem when the ICE occurs during the verification phase.

Tested with a forced ICE which now looks like:

t.c: In function ‘fn1’:
cc1: note: executing pass `ssa'
t.c:14:1: internal compiler error: in verify_ssa, at tree-ssa.c:939

We can't emit the note after the ICE (because emitting a DK_ICE diagnostic will already terminate the compiler).

Not otherwise tested. Comments? (of course this is to shift the blame faster without having to reproduce a bug...)

Thanks,
Richard.

2014-11-07  Richard Biener  rguent...@suse.de

        * diagnostic.c: Include tree-pass.h.
        (internal_error): Report pass we are executing currently.

Index: gcc/diagnostic.c
===================================================================
--- gcc/diagnostic.c (revision 217214)
+++ gcc/diagnostic.c (working copy)
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.
 #include "backtrace.h"
 #include "diagnostic.h"
 #include "diagnostic-color.h"
+#include "tree-pass.h" // for current_pass

 #include <new>                     // For placement new.

@@ -1173,6 +1174,9 @@ internal_error (const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;

+  if (current_pass && current_pass->name)
+    inform (UNKNOWN_LOCATION, "executing pass `%s'", current_pass->name);
+
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, ap, input_location, DK_ICE);
   report_diagnostic (&diagnostic);
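The ordering constraint in this patch (the note has to come out before the fatal diagnostic) can be sketched outside of GCC as follows. All names here (`struct pass`, `running_pass`, `set_running_pass`, `internal_error_text`) are hypothetical, and the function formats text into a buffer instead of terminating, so that the behaviour can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for a compiler pass descriptor.  */
struct pass { const char *name; };

/* Pass currently running, or NULL between passes.  */
static const struct pass *running_pass;

/* Example pass used in the usage note below.  */
static const struct pass ssa_pass = { "ssa" };

void set_running_pass(const struct pass *p) { running_pass = p; }

/* Build the text an internal error would print.  The note is placed
   first because, in the real compiler, emitting the error itself is
   the last thing that happens.  Returns a pointer to a static
   buffer that is overwritten by the next call.  */
const char *internal_error_text(const char *msg) {
  static char buf[256];
  assert(msg != NULL && strlen(msg) < 128);
  if (running_pass && running_pass->name)
    snprintf(buf, sizeof buf,
             "note: executing pass '%s'\ninternal compiler error: %s\n",
             running_pass->name, msg);
  else
    snprintf(buf, sizeof buf, "internal compiler error: %s\n", msg);
  return buf;
}
```

After `set_running_pass(&ssa_pass)`, the text for `"in verify_ssa"` starts with the note line; after `set_running_pass(NULL)` only the error line is produced.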
[PATCH] Fix for ipa/63747
Hello.

The following patch introduces LOW/HIGH checking in IPA ICF. The patch bootstraps on x86_64-linux and no regression has been introduced. The patch was pre-approved by Honza.

Thanks,
Martin

gcc/testsuite/ChangeLog:

2014-11-07  Martin Liska  mli...@suse.cz

        PR ipa/63747
        * gcc.dg/ipa/pr63747.c: New test.

gcc/ChangeLog:

2014-11-07  Martin Liska  mli...@suse.cz

        PR ipa/63747
        * ipa-icf-gimple.c (func_checker::compare_gimple_switch): Missing
        checking for CASE_LOW and CASE_HIGH added.

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index ecb9667..75b5cfb 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -798,6 +798,19 @@ func_checker::compare_gimple_switch (gimple g1, gimple g2)
       tree label1 = gimple_switch_label (g1, i);
       tree label2 = gimple_switch_label (g2, i);

+      /* Label LOW and HIGH comparison.  */
+      tree low1 = CASE_LOW (label1);
+      tree low2 = CASE_LOW (label2);
+
+      if (!tree_int_cst_equal (low1, low2))
+        return return_false_with_msg ("case low values are different");
+
+      tree high1 = CASE_HIGH (label1);
+      tree high2 = CASE_HIGH (label2);
+
+      if (!tree_int_cst_equal (high1, high2))
+        return return_false_with_msg ("case high values are different");
+
       if (TREE_CODE (label1) == CASE_LABEL_EXPR
           && TREE_CODE (label2) == CASE_LABEL_EXPR)
         {
diff --git a/gcc/testsuite/gcc.dg/ipa/pr63747.c b/gcc/testsuite/gcc.dg/ipa/pr63747.c
new file mode 100644
index 000..7b5df4b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr63747.c
@@ -0,0 +1,40 @@
+/* { dg-options "-O2 -fdump-ipa-icf" } */
+/* { dg-do run } */
+
+static int __attribute__((noinline))
+foo(int i)
+{
+  switch (i)
+  {
+    case 0:
+    case 1:
+    case 2:
+    case 3:
+      return 0;
+    default:
+      return 1;
+  }
+}
+
+static int __attribute__((noinline))
+bar(int i)
+{
+  switch (i)
+  {
+    case 4:
+    case 5:
+    case 6:
+    case 7:
+      return 0;
+    default:
+      return 1;
+  }
+}
+
+int main()
+{
+  return foo(0) + bar(4);
+}
+
+/* { dg-final { scan-ipa-dump "Equal symbols: 0" "icf" } } */
+/* { dg-final { cleanup-ipa-dump "icf" } } */
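To illustrate why the missing comparison matters, here is a small standalone sketch. It is not the ICF implementation; `struct case_label` and both comparison functions are hypothetical, and the four consecutive cases of `foo` and `bar` from the test are modelled as one range each.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One switch label: the value range [low, high] it covers and the
   index of the block it jumps to.  */
struct case_label {
  long low;
  long high;
  int target;
};

/* Incomplete comparison: only the jump targets are checked.  */
bool switch_targets_equal(const struct case_label *a,
                          const struct case_label *b, size_t n) {
  assert(a != NULL && b != NULL);
  for (size_t i = 0; i < n; i++)
    if (a[i].target != b[i].target)
      return false;
  return true;
}

/* Complete comparison: the covered ranges must match as well.  */
bool switch_labels_equal(const struct case_label *a,
                         const struct case_label *b, size_t n) {
  assert(a != NULL && b != NULL);
  for (size_t i = 0; i < n; i++) {
    if (a[i].low != b[i].low || a[i].high != b[i].high)
      return false;
    if (a[i].target != b[i].target)
      return false;
  }
  return true;
}

/* Labels modelled on foo (cases 0..3) and bar (cases 4..7).  */
static const struct case_label foo_labels[] = { { 0, 3, 0 } };
static const struct case_label bar_labels[] = { { 4, 7, 0 } };
```

The target-only check treats `foo_labels` and `bar_labels` as identical, while the range-aware check tells them apart, which matches the test's expectation of zero equal symbols.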
[Patch AArch64] Fix PR 63724 - Improve immediate generation
Hi,

This patch fixes up immediate generation for the AArch64 backend, allowing RTL optimizers like CSE and loop hoisting to work more effectively with immediates. I also took the opportunity to rework this so that it is also used in the cost calculations. This patch only deals with numerical constants; handling symbolic constants will be the subject of another patch. There has been some talk about restructuring the immediate generation with a trie, but I'm going to leave that for another time.

This requires another patch that James is working on to fix up bsl code generation with float mode, which was discovered to be broken, causing regressions when testing this code. I've worked around those failures by pulling out a bsl patch that restructures the floating point versions with an unspec, and found no other issues with this patch.

The output now generated matches the expected output in the PR. I looked at the output in a number of other benchmarks and saw it making sense.

Tested cross with aarch64-none-elf + a BSL patch with no regressions. Bootstrapped and regression tested with aarch64-none-linux-gnu.

Ok for trunk once James's BSL patch is committed?

Ramana

DATE  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

        PR target/63724
        * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Split out
        numerical immediate handling to...
        (aarch64_internal_mov_immediate): ...this. New.
        (aarch64_rtx_costs): Use aarch64_internal_mov_immediate.
        (aarch64_mov_operand_p): Relax predicate.
        * config/aarch64/aarch64.md (movmode:GPI): Do not expand CONST_INTs.
        (*movsi_aarch64): Turn into define_insn_and_split and new
        alternative for 'n'.
        (*movdi_aarch64): Likewise.
commit 34392753bd7f1481eff6ff86e055981618a3d06e Author: Ramana Radhakrishnan ramana.radhakrish...@arm.com Date: Thu Nov 6 16:08:27 2014 + diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 736ad90..20cbb2d 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1046,8 +1046,8 @@ aarch64_add_offset (machine_mode mode, rtx temp, rtx reg, HOST_WIDE_INT offset) return plus_constant (mode, reg, offset); } -void -aarch64_expand_mov_immediate (rtx dest, rtx imm) +static int +aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate) { machine_mode mode = GET_MODE (dest); unsigned HOST_WIDE_INT mask; @@ -1057,85 +1057,14 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) bool subtargets; rtx subtarget; int one_match, zero_match, first_not__match; - - gcc_assert (mode == SImode || mode == DImode); - - /* Check on what type of symbol it is. */ - if (GET_CODE (imm) == SYMBOL_REF - || GET_CODE (imm) == LABEL_REF - || GET_CODE (imm) == CONST) -{ - rtx mem, base, offset; - enum aarch64_symbol_type sty; - - /* If we have (const (plus symbol offset)), separate out the offset -before we start classifying the symbol. 
*/ - split_const (imm, base, offset); - - sty = aarch64_classify_symbol (base, SYMBOL_CONTEXT_ADR); - switch (sty) - { - case SYMBOL_FORCE_TO_MEM: - if (offset != const0_rtx - targetm.cannot_force_const_mem (mode, imm)) - { - gcc_assert (can_create_pseudo_p ()); - base = aarch64_force_temporary (mode, dest, base); - base = aarch64_add_offset (mode, NULL, base, INTVAL (offset)); - aarch64_emit_move (dest, base); - return; - } - mem = force_const_mem (ptr_mode, imm); - gcc_assert (mem); - if (mode != ptr_mode) - mem = gen_rtx_ZERO_EXTEND (mode, mem); - emit_insn (gen_rtx_SET (VOIDmode, dest, mem)); - return; - -case SYMBOL_SMALL_TLSGD: -case SYMBOL_SMALL_TLSDESC: -case SYMBOL_SMALL_GOTTPREL: - case SYMBOL_SMALL_GOT: - case SYMBOL_TINY_GOT: - if (offset != const0_rtx) - { - gcc_assert(can_create_pseudo_p ()); - base = aarch64_force_temporary (mode, dest, base); - base = aarch64_add_offset (mode, NULL, base, INTVAL (offset)); - aarch64_emit_move (dest, base); - return; - } - /* FALLTHRU */ - -case SYMBOL_SMALL_TPREL: - case SYMBOL_SMALL_ABSOLUTE: - case SYMBOL_TINY_ABSOLUTE: - aarch64_load_symref_appropriately (dest, imm, sty); - return; - - default: - gcc_unreachable (); - } -} + int num_insns = 0; if (CONST_INT_P (imm) aarch64_move_imm (INTVAL (imm), mode)) { - emit_insn (gen_rtx_SET (VOIDmode, dest, imm)); - return; -} - - if (!CONST_INT_P (imm)) -{ - if (GET_CODE (imm) == HIGH) + if (generate) emit_insn (gen_rtx_SET (VOIDmode, dest, imm)); - else -{ - rtx mem = force_const_mem (mode, imm); -
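The patch turns the expander into a routine with a `generate` flag that returns an instruction count, so the same walk can serve both code generation and costing. A much simplified sketch of that idea follows. It is not the back end's algorithm: `mov_immediate_sketch` is a hypothetical name, the register is fixed to `x0`, and only the MOVZ/MOVK chunk-by-chunk strategy is modelled (other encodings a real back end knows about are ignored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Build IMM from 16-bit chunks: one MOVZ for the first non-zero
   chunk, one MOVK for each further non-zero chunk.  When GENERATE is
   true the instructions are written to OUT; in both modes the number
   of instructions is returned, so a cost query can pass false.  */
int mov_immediate_sketch(uint64_t imm, bool generate, FILE *out) {
  int num_insns = 0;
  assert(!generate || out != NULL);
  for (int shift = 0; shift < 64; shift += 16) {
    unsigned chunk = (unsigned)((imm >> shift) & 0xffff);
    if (chunk == 0)
      continue;
    if (generate)
      fprintf(out, "%s x0, #0x%x, lsl #%d\n",
              num_insns == 0 ? "movz" : "movk", chunk, shift);
    num_insns++;
  }
  if (num_insns == 0) {
    /* Zero still needs one instruction.  */
    if (generate)
      fprintf(out, "movz x0, #0x0\n");
    num_insns = 1;
  }
  return num_insns;
}
```

A cost hook would call `mov_immediate_sketch(value, false, NULL)` and use the returned count, while an expander would call it with `generate` set and a real output stream; keeping both in one function avoids the two drifting apart.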
Re: [ARM] RFA: Use new rtl iterators in arm_find_sub_rtx_with_code
On 05/11/14 11:49, Richard Sandiford wrote: I think these functions only want to iterate over instruction patterns rather than whole instructions (which would include things like REG_EQUAL notes), since only the patterns are relevant for finding dependencies. There's then no need to check for null rtxes. Tested by making sure there were no code changes for gcc.dg, gcc.c-torture and g++.dg for plain arm-linux-gnueabi and aarch64-linux-gnu. Ramana also asked me to try -mcpu=cortex-a7, -mcpu=cortex-a9, -mcpu=arm9tdmi and -mcpu=cortex-a15. There were differences in: gcc.c-torture/execute/20060110-2.c gcc.c-torture/execute/ashrdi-1.c and gcc.dg/tree-ssa/pr24627.c for -mcpu=cortex-a7 and no differences for the other combinations. The A7 differences were due to the way that arm_get_set_operands handles multi-set instructions such as: (set (reg:CC_C 100 cc) (compare:CC_C (plus:SI (reg:SI 8 r8 [orig:121 a ] [121]) (reg:SI 0 r0 [orig:122 b ] [122])) (reg:SI 8 r8 [orig:121 a ] [121]))) (set (reg:SI 2 r2 [orig:120 D.4117 ] [120]) (plus:SI (reg:SI 8 r8 [orig:121 a ] [121]) (reg:SI 0 r0 [orig:122 b ] [122]))) for_each_rtx iterates over the subrtxes in forward order, so arm_get_set_operands would pick the set of CC. The new iterator pushes the contents of a PARALLEL onto a stack and pulls them in reverse order, so arm_get_set_operands would pick the set of r2. This means that after the patch the code sees a producer/consumer relationship that it previously missed. I think the new behaviour is what was intended. This code shouldn't really be relying on a particular iteration order though. There's a dependency if any SET in the potential producer sets a register used by the potential consumer. I think any fix for that should be done separately from the iterator rewrite. OK to install? Thanks, Richard gcc/ * config/arm/aarch-common.c: Include rtl-iter.h. (search_term, arm_find_sub_rtx_with_search_term): Delete. (arm_find_sub_rtx_with_code): Use FOR_EACH_SUBRTX_VAR. 
(arm_get_set_operands): Pass the insn pattern rather than the insn itself. (arm_no_early_store_addr_dep): Likewise. OK. R. Index: gcc/config/arm/aarch-common.c === --- gcc/config/arm/aarch-common.c 2014-10-25 09:42:00.631168827 +0100 +++ gcc/config/arm/aarch-common.c 2014-10-25 09:51:24.212872553 +0100 @@ -30,6 +30,7 @@ #include tree.h #include c-family/c-common.h #include rtl.h +#include rtl-iter.h /* In ARMv8-A there's a general expectation that AESE/AESMC and AESD/AESIMC sequences of the form: @@ -68,13 +69,6 @@ aarch_crypto_can_dual_issue (rtx_insn *p return 0; } -typedef struct -{ - rtx_code search_code; - rtx search_result; - bool find_any_shift; -} search_term; - /* Return TRUE if X is either an arithmetic shift left, or is a multiplication by a power of two. */ bool @@ -96,68 +90,32 @@ static rtx_code shift_rtx_codes[] = { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT, ROTATERT, ZERO_EXTEND, SIGN_EXTEND }; -/* Callback function for arm_find_sub_rtx_with_code. - DATA is safe to treat as a SEARCH_TERM, ST. This will - hold a SEARCH_CODE. PATTERN is checked to see if it is an - RTX with that code. If it is, write SEARCH_RESULT in ST - and return 1. Otherwise, or if we have been passed a NULL_RTX - return 0. If ST.FIND_ANY_SHIFT then we are interested in - anything which can reasonably be described as a SHIFT RTX. */ -static int -arm_find_sub_rtx_with_search_term (rtx *pattern, void *data) -{ - search_term *st = (search_term *) data; - rtx_code pattern_code; - int found = 0; - - gcc_assert (pattern); - gcc_assert (st); - - /* Poorly formed patterns can really ruin our day. */ - if (*pattern == NULL_RTX) -return 0; - - pattern_code = GET_CODE (*pattern); - - if (st-find_any_shift) -{ - unsigned i = 0; - - /* Left shifts might have been canonicalized to a MULT of some - power of two. Make sure we catch them. 
*/ - if (arm_rtx_shift_left_p (*pattern)) - found = 1; - else - for (i = 0; i ARRAY_SIZE (shift_rtx_codes); i++) - if (pattern_code == shift_rtx_codes[i]) - found = 1; -} - - if (pattern_code == st-search_code) -found = 1; - - if (found) -st-search_result = *pattern; - - return found; -} - -/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE. */ +/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE. + If FIND_ANY_SHIFT then we are interested in anything which can + reasonably be described as a SHIFT RTX. */ static rtx arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift) { - search_term st; -
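The change in behaviour described in this message comes down to visiting order. The sketch below models it with a tiny tree rather than real RTL; `struct node`, `find_forward` and `find_with_stack` are hypothetical names, and the example pattern mirrors the two-SET PARALLEL quoted above (one SET of the condition codes, one SET of r2).

```c
#include <assert.h>
#include <stddef.h>

enum node_code { PARALLEL, SET, OTHER };

struct node {
  enum node_code code;
  const char *dest;           /* for SET: the register being set */
  const struct node *kids[2];
  size_t n_kids;
};

/* Recursive pre-order walk: children are visited first to last.  */
const struct node *find_forward(const struct node *x, enum node_code code) {
  if (x->code == code)
    return x;
  for (size_t i = 0; i < x->n_kids; i++) {
    const struct node *hit = find_forward(x->kids[i], code);
    if (hit)
      return hit;
  }
  return NULL;
}

/* Worklist walk: children are pushed in order and popped last-in
   first-out, so siblings are visited last to first.  */
const struct node *find_with_stack(const struct node *x, enum node_code code) {
  const struct node *stack[16];
  size_t top = 0;
  stack[top++] = x;
  while (top > 0) {
    const struct node *cur = stack[--top];
    if (cur->code == code)
      return cur;
    for (size_t i = 0; i < cur->n_kids; i++) {
      assert(top < 16);
      stack[top++] = cur->kids[i];
    }
  }
  return NULL;
}

/* A PARALLEL holding a SET of cc followed by a SET of r2.  */
static const struct node set_cc = { SET, "cc", { NULL, NULL }, 0 };
static const struct node set_r2 = { SET, "r2", { NULL, NULL }, 0 };
static const struct node insn_pattern = { PARALLEL, NULL, { &set_cc, &set_r2 }, 2 };
```

Asking either function for the first `SET` in `insn_pattern` gives different answers: the forward walk returns the SET of cc, the stack-based walk returns the SET of r2. Code that takes "the first match" therefore depends on the walk order, which is why the message suggests not relying on it.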
Re: [ARM] RFA: Use new rtl iterators in arm_tls_referenced_p
On 05/11/14 11:51, Richard Sandiford wrote:

Tested in the same way as the aarch-common.c patch. OK to install?

Thanks,
Richard

gcc/
        * config/arm/arm.c: Include rtl-iter.h.
        (arm_tls_referenced_p_1): Delete.
        (arm_tls_referenced_p): Use FOR_EACH_SUBRTX.

OK.

R.

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c 2014-11-02 19:59:27.588237213 +
+++ gcc/config/arm/arm.c 2014-11-05 11:48:55.030053470 +
@@ -82,6 +82,7 @@
 #include "gimple-expr.h"
 #include "builtins.h"
 #include "tm-constrs.h"
+#include "rtl-iter.h"

 /* Forward definitions of types.  */
 typedef struct minipool_node    Mnode;
@@ -8078,25 +8079,6 @@ thumb_legitimize_reload_address (rtx *x_
   return NULL;
 }

-/* Test for various thread-local symbols.  */
-
-/* Helper for arm_tls_referenced_p.  */
-
-static int
-arm_tls_operand_p_1 (rtx *x, void *data ATTRIBUTE_UNUSED)
-{
-  if (GET_CODE (*x) == SYMBOL_REF)
-    return SYMBOL_REF_TLS_MODEL (*x) != 0;
-
-  /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
-     TLS offsets, not real symbol references.  */
-  if (GET_CODE (*x) == UNSPEC
-      && XINT (*x, 1) == UNSPEC_TLS)
-    return -1;
-
-  return 0;
-}
-
 /* Return TRUE if X contains any TLS symbol references.  */

 bool
@@ -8105,7 +8087,19 @@ arm_tls_referenced_p (rtx x)
   if (! TARGET_HAVE_TLS)
     return false;

-  return for_each_rtx (&x, arm_tls_operand_p_1, NULL);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, x, ALL)
+    {
+      const_rtx x = *iter;
+      if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0)
+        return true;
+
+      /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
+         TLS offsets, not real symbol references.  */
+      if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_TLS)
+        iter.skip_subrtxes ();
+    }
+  return false;
 }

 /* Implement TARGET_LEGITIMATE_CONSTANT_P.
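The rewritten loop has two parts: stop as soon as a TLS symbol is seen, and do not descend into an UNSPEC_TLS wrapper. A standalone model of that control flow is sketched below. It deliberately avoids GCC's `rtx` types and the `FOR_EACH_SUBRTX` macro; `struct expr`, the `K_*` codes and `tls_referenced` are hypothetical names for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum kind { K_SYMBOL, K_UNSPEC_TLS, K_PLUS };

struct expr {
  enum kind kind;
  bool tls_symbol;             /* meaningful for K_SYMBOL only */
  const struct expr *ops[2];
  size_t n_ops;
};

/* Return true if ROOT contains a TLS symbol that is not wrapped in a
   K_UNSPEC_TLS node.  Uses an explicit worklist instead of a
   callback, and skips the operands of K_UNSPEC_TLS nodes.  */
bool tls_referenced(const struct expr *root) {
  const struct expr *stack[32];
  size_t top = 0;
  assert(root != NULL);
  stack[top++] = root;
  while (top > 0) {
    const struct expr *x = stack[--top];
    if (x->kind == K_SYMBOL && x->tls_symbol)
      return true;
    if (x->kind == K_UNSPEC_TLS)
      continue;  /* operands are offsets, not real references */
    for (size_t i = 0; i < x->n_ops; i++) {
      assert(top < 32);
      stack[top++] = x->ops[i];
    }
  }
  return false;
}

/* Example expressions.  */
static const struct expr tls_sym = { K_SYMBOL, true, { NULL, NULL }, 0 };
static const struct expr plain_sym = { K_SYMBOL, false, { NULL, NULL }, 0 };
static const struct expr wrapped = { K_UNSPEC_TLS, false, { &tls_sym, NULL }, 1 };
static const struct expr sum_wrapped = { K_PLUS, false, { &wrapped, &plain_sym }, 2 };
static const struct expr sum_bare = { K_PLUS, false, { &tls_sym, &plain_sym }, 2 };
```

`sum_bare` contains an unwrapped TLS symbol and is reported; `sum_wrapped` contains the same symbol only underneath the wrapper and is not. In the callback version that distinction was expressed through the return values 0, 1 and -1; here it is an early `return` and a `continue`.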
Re: [ARM] RFA: Use new rtl iterators in arm_cannot_copy_insn
On 05/11/14 11:52, Richard Sandiford wrote:

Tested in the same way as the aarch-common.c patch. OK to install?

Thanks,
Richard

gcc/
        * config/arm/arm.c (arm_note_pic_base): Delete.
        (arm_cannot_copy_insn_p): Use FOR_EACH_SUBRTX.

OK.

R.

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c 2014-11-05 11:48:55.030053470 +
+++ gcc/config/arm/arm.c 2014-11-05 11:48:57.406073646 +
@@ -13157,16 +13157,6 @@ tls_mentioned_p (rtx x)

 /* Must not copy any rtx that uses a pc-relative address.  */

-static int
-arm_note_pic_base (rtx *x, void *date ATTRIBUTE_UNUSED)
-{
-  if (GET_CODE (*x) == UNSPEC
-      && (XINT (*x, 1) == UNSPEC_PIC_BASE
-          || XINT (*x, 1) == UNSPEC_PIC_UNIFIED))
-    return 1;
-  return 0;
-}
-
 static bool
 arm_cannot_copy_insn_p (rtx_insn *insn)
 {
@@ -13175,7 +13165,16 @@ arm_cannot_copy_insn_p (rtx_insn *insn)
   if (recog_memoized (insn) == CODE_FOR_tlscall)
     return true;

-  return for_each_rtx (&PATTERN (insn), arm_note_pic_base, NULL);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
+    {
+      const_rtx x = *iter;
+      if (GET_CODE (x) == UNSPEC
+          && (XINT (x, 1) == UNSPEC_PIC_BASE
+              || XINT (x, 1) == UNSPEC_PIC_UNIFIED))
+        return true;
+    }
+  return false;
 }

 enum rtx_code
Re: [AArch64] RFA: Use new rtl iterators in arm_cannot_copy_insn
On 05/11/14 11:53, Richard Sandiford wrote:

This is part of a series to remove uses of for_each_rtx from the ports.

Tested by making sure there were no code changes for gcc.dg, gcc.c-torture and g++.dg for aarch64-linux-gnu. OK to install?

Thanks,
Richard

gcc/
        * config/aarch64/aarch64.c: Include rtl-iter.h.
        (aarch64_tls_operand_p_1): Delete.
        (aarch64_tls_operand_p): Use FOR_EACH_SUBRTX.

OK.

R.

Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c 2014-11-02 19:59:26.977231633 +
+++ gcc/config/aarch64/aarch64.c 2014-11-05 11:48:59.982095520 +
@@ -2791,28 +2791,23 @@ aarch64_output_mi_thunk (FILE *file, tre
   reload_completed = 0;
 }

-static int
-aarch64_tls_operand_p_1 (rtx *x, void *data ATTRIBUTE_UNUSED)
-{
-  if (GET_CODE (*x) == SYMBOL_REF)
-    return SYMBOL_REF_TLS_MODEL (*x) != 0;
-
-  /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
-     TLS offsets, not real symbol references.  */
-  if (GET_CODE (*x) == UNSPEC
-      && XINT (*x, 1) == UNSPEC_TLS)
-    return -1;
-
-  return 0;
-}
-
 static bool
 aarch64_tls_referenced_p (rtx x)
 {
   if (!TARGET_HAVE_TLS)
     return false;
-
-  return for_each_rtx (&x, aarch64_tls_operand_p_1, NULL);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, x, ALL)
+    {
+      const_rtx x = *iter;
+      if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0)
+        return true;
+      /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
+         TLS offsets, not real symbol references.  */
+      if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_TLS)
+        iter.skip_subrtxes ();
+    }
+  return false;
 }
Re: [PATCH] RTEMS: select SPARC multilibs
On November 7, 2014 2:40:43 AM CST, Eric Botcazou ebotca...@adacore.com wrote:

I think this would be good for 4.8, 4.9 and trunk.

2014-11-06  Daniel Hellstrom  dan...@gaisler.com

        * config.gcc (sparc-*-rtems*): Clean away unused t-elf
        * config/sparc/t-rtems: Add leon3v7 and muser-mode multilibs

OK everywhere as far as I'm concerned but the RTEMS folks have the final say.

Fine with me. Does sparc-elf need a refresh on its multilibs?

--joel
[PATCH] Fix PR63605
The following fixes a bogus folding when applied to vectors. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2014-11-07 Richard Biener rguent...@suse.de PR tree-optimization/63605 * fold-const.c (fold_binary_loc): Properly use element_precision for types that may not be scalar. * gcc.dg/vect/pr63605.c: New testcase. Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 217214) +++ gcc/fold-const.c(working copy) @@ -12817,7 +12729,7 @@ fold_binary_loc (location_t loc, tree arg00 = TREE_OPERAND (arg0, 0); tree arg01 = TREE_OPERAND (arg0, 1); tree itype = TREE_TYPE (arg00); - if (wi::eq_p (arg01, TYPE_PRECISION (itype) - 1)) + if (wi::eq_p (arg01, element_precision (itype) - 1)) { if (TYPE_UNSIGNED (itype)) { Index: gcc/testsuite/gcc.dg/vect/pr63605.c === --- gcc/testsuite/gcc.dg/vect/pr63605.c (revision 0) +++ gcc/testsuite/gcc.dg/vect/pr63605.c (working copy) @@ -0,0 +1,22 @@ +/* { dg-do run } */ + +#include tree-vect.h + +extern void abort (void); + +int a, b[8] = { 2, 0, 0, 0, 0, 0, 0, 0 }, c[8]; + +int +main () +{ + int d; + check_vect (); + for (; a 8; a++) +{ + d = b[a] 1; + c[a] = d != 0; +} + if (c[0] != 1) +abort (); + return 0; +}
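The guarded condition in the hunk compares a shift count against "precision minus one". As a hedged illustration, assume the fold in question rewrites `(x >> c) != 0` as a sign-bit test (the surrounding fold-const.c code is not quoted in the message, so this is an assumption). Such a rewrite only holds when `c` is one less than the width of the value being shifted, which for a vector means the width of one element. The sketch uses plain 32-bit scalars and hypothetical function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Left-hand side of the assumed rewrite: (x >> c) != 0.  */
bool shifted_nonzero(uint32_t x, unsigned c) {
  assert(c < 32);
  return (x >> c) != 0;
}

/* Right-hand side: the top bit of the 32-bit element is set.  */
bool sign_bit_set(uint32_t x) {
  return (x & UINT32_C(0x80000000)) != 0;
}

/* The two sides agree for every x only when C equals the element
   width minus one.  */
bool rewrite_is_valid(unsigned c, unsigned element_bits) {
  return c == element_bits - 1;
}
```

For `x = 2` and `c = 1`, `shifted_nonzero` is true while `sign_bit_set` is false, so applying the rewrite with the wrong precision changes the result; for `c = 31` both sides agree. That is the kind of mismatch a check based on a non-element precision could let through.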
Re: [Patch] Fix PR61889 for the w64-mingw32 case
Honza, Jan,

sorry for my late reply. I'm under heavy workload at the moment.

On 19.10.2014 12:02, Jan Hubicka wrote:

Honza, not sure if this patch is ideal, but this will unblock mingw build problems. Can this one get in?

It's important to have a solution for the mingw build problems!

Hmm, the patch is somewhat ugly and I do not know why MingW32 defines mkdir macro and how. If Kai Tietz or other MingW32 maintainer is OK about it, the patch is OK.

The first part of the patch changing gcov-tool.c is correct IMHO. For the second part the situation is more complex, see below. And no, mingw doesn't define the mkdir macro, it's in system.h, see below.

Honza

thanks, David

On Wed, Sep 24, 2014 at 8:22 AM, Rainer Emrich rai...@emrich-ebersheim.de wrote:

The following patch fixes PR61889 for x86_64-w64-mingw32. Details can be found on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61889

The patch was bootstrapped on x86_64-w64-mingw32. If the patch is ok, Kai would you apply, please?

Rainer

2014-09-24  Rainer Emrich  rai...@emrich-ebersheim.de

        PR gcov-profile/61889
        * gcc/gcov-tool.c: Remove wrong #if !defined(_WIN32)
        * libgcc/libgcov-driver-system.c: undefine clashing macro for mkdir

Index: gcc/gcov-tool.c
===================================================================
--- gcc/gcov-tool.c (revision 215554)
+++ gcc/gcov-tool.c (working copy)
@@ -89,11 +89,7 @@ gcov_output_files (const char *out, stru
   /* Try to make directory if it doesn't already exist.  */
   if (access (out, F_OK) == -1)
     {
-#if !defined(_WIN32)
       if (mkdir (out, S_IRWXU | S_IRWXG | S_IRWXO) == -1 && errno != EEXIST)
-#else
-      if (mkdir (out) == -1 && errno != EEXIST)
-#endif
         fatal_error ("Cannot make directory %s", out);
     }
   else
     unlink_profile_dir (out);

This part is correct IMHO. gcov-tool.c includes system.h which has:

/* Some systems have mkdir that takes a single argument.  */
#ifdef MKDIR_TAKES_ONE_ARG
# define mkdir(a,b) mkdir (a)
#endif

MKDIR_TAKES_ONE_ARG is defined for mingw!

Index: libgcc/libgcov-driver-system.c
===================================================================
--- libgcc/libgcov-driver-system.c (revision 215554)
+++ libgcc/libgcov-driver-system.c (working copy)
@@ -66,6 +66,9 @@ create_file_directory (char *filename)
 #ifdef TARGET_POSIX_IO
             mkdir (filename, 0755) == -1
 #else
+#ifdef mkdir
+#undef mkdir
+#endif
             mkdir (filename) == -1
 #endif
             /* The directory might have been made by another process.  */

For this part the situation is more complex. libgcov-driver-system.c is included by libgcov-driver.c, which is compiled in the gcc subdirectory for linking with gcov-tool and in the libgcc subdirectory for inclusion in the libgcov.a archive. libgcov-driver.c includes libgcov.h, which has the following:

#ifndef IN_GCOV_TOOL
/* About the target.  */
/* This path will be used by libgcov runtime.  */
#include "tconfig.h"
#include "tsystem.h"
#include "coretypes.h"
#include "tm.h"
#include "libgcc_tm.h"
.
.
.
#else /* IN_GCOV_TOOL */
/* About the host.  */
/* This path will be compiled for the host and linked into gcov-tool binary.  */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"

So, when compiling for gcov-tool, system.h gets included, which defines the offending mkdir macro. When compiling for libgcov.a, system.h is not included, so there's no issue.

I don't know how to solve this in a clean way. This design looks a little bit ugly.

Cheers

Rainer
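One way to keep the one-argument versus two-argument `mkdir` difference out of the callers is to hide it behind a single function instead of a macro that rewrites the call. The following is only a sketch of that idea and not a proposal for the GCC sources: `make_directory` is a hypothetical name, and the Windows branch assumes the C runtime's `_mkdir` from `<direct.h>`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

#ifdef _WIN32
# include <direct.h>
#endif

/* Create PATH.  Returns 0 when the directory was created or the path
   already existed, and -1 on any other failure (errno is left as set
   by the underlying call).  */
int make_directory(const char *path) {
  int rc;
  assert(path != NULL);
#ifdef _WIN32
  rc = _mkdir(path);        /* one-argument form */
#else
  rc = mkdir(path, 0755);   /* POSIX form with permission bits */
#endif
  if (rc == -1 && errno == EEXIST)
    return 0;               /* it might have been made by another process */
  return rc;
}
```

Because the platform choice is made inside the function, code that is compiled once for the host and once for the target sees the same call in both builds, and no `#undef` is needed at the call site.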
Re: [PATCH][RFC] Report pass we are ICEing in
On Fri, 7 Nov 2014, Richard Biener wrote:

+  if (current_pass && current_pass->name)
+    inform (UNKNOWN_LOCATION, "executing pass `%s'", current_pass->name);

%qs of course.

--
Joseph S. Myers
jos...@codesourcery.com
Re: [PATCH] PR 63721 IPA ICF cause atomic-comp-swap-release-acquire.c ICE
On 11/07/2014 10:52 AM, Jan Hubicka wrote:

On 11/05/14 07:09, Jiong Wang wrote:

The same ICE will happen on x86-64 if compiled with -O2 -fPIC. The reason is that the following two functions are identical, so the IPA-ICF pass tries to transform the second function to call the first one directly.

int atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (int a, int b)
{
  return __atomic_compare_exchange (&v, &a, &b, STRONG, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE);
}

int atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b)
{
  return __atomic_compare_exchange_n (&v, &a, b, STRONG, __ATOMIC_RELEASE, __ATOMIC_ACQUIRE);
}

During this transformation, it looks like something goes wrong with the function argument handling. Take a for example: because &a appears later, it is marked as addressable. After the transformation, if we turn the second function into

int atomic_compare_exchange_n_STRONG_RELEASE_ACQUIRE (int a, int b)
{
  return atomic_compare_exchange_STRONG_RELEASE_ACQUIRE (a, b);
}

then argument a is no longer addressable. So, in cgraph_node::release_body, when making the wrapper, besides clearing the function body we should also clear the addressable flag for the function args, because they are decided by the function body, which is cleared.

Bootstrap ok on x86-64 and no regression. Bootstrap ok on aarch64 juno. ICE gone away on arm and x86-64.

ok for trunk?

gcc/
PR tree-optimization/63721
* cgraph.c (cgraph_node::release_body): Clear addressable flag for function args.

While I understand the need to clear the addressable flag, I think release_body probably isn't the best place to do this. Seems to me that ought to happen when we emit the thunk or otherwise transform the body into something that doesn't take the address of those parameters.

Yep, I would just move it into expand_thunk - the TREE_ADDRESSABLE bits are not really well defined before we build the gimple body.

Honza

jeff

Hello.
I think the bug is a duplicate of PR63580 and there's working patch that can bootstrap on x86_64-linux and no regression has been seen. Ready for trunk? Thanks, Martin gcc/ChangeLog: 2014-11-07 Martin Liska mli...@suse.cz * cgraphunit.c (cgraph_node::create_wrapper): TREE_ADDRESSABLE is set to false for a newly created thunk. gcc/testsuite/ChangeLog: 2014-11-07 Martin Liska mli...@suse.cz * g++.dg/ipa/pr63580.C: New test. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 6f61f5c..89c96dc 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -2342,6 +2342,14 @@ cgraph_node::create_wrapper (cgraph_node *target) cgraph_edge *e = create_edge (target, NULL, 0, CGRAPH_FREQ_BASE); +tree arguments = DECL_ARGUMENTS (decl); + +while (arguments) + { + TREE_ADDRESSABLE (arguments) = false; + arguments = TREE_CHAIN (arguments); + } + expand_thunk (false, true); e-call_stmt_cannot_inline_p = true; diff --git a/gcc/testsuite/g++.dg/ipa/pr63580.C b/gcc/testsuite/g++.dg/ipa/pr63580.C new file mode 100644 index 000..904195a --- /dev/null +++ b/gcc/testsuite/g++.dg/ipa/pr63580.C @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-ipa-icf } */ + +struct A +{ +}; +template class L, class R A operator%(L, R); +template class A0, class A1, class A2, class A3 +void make_tuple (A0 , A1, A2, A3); +A +bar (int p1, char p2, int p3, double p4) +{ + A a; + make_tuple (p1, p2, p3, p4); + return int; char; string; double; % a; +} +A +foo (int p1, char p2, int p3, double p4) +{ + A b; + make_tuple (p1, p2, p3, p4); + return int; char; string; double; % b; +} + +/* { dg-final { scan-ipa-dump Equal symbols: 1 icf } } */ +/* { dg-final { cleanup-ipa-dump icf } } */
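The loop added to create_wrapper walks the chain of parameter declarations and clears a flag on each one. A toy model of that walk is sketched below; struct toy_parm and clear_addressable are hypothetical names used only for illustration and are not GCC's tree representation, which uses DECL_ARGUMENTS, TREE_CHAIN and TREE_ADDRESSABLE as shown in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a parameter declaration node.  This is an
   assumption for illustration, not GCC's real tree type.  */
struct toy_parm
{
  bool addressable;        /* plays the role of TREE_ADDRESSABLE */
  struct toy_parm *chain;  /* plays the role of TREE_CHAIN */
};

/* Clear the addressable flag on every parameter in the chain.
   Returns the number of parameters visited.  */
static int clear_addressable (struct toy_parm *args)
{
  int visited = 0;

  for (struct toy_parm *p = args; p != NULL; p = p->chain)
    {
      p->addressable = false;
      visited++;
    }

  return visited;
}
```

In the toy model, as in the patch, the flags are reset before the replacement body is built, so any flag set because the old body took a parameter's address does not survive into the wrapper.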
Re: [PATCH, testsuite, ARM] Check lr other than r3
On 03/11/14 08:18, Zhenqiang Chen wrote: Hi, pr45701-1.c FAIL for all tests. The patch updates it to check lr other than r3, based on the comments in arm_compute_save_reg_mask, /* ... Otherwise if we do not use the link register we do not need to save it. If we are pushing other registers onto the stack however, we can save an instruction in the epilogue by pushing the link register now and then popping it back into the PC. This incurs extra memory accesses though, so we only do it when optimizing for size, and only if we know that we will not need a fancy return sequence. */ The updated case PASS for Cortex-M0/M4 and Cortext-A15 (THUMB and ARM modes). OK for trunk? Thanks! -Zhenqiang testsuite/ChangeLog: 2014-11-03 Zhenqiang Chen zhenqiang.c...@arm.com * gcc.target/arm/pr45701-1.c: Check LR used. Have you checked that this doesn't cause regressions on ARMv4T? I suspect the code was originally intended to support targets where interworking was not trivial. R. diff --git a/gcc/testsuite/gcc.target/arm/pr45701-1.c b/gcc/testsuite/gcc.target/arm/pr45701-1.c index 2c690d5..c087cfc 100644 --- a/gcc/testsuite/gcc.target/arm/pr45701-1.c +++ b/gcc/testsuite/gcc.target/arm/pr45701-1.c @@ -1,7 +1,7 @@ /* { dg-do compile } */ /* { dg-skip-if { ! { arm_thumb1_ok || arm_thumb2_ok } } } */ /* { dg-options -mthumb -Os } */ -/* { dg-final { scan-assembler push\t\{r3 } } */ +/* { dg-final { scan-assembler lr\} } } */ /* { dg-final { scan-assembler-not r8 } } */ extern int hist_verify;
[PATCH] AIX: Filename-based shared library versioning for libgcc_s
Hi David (et al)! The upcoming initial release of gcc-5 feels like a good opportunity for the AIX port of gcc to introduce optional support for the important Linux-known feature I'd call filename-based shared library versioning, aka. SONAME. We have had some discussion in https://gcc.gnu.org/PR52623 already, and I've found it most descriptive for the configure option to read: --with-aix-soname=aix|svr4|both where 'aix' is the current situation (default), 'svr4' the new variant only, and 'both' for backwards compatibility - outlined in the install.texi diff. To allow for backwards compatibility at all, I'd provide the 'svr4' variant with 'runtime linking' enabled only, as this is true for Linux/SVR4 anyway. As I'm using the 'svr4' variant for a quite large portable C/C++ application already (via Gentoo Prefix) on AIX, I can tell this kind of filename-based versioning of shared libraries does really work as expected. Besides adding some documentation, attached patch is for libgcc_s only, as for other libraries I'd prefer to get the --with-aix-soname support via upstream libtool, where I don't have the 'both'-support ready yet. I have to rebase the existing 'svr4' patches anyway - currently these do --enable-aix-soname, found in https://github.com/haubi/libtool/compare/aix-soname But even if not ready for normal use yet, I'd love to see the gcc-5 release start introducing 'aix-soname' - allowing for more Linux with AIX ;) Thoughts? Thank you! /haubi/ From 9f6fd44eddf3b0c43f0472c172d6420b8b91b7db Mon Sep 17 00:00:00 2001 From: Michael Haubenwallner michael.haubenwall...@salomon.at Date: Fri, 16 Mar 2012 14:49:20 +0100 Subject: [PATCH] AIX: Filename-based shlib versioning for libgcc_s 2012-11-05 Michael Haubenwallner michael.haubenwall...@ssi-schaefer.com (libgcc_s) Optional filename-based shared library versioning on AIX. * gcc/doc/install.texi: Describe --with-aix-soname option. * Makefile.in (with_aix_soname): Define. 
* config/rs6000/t-slibgcc-aix: Support filename-based versioning. * configure.in: Accept --with-aix-soname=aix|svr4|both option. * configure: Recreate. --- gcc/doc/install.texi | 102 + libgcc/Makefile.in | 1 + libgcc/config/rs6000/t-slibgcc-aix | 86 +-- libgcc/configure | 28 ++ libgcc/configure.ac| 17 +++ 5 files changed, 219 insertions(+), 15 deletions(-) diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 3df78ff..161f7e5 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1414,6 +1414,102 @@ particularly useful if you intend to use several versions of GCC in parallel. This is currently supported by @samp{libgfortran}, @samp{libjava}, @samp{libstdc++}, and @samp{libobjc}. +@item @anchor{WithAixSoname}--with-aix-soname=@samp{aix}, @samp{svr4} or @samp{both} +Traditional AIX shared library versioning (versioned @code{Shared Object} +files as members of unversioned @code{Archive Library} files named +@samp{lib.a}) causes numerous headaches for package managers. However, +@code{Import Files} as members of @code{Archive Library} files allow for +@strong{filename-based versioning} of shared libraries as seen on Linux/SVR4, +where this is called the SONAME. But as they prevent static linking, +@code{Import Files} may be used with @code{Runtime Linking} only, where the +linker does search for @samp{libNAME.so} before @samp{libNAME.a} library +filenames with the @samp{-lNAME} linker flag. + +@anchor{AixLdCommand}For detailed information please refer to the AIX +@uref{http://www-01.ibm.com/support/knowledgecenter/search/%22the%20ld%20command%2C%20also%20called%20the%20linkage%20editor%20or%20binder%22,,ld +Command} reference. 
+ +As long as shared library creation is enabled, upon: +@table @code +@item --with-aix-soname=aix +@item --with-aix-soname=both + A (traditional AIX) @code{Shared Archive Library} file is created: + @itemize @bullet + @item using the @samp{libNAME.a} filename scheme + @item with the @code{Shared Object} file as archive member named + @samp{libNAME.so.V} (except for @samp{libgcc_s}, where the @code{Shared + Object} file is named @samp{shr.o} for backwards compatibility), which + @itemize @minus + @item is used for runtime loading from inside the @samp{libNAME.a} file + @item is used for dynamic loading via + @code{dlopen(libNAME.a(libNAME.so.V), RTLD_MEMBER)} + @item is used for shared linking + @item is used for static linking, so no separate @code{Static Archive + Library} file is needed + @end itemize + @end itemize +@item --with-aix-soname=both +@item --with-aix-soname=svr4 + A (second) @code{Shared Archive Library} file is created: + @itemize @bullet + @item using the @samp{libNAME.so.V} filename scheme + @item with the @code{Shared Object} file as archive member named + @samp{shr.o}, which + @itemize @minus + @item is created with the @code{-G linker flag}
Re: [AArch64, Docs, Patch] Add reference to ACLE in docs.
On 04/11/14 13:17, Tejas Belagod wrote: On 03/11/14 17:58, Joseph Myers wrote: On Mon, 3 Nov 2014, Tejas Belagod wrote: If I mention in a couple of sentences the level of ACLE support there is in GCC currently, this section will need to be updated every time there is an improvement in ACLE support - I guess we'll just have to remember to remove parts of this section as we do that. Yes, it's generally the case when adding new user-visible features that documentation needs updating. The release notes (gcc-N/changes.html in htdocs) should be updated for any significant new features as well. Thanks. The AArch64 ACLE CRC32 intrinsics were introduced in 4.9, so its not new in 5.0 - https://gcc.gnu.org/gcc-4.9/changes.html. But we've improved AArch64 NEON Intrinsics in 5.0 significantly which deserves a mention. I'll do that in a separate patch. Revised patch to fix extend.texi for ACLE attached. OK for trunk? Thanks, Tejas. 2014-11-04 Tejas Belagod tejas.bela...@arm.com gcc/ * Makefile.in (TEXI_GCC_FILES): Remove arm-acle-intrinsics.texi, arm-neon-intrinsics.texi, aarch64-acle-intrinsics.texi. * doc/aarch64-acle-intrinsics.texi: Remove. * doc/arm-acle-intrinsics.texi: Remove. * doc/arm-neon-intrinsics.texi: Remove. * doc/extend.texi: Consolidate sections AArch64 intrinsics, ARM NEON Intrinsics, ARM ACLE Intrinsics into one ARM C Language Extension section. Add references to public ACLE specification. +Extensions(ACLE) specification, which can be found at Space before parenthesis. +@node ARM C Language Extensions I think this should keep the ACLE in parenthesis, so +@node ARM C Language Extensions (ACLE) +back-ends support CRC32 intrinsics from arm_acle.h. ARM backend's 16-bit @file{} around arm_acle.h. s/ARM/The ARM/ OK with those changes. R.
Re: [ping] libatomic: Fix sub-word CAS synthesis on LP64 targets
On 11/06/2014 09:24 PM, Andrew Waterman wrote: 2014-10-23 Andrew Waterman water...@cs.berkeley.edu * cas_n.c (libat_compare_exchange): Add missing cast. Ok. r~
Re: [gomp4] Set default LIBGOMP_PLUGIN_PATH
On 04-11-14 23:34, Tom de Vries wrote: Thomas, this patch sets LIBGOMP_PLUGIN_PATH to the .libs dir in the build area, if LIBGOMP_PLUGIN_PATH has not been defined. This allows f.i. a gcc build without an accelerator configured, to automatically pick up the host_nonshm plugin. Updated for recent change where plugins in build area moved from libgomp/.libs to libgomp/plugin/.libs. OK for gomp-4_0-branch? Thanks, - Tom 2014-11-03 Tom de Vries t...@codesourcery.com * testsuite/libgomp.oacc-c++/c++.exp: Set default LIBGOMP_PLUGIN_PATH. Print used LIBGOMP_PLUGIN_PATH. * testsuite/libgomp.oacc-c/c.exp: Same. * testsuite/libgomp.oacc-fortran/fortran.exp: Same. --- libgomp/testsuite/libgomp.oacc-c++/c++.exp | 5 + libgomp/testsuite/libgomp.oacc-c/c.exp | 5 + libgomp/testsuite/libgomp.oacc-fortran/fortran.exp | 5 + 3 files changed, 15 insertions(+) diff --git a/libgomp/testsuite/libgomp.oacc-c++/c++.exp b/libgomp/testsuite/libgomp.oacc-c++/c++.exp index 9d5bf0b..c6dbdf0 100644 --- a/libgomp/testsuite/libgomp.oacc-c++/c++.exp +++ b/libgomp/testsuite/libgomp.oacc-c++/c++.exp @@ -73,6 +73,11 @@ if { $lang_test_file_found } { set libstdcxx_includes } +if { ![info exists env(LIBGOMP_PLUGIN_PATH)] } { + set env(LIBGOMP_PLUGIN_PATH) ${blddir}/plugin/.libs +} +puts Using LIBGOMP_PLUGIN_PATH $env(LIBGOMP_PLUGIN_PATH) + # Todo: get list of accelerators from configure options --enable-accelerator. 
set accels { nvidia host_nonshm } diff --git a/libgomp/testsuite/libgomp.oacc-c/c.exp b/libgomp/testsuite/libgomp.oacc-c/c.exp index 0c31447..265d428 100644 --- a/libgomp/testsuite/libgomp.oacc-c/c.exp +++ b/libgomp/testsuite/libgomp.oacc-c/c.exp @@ -39,6 +39,11 @@ set ld_library_path $always_ld_library_path append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST] set_ld_library_path_env_vars +if { ![info exists env(LIBGOMP_PLUGIN_PATH)] } { +set env(LIBGOMP_PLUGIN_PATH) ${blddir}/plugin/.libs +} +puts Using LIBGOMP_PLUGIN_PATH $env(LIBGOMP_PLUGIN_PATH) + # Todo: get list of accelerators from configure options --enable-accelerator. set accels { nvidia host_nonshm } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp b/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp index 312f947..346fb7d 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp +++ b/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp @@ -66,6 +66,11 @@ if { $lang_test_file_found } { append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST] set_ld_library_path_env_vars +if { ![info exists env(LIBGOMP_PLUGIN_PATH)] } { + set env(LIBGOMP_PLUGIN_PATH) ${blddir}/plugin/.libs +} +puts Using LIBGOMP_PLUGIN_PATH $env(LIBGOMP_PLUGIN_PATH) + # Todo: get list of accelerators from configure options --enable-accelerator. set accels { nvidia host_nonshm } -- 1.9.1
Re: [PATCH] PR 63721 IPA ICF cause atomic-comp-swap-release-acquire.c ICE
Hello.

I think the bug is a duplicate of PR63580 and there's a working patch that can bootstrap on x86_64-linux and no regression has been seen. Ready for trunk?

Thanks, Martin

gcc/ChangeLog:

2014-11-07 Martin Liska mli...@suse.cz

* cgraphunit.c (cgraph_node::create_wrapper): TREE_ADDRESSABLE is set to false for a newly created thunk.

OK, thanks!

Honza

gcc/testsuite/ChangeLog:

2014-11-07 Martin Liska mli...@suse.cz

* g++.dg/ipa/pr63580.C: New test.
Re: [patch] Provide a can_compare_and_swap_p target hook.
On 11/07/2014 04:31 AM, Andrew Haley wrote:

On 06/11/14 19:05, Andrew MacLeod wrote:

1) Given that the compiler *always* provides support via libatomic now (even if it is via locks), does that mean that VMSupportsCS8_builtin() should always return true? Or should we map that to a call to __atomic_always_lock_free()? (That always gets folded to true or false at compile time.) My guess is the latter?

Perhaps so. The problem is that some targets can't do CAS on 64-bit doublewords.

With libatomic present, I believe they always can, even if it drops to a lock implementation.

I'm sorry, I really can't remember. I can't think of any reason to want to turn off builtin support. You have to remember that all this was written when our support for atomic builtins was seriously flaky and we would just punt back to the user anything we hadn't written yet.

No worries, I can't remember why I did something last year, let alone 8 years ago :-) I'll take a best stab and we'll see what happens :-)

Andrew
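The distinction raised in this exchange, between "a CAS is available somehow" and "a CAS is lock-free on this target", can be sketched with the GCC __atomic builtins. This is a minimal illustration under assumptions: supports_cas8 and cas8 are hypothetical helper names, not the libjava functions being discussed. On a target where 8-byte objects are not lock-free, the compare-exchange call may be lowered to a libatomic routine and need -latomic at link time.

```c
#include <stdint.h>

/* Nonzero if an 8-byte atomic is always lock-free on this target.
   The builtin folds to a constant at compile time; a null pointer
   as the second argument asks about typically aligned objects.  */
static int supports_cas8 (void)
{
  return __atomic_always_lock_free (sizeof (int64_t), 0);
}

/* Strong compare-and-swap on a 64-bit value.  Returns nonzero if
   *addr held EXPECTED and was replaced by DESIRED.  */
static int cas8 (int64_t *addr, int64_t expected, int64_t desired)
{
  return __atomic_compare_exchange_n (addr, &expected, desired,
                                      0 /* strong */,
                                      __ATOMIC_SEQ_CST,
                                      __ATOMIC_SEQ_CST);
}
```

With this split, a caller that only needs correctness can use cas8 unconditionally, while a caller that cares about lock-freedom can consult supports_cas8 first.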
Re: [Patch AArch64] Fix PR 63724 - Improve immediate generation
On 11/07/2014 01:02 PM, Ramana Radhakrishnan wrote:

+  *cost = COSTS_N_INSNS (aarch64_internal_mov_immediate
+                         (gen_rtx_REG (mode, 0), x, false));
   }

Can't you pass NULL for the register when generate is false?

r~
Re: [PATCH] RTEMS: select SPARC multilibs
Hi,

Thanks for the review and testing! Then I will apply it.

Thanks for bringing up sparc-elf. I think we would want the LEON3 in the t-elf; however, I have to verify the LEON newlib C state and mflat etc. before activating that.

DanielH

On 11/07/2014 01:25 PM, Joel Sherrill wrote:

On November 7, 2014 2:40:43 AM CST, Eric Botcazou ebotca...@adacore.com wrote:

I think this would be good for 4.8, 4.9 and trunk.

2014-11-06 Daniel Hellstrom dan...@gaisler.com

* config.gcc (sparc-*-rtems*): Clean away unused t-elf
* config/sparc/t-rtems: Add leon3v7 and muser-mode multilibs

OK everywhere as far as I'm concerned but the RTEMS folks have the final say.

Fine with me. Does sparc-elf need a refresh on its multilibs?

--joel
Re: [PATCH] PR 63721 IPA ICF cause atomic-comp-swap-release-acquire.c ICE
On 07/11/14 12:53, Martin Liška wrote:

I think the bug is a duplicate of PR63580 and there's a working patch that can bootstrap on x86_64-linux and no regression has been seen.

Thanks, looks good to me, although I think expand_thunk is a better place for the fix, as it already has a loop over the arguments:

for (arg = a; arg; arg = DECL_CHAIN (arg))
  nargs++;

Ready for trunk? Thanks, Martin
[Ada] Fixed-point multiplication in with no floating point
In the general case, a multiplication of two fixed-point values that yield an integer type requires the use of floating point operations. When the types of the operands are identical it is possible to avoid their use by introducing a temporary of the same type, and performing a conversion to integer in a separate step. Transformation is useful when floating-point is unavailable on the target. The following must compile quietly: gcc -c -gnatG t.adb | grep universal_real --- pragma Restrictions (No_Floating_Point); Package T is type T_Real_64_Bis is delta 2.0 ** (-32) range -(2.0 ** 31) .. (2.0 ** 31) - 2.0 ** (-32); procedure Add_Duration_Tst (Dur : in T_Real_64_Bis); end T; --- Package body T is procedure Add_Duration_Tst (Dur : in T_Real_64_Bis) is c_To_us: constant T_Real_64_Bis:= 1_000_000.0; c_Duration_us1 : constant Integer := Integer (Dur * C_To_Us); begin null; end Add_Duration_Tst; end T; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-11-07 Ed Schonberg schonb...@adacore.com * exp_fixd.adb (Expand_Multiply_Fixed_By_Fixed_Giving_Integer): If the restriction No_Floating_Point is in effect, and the operands have the same type, introduce a temporary to hold the fixed point result, to prevent the use of floating-point operations at run-time. Index: exp_fixd.adb === --- exp_fixd.adb(revision 217215) +++ exp_fixd.adb(working copy) @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 1992-2013, Free Software Foundation, Inc. -- +-- Copyright (C) 1992-2014, Free Software Foundation, Inc. 
-- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -29,6 +29,8 @@ with Exp_Util; use Exp_Util; with Nlists; use Nlists; with Nmake;use Nmake; +with Restrict; use Restrict; +with Rident; use Rident; with Rtsfind; use Rtsfind; with Sem; use Sem; with Sem_Eval; use Sem_Eval; @@ -2214,13 +2216,41 @@ --- procedure Expand_Multiply_Fixed_By_Fixed_Giving_Integer (N : Node_Id) is - Left : constant Node_Id := Left_Opnd (N); - Right : constant Node_Id := Right_Opnd (N); + Loc : constant Source_Ptr := Sloc (N); + Left : constant Node_Id:= Left_Opnd (N); + Right : constant Node_Id:= Right_Opnd (N); + begin if Etype (Left) = Universal_Real then Do_Multiply_Fixed_Universal (N, Left = Right, Right = Left); + elsif Etype (Right) = Universal_Real then Do_Multiply_Fixed_Universal (N, Left, Right); + + -- If both types are equal and we need to avoid floating point + -- instructions, it's worth introducing a temporary with the + -- common type, because it may be evaluated more simply without + -- the need for run-time use of floating point. + + elsif Etype (Right) = Etype (Left) +and then Restriction_Active (No_Floating_Point) + then + declare +Temp : constant Entity_Id := Make_Temporary (Loc, 'F'); +Mult : constant Node_Id := Make_Op_Multiply (Loc, Left, Right); +Decl : constant Node_Id := + Make_Object_Declaration (Loc, +Defining_Identifier = Temp, +Object_Definition = New_Occurrence_Of (Etype (Right), Loc), +Expression = Mult); + + begin +Insert_Action (N, Decl); +Rewrite (N, + OK_Convert_To (Etype (N), New_Occurrence_Of (Temp, Loc))); +Analyze_And_Resolve (N, Standard_Integer); + end; + else Do_Multiply_Fixed_Fixed (N); end if;
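The idea behind the transformation, multiplying two values of the same fixed-point type into a temporary of that type and only then converting to integer, can be illustrated with plain integer arithmetic. The sketch below is a C analogue under stated assumptions, not the GNAT expansion: it uses a Q16.16 format rather than the 2.0 ** (-32) delta of the test case, the names q16_16, q_mul and q_to_int are hypothetical, and the final conversion rounds to nearest with ties away from zero.

```c
#include <stdint.h>

/* Q16.16 fixed point: the stored integer is the value times 65536.
   Chosen for illustration only; the Ada test case uses a finer
   delta and a 64-bit representation.  */
typedef int32_t q16_16;
#define Q_ONE 65536

/* Multiply two Q16.16 values into a Q16.16 result using only
   integer arithmetic.  The 64-bit intermediate holds the full
   product before rescaling; the caller must keep the result in
   range of the 32-bit type.  */
static q16_16 q_mul (q16_16 a, q16_16 b)
{
  int64_t wide = (int64_t) a * (int64_t) b;
  return (q16_16) (wide / Q_ONE);
}

/* Convert a Q16.16 value to an integer as a separate step,
   rounding to nearest with ties away from zero.  */
static int32_t q_to_int (q16_16 x)
{
  int64_t wide = x;

  if (wide >= 0)
    return (int32_t) ((wide + Q_ONE / 2) / Q_ONE);
  return (int32_t) -((-wide + Q_ONE / 2) / Q_ONE);
}
```

For example, with 2.5 stored as 163840 and 3.0 stored as 196608, q_mul yields the representation of 7.5, and q_to_int then produces 8 without any floating-point operation being involved.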
Re: [PATCH] FreeBSD arm support, EABI.
On 02/11/14 22:11, Andreas Tobler wrote: Hello all, this is a patch which brings support for arm*-*-freebsd* to trunk. The architectures supported are arm-*-*freebsd*, armv6-*-freebsd* and armv6hf-*-freebsd*. armv6 stands for ARM_ARCH == 6, arm stands for ARM_ARCH 6. There is kernel development for armv8 aka. aarch64 ongoing but this is not covered here. This patch only covers 32-bit arm in a basic manner. The patch is built and tested against armv6, armv6hf and arm. The former two tests (lots of) were done on a WANDBOARD-QUAD, the latter on a MARVELL board with 256MB Ram and 200MHz cpu (around 72h+ for a build and test.) Results for armv6hf are on the list. Only one entry but locally I ran several dozens runs... Once if this patch is accepted a few test suite additions will follow. (arm*-*-*eabi* - arm_eabi) The patch itself is also prepared for arm*eb*-*-freebsd*, but I could not test since I lack the HW. I appreciate comments, questions and also an ack if this patch is ok for trunk. TIA, Andreas toplevel: 2014-11-02 Andreas Tobler andre...@gcc.gnu.org * configure.ac: Don't add ${libgcj} for arm*-*-freebsd*. * configure: Regenerate. gcc: 2014-11-02 Andreas Tobler andre...@gcc.gnu.org * config.gcc (arm*-*-freebsd*): New configuration. * config/arm/freebsd.h: New file. * config.host: A extras components for arm*-*-freebsd*. * config/arm/arm.c (arm_init_libfuncs): FreeBSD does not support 8 byte atomics for __ARM_ARCH__ 6 yet. (arm_option_override): FreeBSD has not yet implemented unaligned access. libgcc: 2014-11-02 Andreas Tobler andre...@gcc.gnu.org * config.host (arm*-*-freebsd*): Add new configuration for arm*-*-freebsd*. * config/arm/freebsd-atomic.c: New file. * config/arm/t-freebsd: Likewise. * config/arm/unwind-arm.h: Add __FreeBSD__ to the list of 'PC-relative indirect' OS's. libstdc++: 2014-11-02 Andreas Tobler andre...@gcc.gnu.org * configure.host: Add arm*-*-freebsd* port_specific_symbol_files. This mostly looks OK, but a couple of nits. 
Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c (revision 217020) +++ gcc/config/arm/arm.c (working copy) @@ -2202,7 +2202,11 @@ { /* For Linux, we have access to kernel support for atomic operations. */ if (arm_abi == ARM_ABI_AAPCS_LINUX) +#ifndef __FreeBSD__ init_sync_libfuncs (2 * UNITS_PER_WORD); +#else +init_sync_libfuncs (UNITS_PER_WORD); +#endif This would be better handled by some refactoring, so that we can eliminate the conditionalized code in the main function. Define something like MAX_SYNC_LIBFUNC_SIZE and then override it in the FreeBSD-specific header. @@ -3036,6 +3040,9 @@ warning (0, target CPU does not support unaligned accesses); unaligned_access = 0; } +#ifdef __FreeBSD__ + unaligned_access = 0; +#endif This really should be fixed in the OS; you're not really supporting the architecture properly if you don't allow this on v6 or later. In the mean time, the code should be moved to SUBTARGET_OVERRIDE_OPTIONS. Index: gcc/config/arm/freebsd.h === --- gcc/config/arm/freebsd.h (revision 0) +++ gcc/config/arm/freebsd.h (working copy) + +/* Use the AAPCS type for wchar_t, override the one from config/freebsd.h. */ +#undef WCHAR_TYPE +#define WCHAR_TYPE (TARGET_AAPCS_BASED ? unsigned int : long int) I don't think you should really be targeting anything that is not AAPCS based; so this should surely collapse to 'unsigned int'.
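The first review comment, replacing the #ifndef __FreeBSD__ block in the main function with an overridable macro, could look roughly like the sketch below. It is only an illustration of the pattern: MAX_SYNC_LIBFUNC_SIZE is the name suggested in the review, while the UNITS_PER_WORD value and the sync_libfunc_limit helper are assumptions made so the fragment stands on its own.

```c
/* Hypothetical value for illustration; in GCC, UNITS_PER_WORD comes
   from the target headers.  */
#define UNITS_PER_WORD 4

/* An OS-specific target header would define the limit before the
   generic default below is seen.  Uncommenting this line models
   such an override:  */
/* #define MAX_SYNC_LIBFUNC_SIZE UNITS_PER_WORD */

/* Generic default, used when no OS header has overridden it.  */
#ifndef MAX_SYNC_LIBFUNC_SIZE
#define MAX_SYNC_LIBFUNC_SIZE (2 * UNITS_PER_WORD)
#endif

/* Hypothetical helper standing in for the call site, which would
   pass the macro to the libfunc setup with no #ifdef of its own.  */
static int sync_libfunc_limit (void)
{
  return MAX_SYNC_LIBFUNC_SIZE;
}
```

The call site then contains no conditional code, and the choice of limit is made by whichever target header is in effect rather than by a predefined macro of the compiler building GCC.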
[Ada] Ghost legality rules and SPARK_Mode
This patch decouples the semantic and legality rules of Ghost entities from the presence of aspect/pragma SPARK_Mode. This way non-SPARK code can utilize Ghost annotations. -- Source -- -- ghost_decl.ads package Ghost_Decl is X : Integer := 0 with Ghost; Y : Integer := X; end Ghost_Decl; -- Compilation and output -- $ gcc -c ghost_decl.ads ghost_decl.ads:3:19: ghost entity cannot appear in this context (SPARK RM 6.9(12)) Tested on x86_64-pc-linux-gnu, committed on trunk 2014-11-07 Hristian Kirtchev kirtc...@adacore.com * freeze.adb (Freeze_Entity): Issue an error regardless of the SPARK_Mode when a ghost type is effectively volatile. * sem_ch3.adb (Analyze_Object_Contract): Decouple the checks related to Ghost from SPARK_Mode. * sem_res.adb (Check_Ghost_Policy): Issue an error regardless of the SPARK_Mode when the Ghost policies do not match. * sem_util.adb (Check_Ghost_Completion): Issue an error regardless of the SPARK_Mode when the Ghost policies do not match. Index: sem_ch3.adb === --- sem_ch3.adb (revision 217215) +++ sem_ch3.adb (working copy) @@ -3185,24 +3185,22 @@ Obj_Id); end if; end if; + end if; -if Is_Ghost_Entity (Obj_Id) then + if Is_Ghost_Entity (Obj_Id) then - -- A Ghost object cannot be effectively volatile - -- (SPARK RM 6.9(8)). +-- A Ghost object cannot be effectively volatile (SPARK RM 6.9(8)) - if Is_Effectively_Volatile (Obj_Id) then - SPARK_Msg_N (ghost variable cannot be volatile, Obj_Id); +if Is_Effectively_Volatile (Obj_Id) then + Error_Msg_N (ghost variable cannot be volatile, Obj_Id); - -- A Ghost object cannot be imported or exported - -- (SPARK RM 6.9(8)). 
+-- A Ghost object cannot be imported or exported (SPARK RM 6.9(8)) - elsif Is_Imported (Obj_Id) then - SPARK_Msg_N (ghost object cannot be imported, Obj_Id); +elsif Is_Imported (Obj_Id) then + Error_Msg_N (ghost object cannot be imported, Obj_Id); - elsif Is_Exported (Obj_Id) then - SPARK_Msg_N (ghost object cannot be exported, Obj_Id); - end if; +elsif Is_Exported (Obj_Id) then + Error_Msg_N (ghost object cannot be exported, Obj_Id); end if; end if; @@ -3256,10 +3254,10 @@ if Is_Ghost_Entity (Obj_Id) then if Is_Exported (Obj_Id) then -SPARK_Msg_N (ghost object cannot be exported, Obj_Id); +Error_Msg_N (ghost object cannot be exported, Obj_Id); elsif Is_Imported (Obj_Id) then -SPARK_Msg_N (ghost object cannot be imported, Obj_Id); +Error_Msg_N (ghost object cannot be imported, Obj_Id); end if; end if; end Analyze_Object_Contract; @@ -4788,8 +4786,6 @@ when Class_Wide_Kind = Set_Ekind(Id, E_Class_Wide_Subtype); - Set_First_Entity (Id, First_Entity (T)); - Set_Last_Entity (Id, Last_Entity(T)); Set_Class_Wide_Type (Id, Class_Wide_Type(T)); Set_Cloned_Subtype (Id, T); Set_Is_Tagged_Type (Id, True); Index: freeze.adb === --- freeze.adb (revision 217223) +++ freeze.adb (working copy) @@ -2398,6 +2398,24 @@ Set_Has_Non_Standard_Rep (Base_Type (Arr), True); Set_Is_Bit_Packed_Array (Base_Type (Arr), True); Set_Is_Packed(Base_Type (Arr), True); + +-- Make sure that we have the necessary routines to +-- implement the packing, and complain now if not. + +declare + CS : constant Int := UI_To_Int (Csiz); + RE : constant RE_Id := Get_Id (CS); + +begin + if RE /= RE_Null + and then not RTE_Available (RE) + then + Error_Msg_CRT +(packing of UI_Image (Csiz) + -bit components, + First_Subtype (Etype (Arr))); + end if; +end; end if; end; end if; @@ -2650,37 +2668,6 @@ Create_Packed_Array_Impl_Type (Arr);
[Ada] Rejecting properly illegal iterator
This patch removes an infinite loop in the compiler, when an Ada 2012 iterator is attempted over an object without proper iterable aspects, and the code is compiled in SPARK mode. Compiling iter.adb must yield: iter.adb:12:21: cannot iterate over R iter.adb:12:21: to iterate directly over the elements of a container, write of Obj --- procedure Iter is pragma SPARK_Mode (On); type R is record X, Y, Z : Integer; end record; Obj : R; function Sum (X : R) return Integer is Result : Integer := 0; begin return Result : Integer := 0 Do for Val in Obj loop Result := Result + Val; end loop; end return; end; begin if Sum (Obj) /= 0 then null; end if; end; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-11-07 Ed Schonberg schonb...@adacore.com * sem_ch5.adb (Analyze_Iterator_Specification): return if name in iterator does not have any usable aspect for iteration. Index: sem_ch5.adb === --- sem_ch5.adb (revision 217215) +++ sem_ch5.adb (working copy) @@ -2063,6 +2063,10 @@ Error_Msg_NE (\to iterate directly over the elements of a container, write `of `, Name (N), Original_Node (Name (N))); + + -- No point in continuing analysis of iterator spec. + + return; end if; end if;
[Ada] Reject illegal null procedure
In Ada 2012 a null procedure can be a completion, but it cannot be the
completion of a previous null procedure with the same profile.

Compiling p.adb must yield:

   p.adb:7:04: duplicate body for Q declared at p.ads:6
   p.adb:12:04: duplicate body for Q1 declared at p.ads:7

---
package P is
   function F return Boolean;
   procedure Q is null;
   procedure Q1 is null;
end P;
---
package body P is
   function F return Boolean is (True);

   procedure Q is
   begin
      null;
   end Q;

   procedure Q1 is null;
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-07  Ed Schonberg  schonb...@adacore.com

    * sem_ch6.adb (Analyze_Null_Procedure): Reject a null procedure
    when there is a previous null procedure in scope with a matching
    profile.

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 217215)
+++ sem_ch6.adb (working copy)
@@ -1453,6 +1453,11 @@
          --  there are various error checks that are applied on this body
          --  when it is analyzed (e.g. correct aspect placement).
 
+         if Has_Completion (Prev) then
+            Error_Msg_Sloc := Sloc (Prev);
+            Error_Msg_NE ("duplicate body for & declared#", N, Prev);
+         end if;
+
          Is_Completion := True;
          Rewrite (N, Null_Body);
          Analyze (N);
Re: [gomp4] Fix libgomp-oacc.c/lib-66.c testcase
On 04-11-14 23:46, Tom de Vries wrote: Thomas, This patch fixes the libgomp-oacc.c/lib-66.c testcase. It allows the test to run for non-shared mem accelerators, and skips the test otherwise. Fixed path in log message (testsuite/libgomp.oacc-c/lib-66.c - testsuite/libgomp.oacc-c-c++-common/lib-66.c). ok for gomp-4_0-branch? Thanks, - Tom 2014-11-03 Tom de Vries t...@codesourcery.com * testsuite/libgomp.oacc-c-c++-common/lib-66.c: Skip for shared memory accelerators. (main): Use acc_device_default instead of acc_device_nvidia. --- libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c index 360c05b..398dc2a 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c @@ -1,4 +1,5 @@ /* { dg-do run } */ +/* { dg-skip-if { *-*-* } { * } { -DACC_MEM_SHARED=0 } } */ #include string.h #include stdlib.h @@ -12,7 +13,7 @@ main (int argc, char **argv) unsigned char *h; void *d; - acc_init (acc_device_nvidia); + acc_init (acc_device_default); h = (unsigned char *) malloc (N); @@ -41,7 +42,7 @@ main (int argc, char **argv) free (h); - acc_shutdown (acc_device_nvidia); + acc_shutdown (acc_device_default); return 0; } -- 1.9.1
[Ada] Use of Ghost actuals in Ghost subprogram calls
This patch adds a check to ensure that the actual parameter of a Ghost
subprogram call whose formal is of mode IN OUT or OUT is Ghost.

------------
-- Source --
------------

--  ghost_procs.ads

package Ghost_Procs is
   procedure In_Proc     (Formal :        Integer) with Ghost;
   procedure In_Out_Proc (Formal : in out Integer) with Ghost;
   procedure Out_Proc    (Formal :    out Integer) with Ghost;
end Ghost_Procs;

--  ghost_procs.adb

package body Ghost_Procs is
   procedure In_Proc (Formal : Integer) is
   begin
      null;
   end In_Proc;

   procedure In_Out_Proc (Formal : in out Integer) is
   begin
      null;
   end In_Out_Proc;

   procedure Out_Proc (Formal : out Integer) is
   begin
      null;
   end Out_Proc;
end Ghost_Procs;

--  ghost_params.adb

with Ghost_Procs; use Ghost_Procs;

procedure Ghost_Params is

   --  6.9(13) - An out or in out mode actual parameter in a call to a ghost
   --  subprogram shall be a ghost variable.

   Ghost_Obj : Integer := 1 with Ghost;
   Obj       : Integer := 2;

begin
   In_Proc (Ghost_Obj);      --  OK
   In_Proc (Obj);            --  OK
   In_Out_Proc (Ghost_Obj);  --  OK
   In_Out_Proc (Obj);        --  Error
   Out_Proc (Ghost_Obj);     --  OK
   Out_Proc (Obj);           --  Error
end Ghost_Params;

----------------------------
-- Compilation and output --
----------------------------

$ gcc -c ghost_params.adb
ghost_params.adb:15:17: non-ghost variable Obj cannot appear as actual in
  call to ghost procedure
ghost_params.adb:15:17: corresponding formal has mode in out
ghost_params.adb:17:14: non-ghost variable Obj cannot appear as actual in
  call to ghost procedure
ghost_params.adb:17:14: corresponding formal has mode out

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-07  Hristian Kirtchev  kirtc...@adacore.com

    * sem_ch3.adb (Analyze_Object_Declaration): Update references to
    SPARK RM.
    (Process_Full_View): Update references to SPARK RM.
    * sem_ch6.adb (Analyze_Generic_Subprogram_Body): Update references
    to SPARK RM.
    (Analyze_Subprogram_Body_Helper): Update references to SPARK RM.
    * sem_ch7.adb (Analyze_Package_Body_Helper): Update references to
    SPARK RM.
    * sem_prag.adb (Check_Ghost_Constituent): Update references to
    SPARK RM.
* sem_res.adb (Check_Ghost_Policy): Update references to SPARK RM. (Resolve_Actuals): Ensure that the actual parameter of a Ghost subprogram whose formal is of mode IN OUT or OUT is Ghost. * sem_util.adb (Check_Ghost_Completion): Update references to SPARK RM. Index: sem_ch3.adb === --- sem_ch3.adb (revision 217225) +++ sem_ch3.adb (working copy) @@ -3925,7 +3925,7 @@ -- The Ghost policy in effect at the point of declaration -- and at the point of completion must match - -- (SPARK RM 6.9(14)). + -- (SPARK RM 6.9(15)). if Present (Prev_Entity) and then Is_Ghost_Entity (Prev_Entity) @@ -4112,7 +4112,7 @@ Set_Is_Ghost_Entity (Id); -- The Ghost policy in effect at the point of declaration and at the - -- point of completion must match (SPARK RM 6.9(14)). + -- point of completion must match (SPARK RM 6.9(16)). if Present (Prev_Entity) and then Is_Ghost_Entity (Prev_Entity) then Check_Ghost_Completion (Prev_Entity, Id); @@ -19786,7 +19786,7 @@ Set_Is_Ghost_Entity (Full_T); -- The Ghost policy in effect at the point of declaration and at the - -- point of completion must match (SPARK RM 6.9(14)). + -- point of completion must match (SPARK RM 6.9(15)). Check_Ghost_Completion (Priv_T, Full_T); Index: sem_ch7.adb === --- sem_ch7.adb (revision 217215) +++ sem_ch7.adb (working copy) @@ -735,7 +735,7 @@ Set_Is_Ghost_Entity (Body_Id); -- The Ghost policy in effect at the point of declaration and at the - -- point of completion must match (SPARK RM 6.9(14)). + -- point of completion must match (SPARK RM 6.9(15)). Check_Ghost_Completion (Spec_Id, Body_Id); end if; Index: sem_prag.adb === --- sem_prag.adb(revision 217226) +++ sem_prag.adb(working copy) @@ -23473,7 +23473,7 @@ -- The Ghost policy in effect at the point of abstract -- state declaration and constituent must match - -- (SPARK RM 6.9(15)). + -- (SPARK RM
[Ada] Lower severity of the program's return value in some common cases
In some common use cases, gnatls used to return the E_Fatal error code,
which was unfriendly for analysis. It has been changed as follows:

   gnatls -v and a default runtime exists, or
   gnatls -v --RTS=<good runtime path>       - now returns E_Success
   gnatls -v and no default runtime exists   - now returns E_Warnings
   gnatls, gnatls -h                         - now returns E_Success

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-07  Vasiliy Fofanov  fofa...@adacore.com

    * gnatls.adb: Lower severity of the program's return value in
    some common cases.

Index: gnatls.adb
===
--- gnatls.adb (revision 217215)
+++ gnatls.adb (working copy)
@@ -1663,6 +1663,7 @@
               ("Default runtime not available. Use --RTS= with a valid runtime");
             Write_Eol;
             Write_Eol;
+            Exit_Status := E_Warnings;
          end if;
 
          Write_Str ("Source Search Path:");
@@ -1775,10 +1776,11 @@
             Usage;
          else
             Try_Help;
+            Exit_Status := E_Fatal;
          end if;
       end if;
 
-      Exit_Program (E_Fatal);
+      Exit_Program (Exit_Status);
    end if;
 
    Initialize_ALI;
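The exit-status rules listed in the message above can be sketched as a small decision function. This is only an illustration in C, not the gnatls code (which is Ada): the enum, the function name and its parameters are hypothetical, and the condition that leads to the Try_Help path is an assumption, since the hunk does not show the enclosing test.

```c
#include <stdbool.h>

/* Hypothetical mirror of the exit codes named in the message.  */
enum exit_code { E_SUCCESS, E_WARNINGS, E_FATAL };

/* Exit status for a run that was given no library files to process.
   The status starts at success and is only lowered when needed.  */
enum exit_code
no_files_exit_status (bool verbose, bool help_requested,
                      int argument_count, bool runtime_available)
{
  enum exit_code status = E_SUCCESS;

  /* gnatls -v with no usable default runtime: warn, do not fail.  */
  if (verbose && !runtime_available)
    status = E_WARNINGS;

  /* Assumed Try_Help path: arguments were given, but neither -v nor
     -h was among them.  */
  if (!verbose && !help_requested && argument_count > 0)
    status = E_FATAL;

  return status;
}
```

The point of the design is that the final Exit_Program call no longer hard-codes E_Fatal; it reports whatever status the earlier branches accumulated.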
Re: [PATCH,1/2] Extended if-conversion for loops marked with pragma omp simd.
Richard, Did you have a chance to look at it? Thanks. Yuri. 2014-10-24 14:21 GMT+04:00 Yuri Rumyantsev ysrum...@gmail.com: Richard, Patch containing new core related to extended predication is attached. Here is few comments which explain a main goal of design. 1. I don't want to insert any critical edge splitting since it may lead to less efficient binaries (I remember some performance issue when we designed lazy code motion algorithm in SPARC compiler). 2. One special case of extended PHI node predication was introduced when #arguments is more than 2 but only two arguments are different and one argument has the only occurrence. For such PHI conditional scalar reduction is applied. This is correspondent to the following: if (q1 q2 q3) var++ New function phi_has_two_different_args was introduced to detect such phi. 3. Original algorithm for PHI predication used assumption that at least one incoming edge for blocks containing PHI is not critical - it guarantees that all computations related to predicate of normal edge are already inserted above this block and core related to PHI predication can be inserted at the beginning of block. But this is not true for critical edges for which predicate computations are in the block where code for phi predication must be inserted. So new function find_insertion_point is introduced which is simply found out the last statement in block defining predicates correspondent to all incoming edges and insert phi predication code after it (with some minor exceptions). If you need more comments or something unclear will let me know. Thanks. Yuri. ChangeLog: 2014-10-24 Yuri Rumyantsev ysrum...@gmail.com * tree-if-conv.c (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE instead of loop flag. (if_convertible_bb_p): Allow bb has more than 2 predecessors if FLAG_FORCE_VECTORIZE is true. (if_convertible_bb_p): Delete check that bb has at least one non-critical incoming edge. (phi_has_two_different_args): New function. 
(is_cond_scalar_reduction): Add argument EXTENDED to choose access to phi arguments. Invoke phi_has_two_different_args to get phi arguments if EXTENDED is true. Change check that block containing reduction statement candidate is predecessor of phi-block since phi may have more than two arguments. (convert_scalar_cond_reduction): Add argument BEFORE to insert statement before/after gsi point. (predicate_scalar_phi): Add argument false (which means non-extended predication) to call of is_cond_scalar_reduction. Add argument true (which correspondent to argument BEFORE) to call of convert_scalar_cond_reduction. (get_predicate_for_edge): New function. (predicate_arbitrary_scalar_phi): New function. (predicate_extended_scalar_phi): New function. (find_insertion_point): New function. (predicate_all_scalar_phis): Add two boolean variables EXTENDED and BEFORE. Initialize EXTENDED to true if BB containing phi has more than 2 predecessors or both incoming edges are critical. Invoke find_phi_replacement_condition and predicate_scalar_phi or find_insertion_point and predicate_extended_scalar_phi depending on EXTENDED value. (insert_gimplified_predicates): Add check that non-predicated block may have statements to insert. Insert predicate of BB just after label if FLAG_FORCE_VECTORIZE is true. (tree_if_conversion): Add initialization of FLAG_FORCE_VECTORIZE which is copy of inner or outer loop field force_vectorize. 2014-10-24 13:12 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Tue, Oct 21, 2014 at 4:34 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Richard, In my initial design I did such splitting but before start real if-conversion but I decided to not perform it since code size for if-converted loop is growing (number of phi nodes is increased). It is worth noting also that for phi with #nodes 2 we need to get all predicates (except one) to do phi-predication and it means that block containing such phi can have only 1 critical edge. 
Can you point me to the patch with the special insertion code then? I definitely want to avoid the mess we ran into with the reassoc code clever insertion code. Richard. Thanks. Yuri. 2014-10-21 18:19 GMT+04:00 Richard Biener richard.guent...@gmail.com: On Tue, Oct 21, 2014 at 4:09 PM, Richard Biener richard.guent...@gmail.com wrote: On Tue, Oct 21, 2014 at 3:58 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Richard, I saw the sources of these functions, but I can't understand why I should use something else? Note that all predicate computations are located in basic blocks ( by design of if-conv) and there is special function that put these computations in bb (insert_gimplified_predicates). Edge contains only predicate not its computations. New function - find_insertion_point() does very simple search - it finds out the latest (in current bb) operand def-stmt of predicates taken from all incoming edges. In original
Re: [Patch AArch64] Fix PR 63724 - Improve immediate generation
On 07/11/14 13:36, Richard Henderson wrote:
> On 11/07/2014 01:02 PM, Ramana Radhakrishnan wrote:
>> +  *cost = COSTS_N_INSNS (aarch64_internal_mov_immediate
>> +                         (gen_rtx_REG (mode, 0), x, false));
>>  }
>
> Can't you pass NULL for the register when generate is false?

True, that would work as well. Should have thought of that.

regards
Ramana

> r~
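The suggestion in this exchange (a destination that is only needed when code is actually generated) can be sketched as follows. This is a deliberately naive stand-in, not aarch64_internal_mov_immediate: the function name, its signature and the MOVZ/MOVK-per-16-bit-chunk strategy are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Count, and optionally print, the instructions needed to build IMM
   from 16-bit chunks: one MOVZ followed by MOVKs.  DEST names the
   target register and is only read when GENERATE is true, so a caller
   that only wants a cost may pass NULL.  */
int
mov_immediate_insns (const char *dest, uint64_t imm, bool generate)
{
  int count = 0;

  for (int shift = 0; shift < 64; shift += 16)
    {
      unsigned chunk = (unsigned) ((imm >> shift) & 0xffff);

      if (chunk == 0)
        continue;
      if (generate)
        printf ("\t%s\t%s, #0x%x, lsl #%d\n",
                count == 0 ? "movz" : "movk", dest, chunk, shift);
      count++;
    }

  /* An all-zero immediate still needs one instruction.  */
  if (count == 0)
    {
      if (generate)
        printf ("\tmovz\t%s, #0\n", dest);
      count = 1;
    }

  return count;
}
```

A cost query then becomes mov_immediate_insns (NULL, value, false), with no dummy register to construct.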
Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming
Hello Richard, On 05 Nov 13:50, Jakub Jelinek wrote: On Wed, Nov 05, 2014 at 03:46:55PM +0300, Ilya Verbin wrote: + node->register_symbol (); LGTM. Are you ok with the patch? Jakub -- Thanks, K
Re: [PATCH] AArch64: Add TARGET_SCHED_REASSOCIATION_WIDTH
On 29/10/14 12:55, Wilco Dijkstra wrote:

This patch adds the TARGET_SCHED_REASSOCIATION_WIDTH hook. Separate settings
for integer, floating point and vector modes are supported via the CPU tuning
parameters. Setting the FP reassociation width to 4 improves FP performance
on SPEC2000 by ~1.3%.

OK for commit?

ChangeLog:

2014-10-29  Wilco Dijkstra  wdijk...@arm.com

    * gcc/config/aarch64/aarch64-protos.h (tune_params): Add
    reassociation tuning parameters.
    * gcc/config/aarch64/aarch64.c (TARGET_SCHED_REASSOCIATION_WIDTH):
    Define.
    (aarch64_reassociation_width): New function.
    (generic_tunings): Add reassociation tuning parameters.
    (cortexa53_tunings): Likewise.
    (cortexa57_tunings): Likewise.
    (thunderx_tunings): Likewise.

If all cores seem to benefit from FP reassociation set to 4, then it seems
odd that 4 is not also the default for generic.

Andrew, you may need to pick a target-specific value for ThunderX; I think
Wilco has just picked something that seems plausible because he needs to put
a real value in there.

What happens if the integer and vector numbers are bumped up? I'd have
thought that integer numbers > 1 would be appropriate on all dual-issue or
greater cores.

R.

--- gcc/config/aarch64/aarch64-protos.h | 3 +++ gcc/config/aarch64/aarch64.c| 34 +++--- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 810644c..9c03f7b 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -170,6 +170,9 @@ struct tune_params const struct cpu_vector_cost *const vec_costs; const int memmov_cost; const int issue_rate; + const int int_reassoc_width; + const int fp_reassoc_width; + const int vec_reassoc_width; }; HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index e6cd5eb..4d67722 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -309,7 +309,10 @@ static const struct tune_params generic_tunings = generic_regmove_cost, generic_vector_cost, NAMED_PARAM (memmov_cost, 4), - NAMED_PARAM (issue_rate, 2) + NAMED_PARAM (issue_rate, 2), + 1, /* int_reassoc_width. */ + 1, /* fp_reassoc_width. */ + 1 /* vec_reassoc_width. */ }; static const struct tune_params cortexa53_tunings = @@ -319,7 +322,10 @@ static const struct tune_params cortexa53_tunings = cortexa53_regmove_cost, generic_vector_cost, NAMED_PARAM (memmov_cost, 4), - NAMED_PARAM (issue_rate, 2) + NAMED_PARAM (issue_rate, 2), + 1,/* int_reassoc_width. */ + 4,/* fp_reassoc_width. */ + 1 /* vec_reassoc_width. */ }; static const struct tune_params cortexa57_tunings = @@ -329,7 +335,10 @@ static const struct tune_params cortexa57_tunings = cortexa57_regmove_cost, cortexa57_vector_cost, NAMED_PARAM (memmov_cost, 4), - NAMED_PARAM (issue_rate, 3) + NAMED_PARAM (issue_rate, 3), + 1,/* int_reassoc_width. */ + 4,/* fp_reassoc_width. */ + 1 /* vec_reassoc_width. 
*/ }; static const struct tune_params thunderx_tunings = @@ -340,6 +349,9 @@ static const struct tune_params thunderx_tunings = generic_vector_cost, NAMED_PARAM (memmov_cost, 6), NAMED_PARAM (issue_rate, 2) + 1,/* int_reassoc_width. */ + 4,/* fp_reassoc_width. */ + 1 /* vec_reassoc_width. */ }; /* A processor implementing AArch64. */ @@ -429,6 +441,19 @@ static const char * const aarch64_condition_codes[] = hi, ls, ge, lt, gt, le, al, nv }; +static int +aarch64_reassociation_width (unsigned opc ATTRIBUTE_UNUSED, + enum machine_mode mode) +{ + if (VECTOR_MODE_P (mode)) +return aarch64_tune_params-vec_reassoc_width; + if (INTEGRAL_MODE_P (mode)) +return aarch64_tune_params-int_reassoc_width; + if (FLOAT_MODE_P (mode)) +return aarch64_tune_params-fp_reassoc_width; + return 1; +} + /* Provide a mapping from gcc register numbers to dwarf register numbers. */ unsigned aarch64_dbx_register_number (unsigned regno) @@ -10147,6 +10172,9 @@ aarch64_asan_shadow_offset (void) #undef TARGET_PREFERRED_RELOAD_CLASS #define TARGET_PREFERRED_RELOAD_CLASS aarch64_preferred_reload_class +#undef TARGET_SCHED_REASSOCIATION_WIDTH +#define TARGET_SCHED_REASSOCIATION_WIDTH aarch64_reassociation_width + #undef TARGET_SECONDARY_RELOAD #define TARGET_SECONDARY_RELOAD aarch64_secondary_reload
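The per-mode dispatch done by the new hook can be sketched outside of GCC as below. The mode-class enum, the struct and the function names are hypothetical stand-ins for GCC's machine modes and tune_params; only the widths (1, 4, 1) are taken from the cortexa57_tunings hunk in the patch.

```c
/* Hypothetical stand-ins for GCC's machine-mode classes.  */
enum mode_class
{
  MODE_CLASS_INT,
  MODE_CLASS_FLOAT,
  MODE_CLASS_VECTOR,
  MODE_CLASS_OTHER
};

/* The three tuning fields the patch adds to struct tune_params.  */
struct reassoc_tuning
{
  int int_reassoc_width;
  int fp_reassoc_width;
  int vec_reassoc_width;
};

/* Widths as given for Cortex-A57 in the patch.  */
static const struct reassoc_tuning cortexa57_like = { 1, 4, 1 };

/* Return how many independent chains an expression of the given class
   may be split into; 1 means "do not reassociate for parallelism".  */
int
reassociation_width (const struct reassoc_tuning *tune, enum mode_class mc)
{
  switch (mc)
    {
    case MODE_CLASS_VECTOR:
      return tune->vec_reassoc_width;
    case MODE_CLASS_INT:
      return tune->int_reassoc_width;
    case MODE_CLASS_FLOAT:
      return tune->fp_reassoc_width;
    default:
      return 1;
    }
}
```

With a width of 4, a chain such as a + b + c + d + e + f + g + h may be rebuilt as several shorter chains that a multi-issue core can evaluate in parallel, which is where the reported FP gain would come from.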
Re: [ARM] Fix DWARF unwinding breakage
On 17/10/14 09:21, Eric Botcazou wrote: Hi, some OSes, for example VxWorks 6, still use DWARF unwinding on the ARM, which means that they use __builtin_eh_return (EABI unwinding doesn't). The builtin is implemented by means of {arm|thumb}_set_return_address, which can generate a store if LR has been stored on function entry. The problem is that, if this store is FP-based, it is not seen by the RTL DSE pass as being consumed by the SP-based load at the same address on function exit. That's by design in the RTL DSE pass: FP and SP are never substituted for each other by cselib, see for example this comment: /* The only thing that we are not willing to do (this is requirement of dse and if others potential uses need this function we should add a parm to control it) is that we will not substitute the STACK_POINTER_REGNUM, FRAME_POINTER or the HARD_FRAME_POINTER. These expansions confuses the code that notices that stores into the frame go dead at the end of the function and that the frame is not effected by calls to subroutines. If you allow the STACK_POINTER_REGNUM substitution, then dse will think that parameter pushing also goes dead which is wrong. If you allow the FRAME_POINTER or the HARD_FRAME_POINTER then you lose the opportunity to make the frame assumptions. */ if (regno == STACK_POINTER_REGNUM || regno == FRAME_POINTER_REGNUM || regno == HARD_FRAME_POINTER_REGNUM || regno == cfa_base_preserved_regno) return orig; so a FP-based store and a SP-based load are never seen as a RAW dependency. This nevertheless used to work because the blockage insn emitted by the RTL epilogue was acting as a wild load but this got broken by Richard's patch: 2014-03-11 Richard Sandiford rdsandif...@googlemail.com * builtins.c (expand_builtin_setjmp_receiver): Use and clobber hard_frame_pointer_rtx. * cse.c (cse_insn): Remove volatile check. * cselib.c (cselib_process_insn): Likewise. * dse.c (scan_insn): Likewise. which removed the wild load trick. 
This is visible at -O2 for: void foo (void *c1, void *t1, void *ra) { long offset = uw_install_context_1 (c1, t1); void *handler = __builtin_frob_return_addr (ra); __builtin_unwind_init (); __builtin_eh_return (offset, handler); } The attached patch fixes the breakage by marking the stores as frame related. Tested on ARM/VxWorks, OK for mainline and 4.9 branch? 2014-10-17 Eric Botcazou ebotca...@adacore.com * config/arm/arm.c (arm_set_return_address): Mark the store as frame related, if any. (thumb_set_return_address): Likewise. OK. R. arm_eh_return-2.diff Index: config/arm/arm.c === --- config/arm/arm.c (revision 216252) +++ config/arm/arm.c (working copy) @@ -28952,7 +28952,11 @@ arm_set_return_address (rtx source, rtx addr = plus_constant (Pmode, addr, delta); } - emit_move_insn (gen_frame_mem (Pmode, addr), source); + /* The store needs to be marked as frame related in order to prevent + DSE from deleting it as dead if it is based on fp. */ + rtx insn = emit_move_insn (gen_frame_mem (Pmode, addr), source); + RTX_FRAME_RELATED_P (insn) = 1; + add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (Pmode, LR_REGNUM)); } } @@ -29004,7 +29008,11 @@ thumb_set_return_address (rtx source, rt else addr = plus_constant (Pmode, addr, delta); - emit_move_insn (gen_frame_mem (Pmode, addr), source); + /* The store needs to be marked as frame related in order to prevent + DSE from deleting it as dead if it is based on fp. */ + rtx insn = emit_move_insn (gen_frame_mem (Pmode, addr), source); + RTX_FRAME_RELATED_P (insn) = 1; + add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (Pmode, LR_REGNUM)); } else emit_move_insn (gen_rtx_REG (Pmode, LR_REGNUM), source);
Re: [PATCH/AARCH64] Move the rest of the cost tables to aarch64-cost-tables.h
On 21/10/14 22:37, Andrew Pinski wrote: Hi, To make aarch64.c a little smaller and a little easier to understand, I have moved the rest of the cost tables (cpu_addrcost_table, cpu_regmove_cost, cpu_vector_cost) to aarch64-cost-tables. I also fixed up the inconstancy in the use of __extension__ on some of the structures and not all of them. I used a define to allow it easier instead of having to have #if HAVE_DESIGNATED_INITIALIZERS GCC_VERSION = 2007 each time. OK? Build and tested on aarch64-elf with no regressions. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64-cost-tables.h (NAMED_PARAM): New define. (NEED_EXTENSION): New define. (generic_addrcost_table): Moved from aarch64.c. (cortexa57_addrcost_table): Likewise. (generic_regmove_cost): Likewise. (cortexa57_regmove_cost): Likewise. (cortexa53_regmove_cost): Likewise. (thunderx_regmove_cost): Likewise. (generic_vector_cost): Likewise. (cortexa57_vector_cost): Likewise. * config/aarch64/aarch64.c (NAMED_PARAM): Delete, moved to aarch64-cost-tables.h. (generic_addrcost_table): Likewise. (cortexa57_addrcost_table): Likewise. (generic_regmove_cost): Likewise. (cortexa57_regmove_cost): Likewise. (cortexa53_regmove_cost): Likewise. (thunderx_regmove_cost): Likewise. (generic_vector_cost): Likewise. (cortexa57_vector_cost): Likewise. (generic_tunings): Use NEED_EXTENSION. (cortexa53_tunings): Likewise. (cortexa57_tunings): Likewise. (thunderx_tunings): Likewise. I don't particularly like the idea of having real data and code in header files. Can't this be moved into aarch64-cost-tables.c? R. 
movetablestoaarch64-cost.diff.txt Index: config/aarch64/aarch64-cost-tables.h === --- config/aarch64/aarch64-cost-tables.h (revision 216524) +++ config/aarch64/aarch64-cost-tables.h (working copy) @@ -125,7 +125,135 @@ const struct cpu_cost_table thunderx_ext } }; +#if HAVE_DESIGNATED_INITIALIZERS +#define NAMED_PARAM(NAME, VAL) .NAME = (VAL) +#else +#define NAMED_PARAM(NAME, VAL) (VAL) +#endif + +#if HAVE_DESIGNATED_INITIALIZERS GCC_VERSION = 2007 +#define NEED_EXTENSION __extension__ +#else +#define NEED_EXTENSION +#endif + + +/* The address cost models. */ +NEED_EXTENSION +static const struct cpu_addrcost_table generic_addrcost_table = +{ +#if HAVE_DESIGNATED_INITIALIZERS + .addr_scale_costs = +#endif +{ + NAMED_PARAM (hi, 0), + NAMED_PARAM (si, 0), + NAMED_PARAM (di, 0), + NAMED_PARAM (ti, 0), +}, + NAMED_PARAM (pre_modify, 0), + NAMED_PARAM (post_modify, 0), + NAMED_PARAM (register_offset, 0), + NAMED_PARAM (register_extend, 0), + NAMED_PARAM (imm_offset, 0) +}; + +NEED_EXTENSION +static const struct cpu_addrcost_table cortexa57_addrcost_table = +{ +#if HAVE_DESIGNATED_INITIALIZERS + .addr_scale_costs = +#endif +{ + NAMED_PARAM (hi, 1), + NAMED_PARAM (si, 0), + NAMED_PARAM (di, 0), + NAMED_PARAM (ti, 1), +}, + NAMED_PARAM (pre_modify, 0), + NAMED_PARAM (post_modify, 0), + NAMED_PARAM (register_offset, 0), + NAMED_PARAM (register_extend, 0), + NAMED_PARAM (imm_offset, 0), +}; + + +/* Register to Register move costs */ +NEED_EXTENSION +static const struct cpu_regmove_cost generic_regmove_cost = +{ + NAMED_PARAM (GP2GP, 1), + NAMED_PARAM (GP2FP, 2), + NAMED_PARAM (FP2GP, 2), + NAMED_PARAM (FP2FP, 2) +}; + +NEED_EXTENSION +static const struct cpu_regmove_cost cortexa57_regmove_cost = +{ + NAMED_PARAM (GP2GP, 1), + /* Avoid the use of slow int-fp moves for spilling by setting + their cost higher than memmov_cost. 
*/ + NAMED_PARAM (GP2FP, 5), + NAMED_PARAM (FP2GP, 5), + NAMED_PARAM (FP2FP, 2) +}; + +NEED_EXTENSION +static const struct cpu_regmove_cost cortexa53_regmove_cost = +{ + NAMED_PARAM (GP2GP, 1), + /* Avoid the use of slow int-fp moves for spilling by setting + their cost higher than memmov_cost. */ + NAMED_PARAM (GP2FP, 5), + NAMED_PARAM (FP2GP, 5), + NAMED_PARAM (FP2FP, 2) +}; + +NEED_EXTENSION +static const struct cpu_regmove_cost thunderx_regmove_cost = +{ + NAMED_PARAM (GP2GP, 2), + NAMED_PARAM (GP2FP, 2), + NAMED_PARAM (FP2GP, 6), + NAMED_PARAM (FP2FP, 4) +}; + +/* Vector instruction cost model */ +NEED_EXTENSION +static const struct cpu_vector_cost generic_vector_cost = +{ + NAMED_PARAM (scalar_stmt_cost, 1), + NAMED_PARAM (scalar_load_cost, 1), + NAMED_PARAM (scalar_store_cost, 1), + NAMED_PARAM (vec_stmt_cost, 1), + NAMED_PARAM (vec_to_scalar_cost, 1), + NAMED_PARAM (scalar_to_vec_cost, 1), + NAMED_PARAM (vec_align_load_cost, 1), + NAMED_PARAM (vec_unalign_load_cost, 1), + NAMED_PARAM (vec_unalign_store_cost, 1), + NAMED_PARAM (vec_store_cost, 1), + NAMED_PARAM (cond_taken_branch_cost, 3), + NAMED_PARAM (cond_not_taken_branch_cost, 1) +}; + +NEED_EXTENSION +static const struct
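For readers unfamiliar with the NAMED_PARAM idiom used throughout these tables, here is a minimal standalone sketch. The feature test that defines HAVE_DESIGNATED_INITIALIZERS is an assumption made so the example builds on its own (GCC defines the macro elsewhere), and the struct and variable names are hypothetical; the values are those of generic_regmove_cost in the patch.

```c
/* Assumed feature test so the sketch is self-contained.  */
#ifndef HAVE_DESIGNATED_INITIALIZERS
# if defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define HAVE_DESIGNATED_INITIALIZERS 1
# else
#  define HAVE_DESIGNATED_INITIALIZERS 0
# endif
#endif

/* With designated initializers the field name is checked by the
   compiler; without them the macro degrades to a positional value.  */
#if HAVE_DESIGNATED_INITIALIZERS
#define NAMED_PARAM(NAME, VAL) .NAME = (VAL)
#else
#define NAMED_PARAM(NAME, VAL) (VAL)
#endif

struct regmove_cost
{
  int GP2GP;
  int GP2FP;
  int FP2GP;
  int FP2FP;
};

static const struct regmove_cost example_regmove_cost =
{
  NAMED_PARAM (GP2GP, 1),
  NAMED_PARAM (GP2FP, 2),
  NAMED_PARAM (FP2GP, 2),
  NAMED_PARAM (FP2FP, 2)
};
```

Because the fallback is positional, the entries must be listed in declaration order; the names document intent, but only the designated form lets the compiler verify them.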
Re: [PATCH 5/8] Enable max_issue for AArch32 and AArch64
On 21/10/14 04:31, Maxim Kuvyrkov wrote: Hi Ramana, Hi Marcus, This patch enables max_issue multipass lookahead scheduling for 2nd scheduler pass (or, more pedantically, whenever register-pressure scheduling is not in use). Multipass lookahead scheduling is being enabled for cores that can issue 2 or more instructions per cycle, and it allows scheduler to better exploit multi-issue pipelines. This patch also provides foundation for [upcoming] auto-prefetcher model in the scheduler, which is handled via max_issue. This change requires benchmarking, which I can't easily do at the moment. I would appreciate any benchmarking results that you can share. Bootstrap on aarch64-linux-gnu is in progress. OK to apply, provided no performance or correctness regressions? Thank you, OK. R. -- Maxim Kuvyrkov www.linaro.org 0005-Enable-max_issue-for-AArch32-and-AArch64.patch From bf51463edee1d161ff8e03cf0af0c3ff8b258305 Mon Sep 17 00:00:00 2001 From: Maxim Kuvyrkov maxim.kuvyr...@linaro.org Date: Sat, 29 Mar 2014 07:12:52 +1300 Subject: [PATCH 5/8] Enable max_issue for AArch32 and AArch64 * config/aarch64/aarch64.c (aarch64_sched_first_cycle_multipass_dfa_lookahead): Implement hook. (TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define. * config/arm/arm.c (arm_first_cycle_multipass_dfa_lookahead): Implement hook. (TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Define. --- gcc/config/aarch64/aarch64.c | 12 gcc/config/arm/arm.c | 15 +++ 2 files changed, 27 insertions(+) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 2ad5c28..1512418 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -6077,6 +6077,14 @@ aarch64_sched_issue_rate (void) return aarch64_tune_params-issue_rate; } +static int +aarch64_sched_first_cycle_multipass_dfa_lookahead (void) +{ + int issue_rate = aarch64_sched_issue_rate (); + + return issue_rate 1 ? issue_rate : 0; +} + /* Vectorizer cost model target hooks. 
*/ /* Implement targetm.vectorize.builtin_vectorization_cost. */ @@ -10136,6 +10144,10 @@ aarch64_asan_shadow_offset (void) #undef TARGET_SCHED_ISSUE_RATE #define TARGET_SCHED_ISSUE_RATE aarch64_sched_issue_rate +#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD +#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \ + aarch64_sched_first_cycle_multipass_dfa_lookahead + #undef TARGET_TRAMPOLINE_INIT #define TARGET_TRAMPOLINE_INIT aarch64_trampoline_init diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 1ee0eb3..0f15c99 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -246,6 +246,7 @@ static void arm_option_override (void); static unsigned HOST_WIDE_INT arm_shift_truncation_mask (enum machine_mode); static bool arm_cannot_copy_insn_p (rtx_insn *); static int arm_issue_rate (void); +static int arm_first_cycle_multipass_dfa_lookahead (void); static void arm_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED; static bool arm_output_addr_const_extra (FILE *, rtx); static bool arm_allocate_stack_slots_for_args (void); @@ -591,6 +592,10 @@ static const struct attribute_spec arm_attribute_table[] = #undef TARGET_SCHED_ISSUE_RATE #define TARGET_SCHED_ISSUE_RATE arm_issue_rate +#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD +#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \ + arm_first_cycle_multipass_dfa_lookahead + #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE arm_mangle_type @@ -29888,6 +29893,16 @@ arm_issue_rate (void) } } +/* Return how many instructions should scheduler lookahead to choose the + best one. */ +static int +arm_first_cycle_multipass_dfa_lookahead (void) +{ + int issue_rate = arm_issue_rate (); + + return issue_rate 1 ? issue_rate : 0; +} + /* A table and a function to perform ARM-specific name mangling for NEON vector types in order to conform to the AAPCS (see Procedure Call Standard for the ARM Architecture, Appendix A). To qualify
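Both new hooks reduce to the same rule, sketched below as a standalone function. The comparison operator is not legible in the archived diff ("issue_rate 1 ? issue_rate : 0"); the sketch assumes it was ">", which matches the description that lookahead is enabled for cores that can issue 2 or more instructions per cycle. The function name is hypothetical.

```c
/* Lookahead depth for multipass scheduling: single-issue cores get 0
   (lookahead disabled), multi-issue cores look ahead by their issue
   rate.  */
int
multipass_dfa_lookahead (int issue_rate)
{
  return issue_rate > 1 ? issue_rate : 0;
}
```

For the issue rates that appear in the tuning tables elsewhere in this digest (2 and 3), the lookahead is simply 2 and 3 respectively.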
[gimple-classes, committed 0/6] Use gassign in 6 tree-ssa-* files
I've pushed the following 6 patches to the git branch dmalcolm/gimple-classes. This is part of ongoing work on the branch to make all gimple_assign_* accessors take a gassign *, rather than a gimple. Successfully bootstrappedregrtested the combination of the 6 patches upon the branch on x86_64-unknown-linux-gnu (Fedora 20) - same results relative to an unpatched control bootstrap of trunk's r216746. David Malcolm (6): tree-ssa-sink.c: Use gassign tree-ssa-strlen.c: Use gassign tree-ssa-structalias.c: Use gassign tree-ssa-tail-merge.c: Use gassign tree-ssa-ter.c: Use gassign tree-ssa-threadedge.c: Use gassign gcc/ChangeLog.gimple-classes | 56 gcc/tree-ssa-sink.c | 9 +++ gcc/tree-ssa-strlen.c| 49 -- gcc/tree-ssa-structalias.c | 35 ++- gcc/tree-ssa-tail-merge.c| 8 --- gcc/tree-ssa-ter.c | 16 - gcc/tree-ssa-threadedge.c| 33 ++ 7 files changed, 139 insertions(+), 67 deletions(-) -- 1.7.11.7
[gimple-classes, committed 3/6] tree-ssa-structalias.c: Use gassign
gcc/ChangeLog.gimple-classes: * tree-ssa-structalias.c (find_func_aliases): Replace is_gimple_assign with a dyn_cast, introducing local gassign * t_assign, using it in place of t for typesafety. (find_func_clobbers): Add checked cast. --- gcc/ChangeLog.gimple-classes | 7 +++ gcc/tree-ssa-structalias.c | 35 +++ 2 files changed, 26 insertions(+), 16 deletions(-) diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes index a0e7c48..f43df63 100644 --- a/gcc/ChangeLog.gimple-classes +++ b/gcc/ChangeLog.gimple-classes @@ -1,5 +1,12 @@ 2014-11-06 David Malcolm dmalc...@redhat.com + * tree-ssa-structalias.c (find_func_aliases): Replace + is_gimple_assign with a dyn_cast, introducing local gassign * + t_assign, using it in place of t for typesafety. + (find_func_clobbers): Add checked cast. + +2014-11-06 David Malcolm dmalc...@redhat.com + * tree-ssa-strlen.c (find_equal_ptrs): Replace is_gimple_assign with a dyn_cast, strengthening local stmt from gimple to gassign *. diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c index e1f0e66..5d22752 100644 --- a/gcc/tree-ssa-structalias.c +++ b/gcc/tree-ssa-structalias.c @@ -4661,11 +4661,13 @@ find_func_aliases (struct function *fn, gimple origt) /* Otherwise, just a regular assignment statement. Only care about operations with pointer result, others are dealt with as escape points if they have pointer operands. */ - else if (is_gimple_assign (t)) + else if (gassign *t_assign = dyn_cast gassign * (t)) { /* Otherwise, just a regular assignment statement. */ - tree lhsop = gimple_assign_lhs (t); - tree rhsop = (gimple_num_ops (t) == 2) ? gimple_assign_rhs1 (t) : NULL; + tree lhsop = gimple_assign_lhs (t_assign); + tree rhsop = ((gimple_num_ops (t_assign) == 2) + ? 
gimple_assign_rhs1 (t_assign) + : NULL); if (rhsop TREE_CLOBBER_P (rhsop)) /* Ignore clobbers, they don't actually store anything into @@ -4675,7 +4677,7 @@ find_func_aliases (struct function *fn, gimple origt) do_structure_copy (lhsop, rhsop); else { - enum tree_code code = gimple_assign_rhs_code (t); + enum tree_code code = gimple_assign_rhs_code (t_assign); get_constraint_for (lhsop, lhsc); @@ -4684,20 +4686,21 @@ find_func_aliases (struct function *fn, gimple origt) assume the value is not produced to transfer a pointer. */ ; else if (code == POINTER_PLUS_EXPR) - get_constraint_for_ptr_offset (gimple_assign_rhs1 (t), - gimple_assign_rhs2 (t), rhsc); + get_constraint_for_ptr_offset (gimple_assign_rhs1 (t_assign), + gimple_assign_rhs2 (t_assign), + rhsc); else if (code == BIT_AND_EXPR - TREE_CODE (gimple_assign_rhs2 (t)) == INTEGER_CST) + TREE_CODE (gimple_assign_rhs2 (t_assign)) == INTEGER_CST) { /* Aligning a pointer via a BIT_AND_EXPR is offsetting the pointer. Handle it by offsetting it by UNKNOWN. 
*/ - get_constraint_for_ptr_offset (gimple_assign_rhs1 (t), + get_constraint_for_ptr_offset (gimple_assign_rhs1 (t_assign), NULL_TREE, rhsc); } else if ((CONVERT_EXPR_CODE_P (code) -!(POINTER_TYPE_P (gimple_expr_type (t)) +!(POINTER_TYPE_P (gimple_expr_type (t_assign)) !POINTER_TYPE_P (TREE_TYPE (rhsop - || gimple_assign_single_p (t)) + || gimple_assign_single_p (t_assign)) get_constraint_for_rhs (rhsop, rhsc); else if (code == COND_EXPR) { @@ -4705,8 +4708,8 @@ find_func_aliases (struct function *fn, gimple origt) auto_vecce_s, 2 tmp; struct constraint_expr *rhsp; unsigned i; - get_constraint_for_rhs (gimple_assign_rhs2 (t), rhsc); - get_constraint_for_rhs (gimple_assign_rhs3 (t), tmp); + get_constraint_for_rhs (gimple_assign_rhs2 (t_assign), rhsc); + get_constraint_for_rhs (gimple_assign_rhs3 (t_assign), tmp); FOR_EACH_VEC_ELT (tmp, i, rhsp) rhsc.safe_push (*rhsp); } @@ -4720,10 +4723,10 @@ find_func_aliases (struct function *fn, gimple origt) auto_vecce_s, 4 tmp; struct constraint_expr *rhsp; unsigned i, j; - get_constraint_for_rhs (gimple_assign_rhs1 (t), rhsc); - for (i = 2; i gimple_num_ops (t); ++i) + get_constraint_for_rhs (gimple_assign_rhs1 (t_assign), rhsc); + for (i = 2; i
[gimple-classes, committed 1/6] tree-ssa-sink.c: Use gassign
gcc/ChangeLog.gimple-classes:
        * tree-ssa-sink.c (statement_sink_location): Rename param stmt
        to gs, reintroducing stmt as a local gassign * via a dyn_cast
        for typesafety.
---
 gcc/ChangeLog.gimple-classes | 6 ++++++
 gcc/tree-ssa-sink.c          | 9 +++++----
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
index 382bd3d..2c78ce0 100644
--- a/gcc/ChangeLog.gimple-classes
+++ b/gcc/ChangeLog.gimple-classes
@@ -1,5 +1,11 @@
 2014-11-06  David Malcolm  <dmalc...@redhat.com>
 
+        * tree-ssa-sink.c (statement_sink_location): Rename param stmt
+        to gs, reintroducing stmt as a local gassign * via a dyn_cast
+        for typesafety.
+
+2014-11-06  David Malcolm  <dmalc...@redhat.com>
+
         * tree-ssa-propagate.c
         (substitute_and_fold_dom_walker::before_dom_children): Add checked
         cast.  Replace is_gimple_assign with a dyn_cast, introducing local
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index c6d8712..968ab27 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -256,13 +256,13 @@ select_best_block (basic_block early_bb,
   return early_bb;
 }
 
-/* Given a statement (STMT) and the basic block it is currently in (FROMBB),
+/* Given a statement (GS) and the basic block it is currently in (FROMBB),
    determine the location to sink the statement to, if any.
    Returns true if there is such location; in that case, TOGSI points to the
-   statement before that STMT should be moved.  */
+   statement before that GS should be moved.  */
 
 static bool
-statement_sink_location (gimple stmt, basic_block frombb,
+statement_sink_location (gimple gs, basic_block frombb,
                          gimple_stmt_iterator *togsi)
 {
   gimple use;
@@ -274,7 +274,8 @@ statement_sink_location (gimple stmt, basic_block frombb,
   imm_use_iterator imm_iter;
 
   /* We only can sink assignments.  */
-  if (!is_gimple_assign (stmt))
+  gassign *stmt = dyn_cast <gassign *> (gs);
+  if (!stmt)
     return false;
 
   /* We only can sink stmts with a single definition.  */
--
1.7.11.7
[gimple-classes, committed 2/6] tree-ssa-strlen.c: Use gassign
gcc/ChangeLog.gimple-classes: * tree-ssa-strlen.c (find_equal_ptrs): Replace is_gimple_assign with a dyn_cast, strengthening local stmt from gimple to gassign *. (adjust_last_stmt): Likewise, introducing local last_assign and using it in place of last.stmt, and strengthening local def_stmt from gimple to gassign *. (handle_builtin_memcpy): Replace is_gimple_assign with a dyn_cast, strengthening local def_stmt from gimple to gassign *. (handle_pointer_plus): Strengthen local stmt from gimple to gassign *. Add checked cast. (handle_char_store): Strengthen local stmt from gimple to gassign *. (strlen_optimize_stmt): Replace is_gimple_assign with dyn_cast, introducing local gassign * assign_stmt, using it in place of stmt for typesafety. --- gcc/ChangeLog.gimple-classes | 18 gcc/tree-ssa-strlen.c| 49 +++- 2 files changed, 44 insertions(+), 23 deletions(-) diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes index 2c78ce0..a0e7c48 100644 --- a/gcc/ChangeLog.gimple-classes +++ b/gcc/ChangeLog.gimple-classes @@ -1,5 +1,23 @@ 2014-11-06 David Malcolm dmalc...@redhat.com + * tree-ssa-strlen.c (find_equal_ptrs): Replace is_gimple_assign + with a dyn_cast, strengthening local stmt from gimple to + gassign *. + (adjust_last_stmt): Likewise, introducing local last_assign and + using it in place of last.stmt, and strengthening local def_stmt + from gimple to gassign *. + (handle_builtin_memcpy): Replace is_gimple_assign with a dyn_cast, + strengthening local def_stmt from gimple to gassign *. + (handle_pointer_plus): Strengthen local stmt from gimple to + gassign *. Add checked cast. + (handle_char_store): Strengthen local stmt from gimple to + gassign *. + (strlen_optimize_stmt): Replace is_gimple_assign with dyn_cast, + introducing local gassign * assign_stmt, using it in place of + stmt for typesafety. 
+ +2014-11-06 David Malcolm dmalc...@redhat.com + * tree-ssa-sink.c (statement_sink_location): Rename param stmt to gs, reintroducing stmt as a local gassign * via a dyn_cast for typesafety. diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index 43f866f..e4e5099 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -727,8 +727,8 @@ find_equal_ptrs (tree ptr, int idx) return; while (1) { - gimple stmt = SSA_NAME_DEF_STMT (ptr); - if (!is_gimple_assign (stmt)) + gassign *stmt = dyn_cast gassign * (SSA_NAME_DEF_STMT (ptr)); + if (!stmt) return; ptr = gimple_assign_rhs1 (stmt); switch (gimple_assign_rhs_code (stmt)) @@ -824,17 +824,17 @@ adjust_last_stmt (strinfo si, gimple stmt, bool is_strcat) return; } - if (is_gimple_assign (last.stmt)) + if (gassign *last_assign = dyn_cast gassign * (last.stmt)) { gimple_stmt_iterator gsi; - if (!integer_zerop (gimple_assign_rhs1 (last.stmt))) + if (!integer_zerop (gimple_assign_rhs1 (last_assign))) return; - if (stmt_could_throw_p (last.stmt)) + if (stmt_could_throw_p (last_assign)) return; - gsi = gsi_for_stmt (last.stmt); - unlink_stmt_vdef (last.stmt); - release_defs (last.stmt); + gsi = gsi_for_stmt (last_assign); + unlink_stmt_vdef (last_assign); + release_defs (last_assign); gsi_remove (gsi, true); return; } @@ -866,8 +866,8 @@ adjust_last_stmt (strinfo si, gimple stmt, bool is_strcat) } else if (TREE_CODE (len) == SSA_NAME) { - gimple def_stmt = SSA_NAME_DEF_STMT (len); - if (!is_gimple_assign (def_stmt) + gassign *def_stmt = dyn_cast gassign * (SSA_NAME_DEF_STMT (len)); + if (!def_stmt || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR || gimple_assign_rhs1 (def_stmt) != last.len || !integer_onep (gimple_assign_rhs2 (def_stmt))) @@ -1322,7 +1322,7 @@ handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) if (idx 0) { - gimple def_stmt; + gassign *def_stmt; /* Handle memcpy (x, y, l) where l is strlen (y) + 1. 
*/ si = get_strinfo (idx); @@ -1330,8 +1330,8 @@ handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) return; if (TREE_CODE (len) != SSA_NAME) return; - def_stmt = SSA_NAME_DEF_STMT (len); - if (!is_gimple_assign (def_stmt) + def_stmt = dyn_cast gassign * (SSA_NAME_DEF_STMT (len)); + if (!def_stmt || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR || gimple_assign_rhs1 (def_stmt) != si-length || !integer_onep (gimple_assign_rhs2 (def_stmt))) @@ -1695,7 +1695,7 @@ handle_builtin_memset (gimple_stmt_iterator *gsi) static void
[gimple-classes, committed 5/6] tree-ssa-ter.c: Use gassign
gcc/ChangeLog.gimple-classes:
        * tree-ssa-ter.c (find_replaceable_in_bb): Replace
        is_gimple_assign with a dyn_cast, introducing local def_assign
        and using it in place of def_stmt for typesafety.  Add a checked
        cast.
---
 gcc/ChangeLog.gimple-classes |  7 +++++++
 gcc/tree-ssa-ter.c           | 16 ++++++++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
index 0bd0421..43c05ec 100644
--- a/gcc/ChangeLog.gimple-classes
+++ b/gcc/ChangeLog.gimple-classes
@@ -1,5 +1,12 @@
 2014-11-06  David Malcolm  <dmalc...@redhat.com>
 
+        * tree-ssa-ter.c (find_replaceable_in_bb): Replace
+        is_gimple_assign with a dyn_cast, introducing local def_assign
+        and using it in place of def_stmt for typesafety.  Add a checked
+        cast.
+
+2014-11-06  David Malcolm  <dmalc...@redhat.com>
+
         * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
         (gimple_equal_p): Add checked casts.
 
diff --git a/gcc/tree-ssa-ter.c b/gcc/tree-ssa-ter.c
index 96b3959..adbc5f9 100644
--- a/gcc/tree-ssa-ter.c
+++ b/gcc/tree-ssa-ter.c
@@ -640,14 +640,18 @@ find_replaceable_in_bb (temp_expr_table_p tab, basic_block bb)
               if (gimple_vdef (stmt))
                 {
                   gimple def_stmt = SSA_NAME_DEF_STMT (use);
-                  while (is_gimple_assign (def_stmt)
-                         && gimple_assign_rhs_code (def_stmt) == SSA_NAME)
-                    def_stmt
-                      = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def_stmt));
+                  while (gassign *def_assign = dyn_cast <gassign *> (def_stmt))
+                    {
+                      if (gimple_assign_rhs_code (def_assign) != SSA_NAME)
+                        break;
+                      def_stmt
+                        = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def_assign));
+                    }
                   if (gimple_vuse (def_stmt)
                       && gimple_assign_single_p (def_stmt)
-                      && stmt_may_clobber_ref_p (stmt,
-                                                 gimple_assign_rhs1 (def_stmt)))
+                      && stmt_may_clobber_ref_p (
+                           stmt,
+                           gimple_assign_rhs1 (as_a <gassign *> (def_stmt))))
                     {
                       /* For calls, it is not a problem if USE is among
                          call's arguments or say OBJ_TYPE_REF argument,
--
1.7.11.7
[gimple-classes, committed 6/6] tree-ssa-threadedge.c: Use gassign
gcc/ChangeLog.gimple-classes: * tree-ssa-threadedge.c (lhs_of_dominating_assert): Capture result of gimple_assign_single_p as new local gassign * use_assign, using it in place of use_stmt for typesafety. (fold_assignment_stmt): Strengthen param stmt from gimple to gassign *. (record_temporary_equivalences_from_stmts_at_dest): Replace check against GIMPLE_ASSIGN with a dyn_cast, introducing local gassign * assign_stmt, using it in place of stmt for typesafety. Later, use it to capture the result of gimple_assign_single_p, and use it in place of stmt for typesafety. Add checked cast. --- gcc/ChangeLog.gimple-classes | 13 + gcc/tree-ssa-threadedge.c| 33 ++--- 2 files changed, 31 insertions(+), 15 deletions(-) diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes index 43c05ec..c85c138 100644 --- a/gcc/ChangeLog.gimple-classes +++ b/gcc/ChangeLog.gimple-classes @@ -1,5 +1,18 @@ 2014-11-06 David Malcolm dmalc...@redhat.com + * tree-ssa-threadedge.c (lhs_of_dominating_assert): Capture result + of gimple_assign_single_p as new local gassign * use_assign, + using it in place of use_stmt for typesafety. + (fold_assignment_stmt): Strengthen param stmt from gimple to + gassign *. + (record_temporary_equivalences_from_stmts_at_dest): Replace check + against GIMPLE_ASSIGN with a dyn_cast, introducing local gassign * + assign_stmt, using it in place of stmt for typesafety. Later, + use it to capture the result of gimple_assign_single_p, and use it + in place of stmt for typesafety. Add checked cast. + +2014-11-06 David Malcolm dmalc...@redhat.com + * tree-ssa-ter.c (find_replaceable_in_bb): Replace is_gimple_assign with a dyn_cast, introducing local def_assign and using it in place of def_stmt for typesafety. 
Add a checked diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index d5b2941..c0b1c82 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -132,14 +132,15 @@ lhs_of_dominating_assert (tree op, basic_block bb, gimple stmt) FOR_EACH_IMM_USE_FAST (use_p, imm_iter, op) { + gassign *use_assign; use_stmt = USE_STMT (use_p); if (use_stmt != stmt - gimple_assign_single_p (use_stmt) - TREE_CODE (gimple_assign_rhs1 (use_stmt)) == ASSERT_EXPR - TREE_OPERAND (gimple_assign_rhs1 (use_stmt), 0) == op - dominated_by_p (CDI_DOMINATORS, bb, gimple_bb (use_stmt))) + (use_assign = gimple_assign_single_p (use_stmt)) + TREE_CODE (gimple_assign_rhs1 (use_assign)) == ASSERT_EXPR + TREE_OPERAND (gimple_assign_rhs1 (use_assign), 0) == op + dominated_by_p (CDI_DOMINATORS, bb, gimple_bb (use_assign))) { - return gimple_assign_lhs (use_stmt); + return gimple_assign_lhs (use_assign); } } return op; @@ -243,7 +244,7 @@ record_temporary_equivalences_from_phis (edge e, vectree *stack) May return NULL_TREE if no simplification is possible. */ static tree -fold_assignment_stmt (gimple stmt) +fold_assignment_stmt (gassign *stmt) { enum tree_code subcode = gimple_assign_rhs_code (stmt); @@ -365,8 +366,9 @@ record_temporary_equivalences_from_stmts_at_dest (edge e, /* If this is not a statement that sets an SSA_NAME to a new value, then do not try to simplify this statement as it will not simplify in any way that is helpful for jump threading. */ - if ((gimple_code (stmt) != GIMPLE_ASSIGN - || TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME) + gassign *assign_stmt = dyn_cast gassign * (stmt); + if ((!assign_stmt + || TREE_CODE (gimple_assign_lhs (assign_stmt)) != SSA_NAME) (gimple_code (stmt) != GIMPLE_CALL || gimple_call_lhs (stmt) == NULL_TREE || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME)) @@ -435,12 +437,13 @@ record_temporary_equivalences_from_stmts_at_dest (edge e, Handle simple copy operations as well as implied copies from ASSERT_EXPRs. 
*/ - if (gimple_assign_single_p (stmt) - TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME) - cached_lhs = gimple_assign_rhs1 (stmt); - else if (gimple_assign_single_p (stmt) -TREE_CODE (gimple_assign_rhs1 (stmt)) == ASSERT_EXPR) - cached_lhs = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0); + assign_stmt = gimple_assign_single_p (stmt); + if (assign_stmt + TREE_CODE (gimple_assign_rhs1 (assign_stmt)) == SSA_NAME) + cached_lhs = gimple_assign_rhs1 (assign_stmt); + else if (assign_stmt +TREE_CODE (gimple_assign_rhs1 (assign_stmt)) == ASSERT_EXPR) + cached_lhs = TREE_OPERAND (gimple_assign_rhs1 (assign_stmt), 0); else { /* A
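The threadedge patch relies on a predicate whose result can be captured as a typed pointer ("Capture result of gimple_assign_single_p as new local gassign * use_assign"). A rough sketch of that shape follows, assuming a toy hierarchy: node, assign_node, single_assign_p and copied_value are hypothetical names for illustration only, and nothing here reproduces the branch's actual definition of gimple_assign_single_p.

```cpp
// Toy hierarchy: hypothetical, for illustration only.
enum node_kind { KIND_ASSIGN, KIND_CALL };

struct node
{
  explicit node (node_kind k) : kind (k) {}
  node_kind kind;
};

struct assign_node : node
{
  assign_node (int ops, int r)
    : node (KIND_ASSIGN), num_rhs_ops (ops), rhs1 (r) {}
  int num_rhs_ops;  // number of right-hand-side operands
  int rhs1;         // first right-hand-side operand (toy value)
};

// Predicate that doubles as a downcast: returns the statement typed as
// an assignment when it is a single-operand assignment, else nullptr.
inline assign_node *
single_assign_p (node *n)
{
  if (!n || n->kind != KIND_ASSIGN)
    return nullptr;
  assign_node *a = static_cast<assign_node *> (n);
  return a->num_rhs_ops == 1 ? a : nullptr;
}

// Capture the predicate's result inside a condition chain, then use the
// typed pointer for every later access in that chain.
inline int
copied_value (node *n, int dflt)
{
  assign_node *a;
  if ((a = single_assign_p (n)) && a->rhs1 > 0)
    return a->rhs1;
  return dflt;
}
```

Because the pointer is non-null only when the predicate holds, the later operands of the condition never need an unchecked cast.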
[gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign
gcc/ChangeLog.gimple-classes:
        * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
        (gimple_equal_p): Add checked casts.
---
 gcc/ChangeLog.gimple-classes | 5 +++++
 gcc/tree-ssa-tail-merge.c    | 8 +++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
index f43df63..0bd0421 100644
--- a/gcc/ChangeLog.gimple-classes
+++ b/gcc/ChangeLog.gimple-classes
@@ -1,5 +1,10 @@
 2014-11-06  David Malcolm  <dmalc...@redhat.com>
 
+        * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
+        (gimple_equal_p): Add checked casts.
+
+2014-11-06  David Malcolm  <dmalc...@redhat.com>
+
         * tree-ssa-structalias.c (find_func_aliases): Replace
         is_gimple_assign with a dyn_cast, introducing local gassign *
         t_assign, using it in place of t for typesafety.
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index 5678657..b822214 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -484,7 +484,7 @@ same_succ_hash (const_same_succ e)
 
       hstate.add_int (gimple_code (stmt));
       if (is_gimple_assign (stmt))
-        hstate.add_int (gimple_assign_rhs_code (stmt));
+        hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
       if (!is_gimple_call (stmt))
         continue;
       if (gimple_call_internal_p (stmt))
@@ -1172,8 +1172,10 @@ gimple_equal_p (same_succ same_succ, gimple s1, gimple s2)
       if (TREE_CODE (lhs1) != SSA_NAME
           && TREE_CODE (lhs2) != SSA_NAME)
         return (operand_equal_p (lhs1, lhs2, 0)
-                && gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
-                                                 gimple_assign_rhs1 (s2)));
+                && gimple_operand_equal_value_p (gimple_assign_rhs1 (
+                                                   as_a <gassign *> (s1)),
+                                                 gimple_assign_rhs1 (
+                                                   as_a <gassign *> (s2))));
       else if (TREE_CODE (lhs1) == SSA_NAME
                && TREE_CODE (lhs2) == SSA_NAME)
         return vn_valueize (lhs1) == vn_valueize (lhs2);
--
1.7.11.7
Re: [PATCH, ifcvt] Allow CC mode if HAVE_cbranchcc4 (fix s390 build)
Richard Henderson wrote:
On 11/06/2014 05:10 PM, Ulrich Weigand wrote:

+ /* For s390, CC REG is general_operand. But cstorecc4 only
+ handles CCZ1, which can not handle others like CCU. */
+ || GET_MODE_CLASS (GET_MODE (XEXP (cond, 0))) == MODE_CC);

I'd like to know more about this. This seems like a mistake in the backend.

We do indeed allow the CC register as general_operand, since it has a register class of CC_REGS.

I rather meant that cstorecc4 only handles some MODE_CC, not that CC is a general_operand, that seems questionable.

Well, it's only for CCZ1mode that we can implement cstorecc4 using a simple IPM / shift sequence. In order to handle generic MODE_CC modes, we'd have to implement a mask check, something like IPM / shift / load immediate / shift / and-immediate. This will usually be slower than the default branch sequence generated by the middle end.

[ On z196 and more recent machines, we already use LOCR to select one of two immediates (in registers), generated via the mov<mode>cc expander; this can handle generic MODE_CC modes. ]

But no matter the rationale, I still don't see what the original problem is; the pattern describes the mode it can support -- shouldn't this be enough for the middle end to know whether it is available, without requiring any further special-case checks?

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU/Linux compilers and toolchain
ulrich.weig...@de.ibm.com
Re: [PATCH x86] Increase PARAM_MAX_COMPLETELY_PEELED_INSNS when branch is costly
So are there any objections to enabling this (PARAM_MAX_COMPLETELY_PEELED_INSNS increase from 100 to 120) for x86?

On Fri, Oct 31, 2014 at 7:52 PM, Evgeny Stupachenko evstu...@gmail.com wrote:

I've measured spec2000 and spec2006 as well, and EEMBC for Silvermont in addition. The change from 100 to 120 gives a gain for Silvermont; the results on Haswell are flat.

On Fri, Oct 31, 2014 at 3:14 PM, Eric Botcazou ebotca...@adacore.com wrote:

Agreed, I think the value of 100 was set a decade ago by Zdenek and me completely artificially. I do not recall any serious tuning of this flag.

Are you talking about PARAM_MAX_COMPLETELY_PEELED_INSNS here? If so, see:

https://gcc.gnu.org/ml/gcc-patches/2012-11/msg01193.html

We have experienced performance regressions because of this arbitrary change and bumped it back to 200 unconditionally.

--
Eric Botcazou
[PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
Trunk revision 216971 introduced LROTATE_EXPR as the canonical representation for a byte swap of a 2-byte value, as per [1]. However, the backend expects bswaphi patterns for such an operation, as this operation is more specific than a rotation. This led to a number of testcases starting to fail, such as gcc.target/arm/builtin-bswap16-1.c and gcc.target/aarch64/builtin-bswap-2.c (these were skipped with my configuration). This patch adds a check in expmed to expand such an LROTATE_EXPR into a bswaphi pattern.

[1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00616.html

Note that this is unrelated to PR63761 (but I have diagnosed the root cause).

ChangeLog entry is as follows:

2014-11-03  Thomas Preud'homme  <thomas.preudho...@arm.com>

        * expmed.c (expand_shift_1): Expand 8 bit rotate of 16 bit value to
        bswaphi if available.

diff --git a/gcc/expmed.c b/gcc/expmed.c
index af14b99..7e86b59 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -2164,6 +2164,14 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted,
           code = left ? LROTATE_EXPR : RROTATE_EXPR;
         }
 
+      if (rotate
+          && CONST_INT_P (op1)
+          && INTVAL (op1) == BITS_PER_UNIT
+          && GET_MODE_SIZE (scalar_mode) == 2
+          && optab_handler (bswap_optab, HImode) != CODE_FOR_nothing)
+        return expand_unop (HImode, bswap_optab, shifted, NULL_RTX,
+                            unsignedp);
+
       if (op1 == const0_rtx)
         return shifted;
 

Is this ok for trunk? With the right configuration I could reproduce the test failure and with this patch both tests fail.

Best regards,

Thomas
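The equivalence the patch relies on, that rotating a 16-bit value by 8 bits in either direction is the same as swapping its two bytes, can be checked exhaustively with a small standalone sketch. The helper names rotl16, rotr16, bswap16 and rotate8_is_bswap_for_all are hypothetical and written for this illustration; they are not GCC internals.

```cpp
#include <cstdint>

// Rotate a 16-bit value left by n bits (n is taken modulo 16).
inline std::uint16_t
rotl16 (std::uint16_t x, unsigned n)
{
  n &= 15;
  if (n == 0)
    return x;
  return static_cast<std::uint16_t> ((x << n) | (x >> (16 - n)));
}

// Rotate a 16-bit value right by n bits (n is taken modulo 16).
inline std::uint16_t
rotr16 (std::uint16_t x, unsigned n)
{
  n &= 15;
  if (n == 0)
    return x;
  return static_cast<std::uint16_t> ((x >> n) | (x << (16 - n)));
}

// Swap the two bytes of a 16-bit value.
inline std::uint16_t
bswap16 (std::uint16_t x)
{
  return static_cast<std::uint16_t> ((x << 8) | (x >> 8));
}

// Check every 16-bit input: a rotate by 8 in either direction must
// agree with the byte swap.
inline bool
rotate8_is_bswap_for_all ()
{
  for (unsigned v = 0; v <= 0xFFFFu; ++v)
    {
      std::uint16_t x = static_cast<std::uint16_t> (v);
      if (rotl16 (x, 8) != bswap16 (x) || rotr16 (x, 8) != bswap16 (x))
        return false;
    }
  return true;
}
```

For instance, 0x1234 becomes 0x3412 under all three helpers, which is why a target with a dedicated 16-bit byte-swap instruction can serve this particular rotate.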
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
On Thu, Nov 6, 2014 at 11:38 PM, Richard Henderson r...@redhat.com wrote: On 11/06/2014 06:45 PM, Ian Taylor wrote: On Thu, Nov 6, 2014 at 5:04 AM, Richard Henderson r...@redhat.com wrote: That said, this *may* not actually be a problem. It's not the direct (possibly lazy bound) call into libffi that needs a static chain, it's the indirect call that libffi produces. And the indirect calls that Go produces. I'm pretty sure that there are no dynamically linked Go calls that require the static chain. They're used for closures, which are either fully indirect from a different translation unit, or locally bound closures through which the optimizer has seen the construction, and optimized to a direct call. Ian, have I missed a case where a closure could wind up with a direct call to a lazy bound function? I think you've covered all the cases. The closure value is only required when calling a nested function. There is no way to refer directly to a nested function defined in a different shared library. The only way you can get such a reference is if some function in that shared library returns it. Sorry, I wasn't clear. I know nested functions must be local. I'm asking about Go closures, supposing we go ahead with the change to make them use the static chain register. I think we're saying the same thing. Closures exist only for nested functions and for functions created by reflect.MakeFunc and friends. Storing a top-level function into a variable will give you something that looks like it has a closure, but the closure will always be empty and it will never be used. The indirect call will set the closure value in the static chain register, but the register will not be used by the function being called. I'm merely pretty sure that calling a closure is either fully indirect or local direct. Yes. Certainly there are cases in the testsuite where -O3 is able to look through the creation of a closure and have a direct call to the function. 
Given that closures are custom created for the data at the creation site, it seems unlikely that the optimizer could look through that and come up with a dynamically bound function. Yes. Ian
Re: [gomp4] Fix libgomp-oacc.c/lib-66.c testcase
Hi Tom!

On Fri, 7 Nov 2014 14:59:45 +0100, Tom de Vries tom_devr...@mentor.com wrote:
On 04-11-14 23:46, Tom de Vries wrote:

This patch fixes the libgomp-oacc.c/lib-66.c testcase. It allows the test to run for non-shared mem accelerators, and skips the test otherwise.

2014-11-03  Tom de Vries  <t...@codesourcery.com>

        * testsuite/libgomp.oacc-c-c++-common/lib-66.c: Skip for shared memory
        accelerators.
        (main): Use acc_device_default instead of acc_device_nvidia.

OK, thanks! One note:

--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/lib-66.c
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-skip-if { *-*-* } { * } { -DACC_MEM_SHARED=0 } } */
 
 #include <string.h>
 #include <stdlib.h>
@@ -12,7 +13,7 @@ main (int argc, char **argv)
   unsigned char *h;
   void *d;
 
-  acc_init (acc_device_nvidia);
+  acc_init (acc_device_default);

It doesn't hurt, but initializing with acc_device_default is the default anyway. But, you can keep this, to make sure that we actually do support this.

   h = (unsigned char *) malloc (N);
@@ -41,7 +42,7 @@ main (int argc, char **argv)
   free (h);
 
-  acc_shutdown (acc_device_nvidia);
+  acc_shutdown (acc_device_default);
 
   return 0;
 }

Grüße,
Thomas
[committed, testcase] add aarch64 to target list to avoid XPASS
Add aarch64 to the target list in gcc.dg/tree-ssa/20040204-1.c. After this fix the XPASS (unexpected pass) has gone away on aarch64.

Committed as obvious, 217228.

gcc/testsuite/
        * gcc.dg/tree-ssa/20040204-1.c: Add aarch64*-*-* to the target list.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c b/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c
index 8518dfb..2793336 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c
@@ -33,5 +33,5 @@ void test55 (int x, int y)
    that the && should be emitted (based on BRANCH_COST).  Fix this
    by teaching dom to look through && and register all components
    as true.  */
-/* { dg-final { scan-tree-dump-times link_error 0 optimized { xfail { ! alpha*-*-* arm*-*-* powerpc*-*-* cris-*-* crisv32-*-* hppa*-*-* i?86-*-* mmix-*-* mips*-*-* m68k*-*-* moxie-*-* nds32*-*-* sparc*-*-* spu-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump-times link_error 0 optimized { xfail { ! alpha*-*-* arm*-*-* aarch64*-*-* powerpc*-*-* cris-*-* crisv32-*-* hppa*-*-* i?86-*-* mmix-*-* mips*-*-* m68k*-*-* moxie-*-* nds32*-*-* sparc*-*-* spu-*-* x86_64-*-* } } } } */
 /* { dg-final { cleanup-tree-dump optimized } } */
Re: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
Applied, thanks. Jason
Re: [gofrontend-dev] [PATCH 4/4] Gccgo port to s390[x] -- part II
On Fri, Nov 7, 2014 at 12:51 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote: On Thu, Nov 06, 2014 at 09:06:18AM -0800, Ian Taylor wrote: On Thu, Nov 6, 2014 at 4:04 AM, Dominik Vogt v...@linux.vnet.ibm.com wrote: On Tue, Nov 04, 2014 at 08:16:51PM -0800, Ian Taylor wrote: The way to do it is not by copying the test. If the test needs to be customized, add additional files that use // +build lines to pick which files is built. Move them into a directory, like method4.go or other tests that use rundir. Currently go-test.exp does not look at the build lines of the files in subdirectories. Before I add that to the gcc testsuite start adding that, is it certain that the golang testsuite will be able to understand that and compile only the requested files? Hmmm, that is a good point. The testsuite doesn't use the go command to build the files in subdirectories, so it won't honor the +build lines. I didn't think of that. Sorry for pointing you in the wrong direction. That's no problem, I can enhance go-test.exp in Gcc. The question is if test cases extended in such a way would run in the master Go repository too. Are the tests there run with the Go tool? I'm sorry, I wasn't clear. The test cases will not work in the master Go repository. When I said the testsuite doesn't use go command I was referring to the master testsuite. Sorry for the confusion. Ian
[committed, testcase] cleanup gnu11 for gcc.target/arm/lp1243022.c
One more gnu11 fix, for gcc.target/arm/lp1243022.c. (This test only runs on arm-none-eabi/-mthumb; I just noticed it while going through the bare-metal tests.)

Committed as obvious, 217230.

gcc/testsuite/
        * gcc.target/arm/lp1243022.c (xhci_test_trb_in_td): Add return type.
        (xhci_check_trb_in_td_math): Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/lp1243022.c b/gcc/testsuite/gcc.target/arm/lp1243022.c
index cb40590..5f26994 100644
--- a/gcc/testsuite/gcc.target/arm/lp1243022.c
+++ b/gcc/testsuite/gcc.target/arm/lp1243022.c
@@ -47,6 +47,7 @@ dma_addr_t xhci_trb_virt_to_dma (struct xhci_segment * seg,
                                   union xhci_trb * trb);
 struct xhci_segment *trb_in_td (struct xhci_segment *start_seg,
                                 dma_addr_t suspect_dma);
+int
 xhci_test_trb_in_td (struct xhci_hcd *xhci, struct xhci_segment *input_seg,
                      union xhci_trb *start_trb, union xhci_trb *end_trb,
                      dma_addr_t input_dma, struct xhci_segment *result_seg,
@@ -64,6 +65,7 @@ xhci_test_trb_in_td (struct xhci_hcd *xhci, struct xhci_segment *input_seg,
              Expected seg %p, got seg %p\n, result_seg, seg);
     }
 }
+int
 xhci_check_trb_in_td_math (struct xhci_hcd *xhci, gfp_t mem_flags)
 {
   struct
Re: [PATCH/AARCH64] Move the rest of the cost tables to aarch64-cost-tables.h
On Fri, Nov 7, 2014 at 7:08 AM, Richard Earnshaw rearn...@arm.com wrote:
On 21/10/14 22:37, Andrew Pinski wrote:

Hi,

To make aarch64.c a little smaller and a little easier to understand, I have moved the rest of the cost tables (cpu_addrcost_table, cpu_regmove_cost, cpu_vector_cost) to aarch64-cost-tables.h. I also fixed the inconsistency in the use of __extension__, which was applied to some of the structures but not all of them. I used a define to make this easier, instead of having to write #if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007 each time.

OK? Built and tested on aarch64-elf with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
        * config/aarch64/aarch64-cost-tables.h (NAMED_PARAM): New define.
        (NEED_EXTENSION): New define.
        (generic_addrcost_table): Moved from aarch64.c.
        (cortexa57_addrcost_table): Likewise.
        (generic_regmove_cost): Likewise.
        (cortexa57_regmove_cost): Likewise.
        (cortexa53_regmove_cost): Likewise.
        (thunderx_regmove_cost): Likewise.
        (generic_vector_cost): Likewise.
        (cortexa57_vector_cost): Likewise.
        * config/aarch64/aarch64.c (NAMED_PARAM): Delete, moved to
        aarch64-cost-tables.h.
        (generic_addrcost_table): Likewise.
        (cortexa57_addrcost_table): Likewise.
        (generic_regmove_cost): Likewise.
        (cortexa57_regmove_cost): Likewise.
        (cortexa53_regmove_cost): Likewise.
        (thunderx_regmove_cost): Likewise.
        (generic_vector_cost): Likewise.
        (cortexa57_vector_cost): Likewise.
        (generic_tunings): Use NEED_EXTENSION.
        (cortexa53_tunings): Likewise.
        (cortexa57_tunings): Likewise.
        (thunderx_tunings): Likewise.

I don't particularly like the idea of having real data and code in header files. Can't this be moved into aarch64-cost-tables.c?

Yes, it should be possible. I will try doing that. I was just following what was done for the common (between arm and aarch64) cost tables already.

Thanks,
Andrew Pinski

R.
movetablestoaarch64-cost.diff.txt

Index: config/aarch64/aarch64-cost-tables.h
===
--- config/aarch64/aarch64-cost-tables.h (revision 216524)
+++ config/aarch64/aarch64-cost-tables.h (working copy)
@@ -125,7 +125,135 @@ const struct cpu_cost_table thunderx_ext
   }
 };
 
+#if HAVE_DESIGNATED_INITIALIZERS
+#define NAMED_PARAM(NAME, VAL) .NAME = (VAL)
+#else
+#define NAMED_PARAM(NAME, VAL) (VAL)
+#endif
+
+#if HAVE_DESIGNATED_INITIALIZERS && GCC_VERSION >= 2007
+#define NEED_EXTENSION __extension__
+#else
+#define NEED_EXTENSION
+#endif
+
+
+/* The address cost models.  */
+NEED_EXTENSION
+static const struct cpu_addrcost_table generic_addrcost_table =
+{
+#if HAVE_DESIGNATED_INITIALIZERS
+  .addr_scale_costs =
+#endif
+    {
+      NAMED_PARAM (hi, 0),
+      NAMED_PARAM (si, 0),
+      NAMED_PARAM (di, 0),
+      NAMED_PARAM (ti, 0),
+    },
+  NAMED_PARAM (pre_modify, 0),
+  NAMED_PARAM (post_modify, 0),
+  NAMED_PARAM (register_offset, 0),
+  NAMED_PARAM (register_extend, 0),
+  NAMED_PARAM (imm_offset, 0)
+};
+
+NEED_EXTENSION
+static const struct cpu_addrcost_table cortexa57_addrcost_table =
+{
+#if HAVE_DESIGNATED_INITIALIZERS
+  .addr_scale_costs =
+#endif
+    {
+      NAMED_PARAM (hi, 1),
+      NAMED_PARAM (si, 0),
+      NAMED_PARAM (di, 0),
+      NAMED_PARAM (ti, 1),
+    },
+  NAMED_PARAM (pre_modify, 0),
+  NAMED_PARAM (post_modify, 0),
+  NAMED_PARAM (register_offset, 0),
+  NAMED_PARAM (register_extend, 0),
+  NAMED_PARAM (imm_offset, 0),
+};
+
+
+/* Register to Register move costs */
+NEED_EXTENSION
+static const struct cpu_regmove_cost generic_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  NAMED_PARAM (GP2FP, 2),
+  NAMED_PARAM (FP2GP, 2),
+  NAMED_PARAM (FP2FP, 2)
+};
+
+NEED_EXTENSION
+static const struct cpu_regmove_cost cortexa57_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  /* Avoid the use of slow int-fp moves for spilling by setting
+     their cost higher than memmov_cost.  */
+  NAMED_PARAM (GP2FP, 5),
+  NAMED_PARAM (FP2GP, 5),
+  NAMED_PARAM (FP2FP, 2)
+};
+
+NEED_EXTENSION
+static const struct cpu_regmove_cost cortexa53_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 1),
+  /* Avoid the use of slow int-fp moves for spilling by setting
+     their cost higher than memmov_cost.  */
+  NAMED_PARAM (GP2FP, 5),
+  NAMED_PARAM (FP2GP, 5),
+  NAMED_PARAM (FP2FP, 2)
+};
+
+NEED_EXTENSION
+static const struct cpu_regmove_cost thunderx_regmove_cost =
+{
+  NAMED_PARAM (GP2GP, 2),
+  NAMED_PARAM (GP2FP, 2),
+  NAMED_PARAM (FP2GP, 6),
+  NAMED_PARAM (FP2FP, 4)
+};
+
+/* Vector instruction cost model */
+NEED_EXTENSION
+static const struct cpu_vector_cost generic_vector_cost =
+{
+  NAMED_PARAM (scalar_stmt_cost, 1),
+  NAMED_PARAM (scalar_load_cost, 1),
+  NAMED_PARAM (scalar_store_cost, 1),
+  NAMED_PARAM (vec_stmt_cost, 1),
+  NAMED_PARAM (vec_to_scalar_cost, 1),
+  NAMED_PARAM (scalar_to_vec_cost, 1),
+  NAMED_PARAM (vec_align_load_cost, 1),
+  NAMED_PARAM
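The NAMED_PARAM idiom in the patch above can be tried in isolation. The following is only a minimal sketch, not the GCC source: it assumes HAVE_DESIGNATED_INITIALIZERS is simply defined to 1 (in GCC it comes from the build configuration), and the struct regmove_cost_sketch and the table name thunderx_like_regmove_cost are hypothetical stand-ins whose field names and values are copied from the thunderx table in the diff.

```c
/* Assumption for this sketch: pretend the host compiler supports
   designated initializers.  In GCC this value is provided by the
   build configuration, not defined in the file itself.  */
#ifndef HAVE_DESIGNATED_INITIALIZERS
#define HAVE_DESIGNATED_INITIALIZERS 1
#endif

/* Same shape as the macro in the patch: with designated initializers
   each entry names its field, otherwise it degrades to a positional
   initializer.  */
#if HAVE_DESIGNATED_INITIALIZERS
#define NAMED_PARAM(NAME, VAL) .NAME = (VAL)
#else
#define NAMED_PARAM(NAME, VAL) (VAL)
#endif

/* Hypothetical stand-in for struct cpu_regmove_cost.  The field order
   matters for the positional fallback, so it matches the order used
   in the initializer below.  */
struct regmove_cost_sketch
{
  int GP2GP;
  int GP2FP;
  int FP2GP;
  int FP2FP;
};

/* Values copied from thunderx_regmove_cost in the diff.  */
static const struct regmove_cost_sketch thunderx_like_regmove_cost =
{
  NAMED_PARAM (GP2GP, 2),
  NAMED_PARAM (GP2FP, 2),
  NAMED_PARAM (FP2GP, 6),
  NAMED_PARAM (FP2FP, 4)
};
```

Building the same file with -DHAVE_DESIGNATED_INITIALIZERS=0 should give an identical table, which is the point of the macro: the named form is for readability, and the positional fallback only works because the entries are listed in declaration order.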
RE: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++
Hi Jason, thanks for committing this change. Note that the following ChangeLog entry is missing. If you want me to commit it, let me know.

*** testsuite/ChangeLog ***

2014-11-03 Thomas Preud'homme thomas.preudho...@arm.com

PR C++/63366
* g++.dg/torture/pr63366.C: New test.

Best regards, Thomas

-----Original Message-----
From: Jason Merrill [mailto:ja...@redhat.com]
Sent: Friday, November 07, 2014 4:23 PM
To: Thomas Preud'homme; 'Nathan Sidwell'; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, C++] Fix PR63366: __complex not equivalent to __complex double in C++

Applied, thanks. Jason
Trivial testsuite fix
I added a C++ test with a .c extension. Oops. Corrected in the obvious way :-) Committed to the trunk.

* g++.dg/pr61289-2.C: Renamed from pr61289-2.c.

diff --git a/gcc/testsuite/g++.dg/pr61289-2.C b/gcc/testsuite/g++.dg/pr61289-2.C
new file mode 100644
index 000..4cc3ebe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr61289-2.C
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-exceptions" } */
+struct S
+{
+  inline int fn1 () const { return s; }
+  __attribute__ ((noinline, noclone)) S *fn2 (int);
+  __attribute__ ((noinline, noclone)) void fn3 ();
+  __attribute__ ((noinline, noclone)) static S *fn4 (int);
+  S (int i) : s (i) {}
+  int s;
+};
+
+int a = 0;
+S *b = 0;
+
+S *
+S::fn2 (int i)
+{
+  a++;
+  if (a == 1)
+    return b;
+  if (a > 3)
+    __builtin_abort ();
+  b = this;
+  return new S (i + s);
+}
+
+S *
+S::fn4 (int i)
+{
+  b = new S (i);
+  return b;
+}
+
+void
+S::fn3 ()
+{
+  delete this;
+}
+
+void
+foo ()
+{
+  S *c = S::fn4 (20);
+  for (int i = 0; i < 2;)
+    {
+      S *d = c->fn2 (c->fn1 () + 10);
+      if (c != d)
+        {
+          c->fn3 ();
+          c = d;
+          ++i;
+        }
+    }
+  c->fn3 ();
+}
+
+int
+main ()
+{
+  foo ();
+}
diff --git a/gcc/testsuite/g++.dg/pr61289-2.c b/gcc/testsuite/g++.dg/pr61289-2.c
deleted file mode 100644
index 4cc3ebe..000
--- a/gcc/testsuite/g++.dg/pr61289-2.c
+++ /dev/null
@@ -1,62 +0,0 @@
-/* { dg-do run } */
-/* { dg-options "-O2 -fno-exceptions" } */
-struct S
-{
-  inline int fn1 () const { return s; }
-  __attribute__ ((noinline, noclone)) S *fn2 (int);
-  __attribute__ ((noinline, noclone)) void fn3 ();
-  __attribute__ ((noinline, noclone)) static S *fn4 (int);
-  S (int i) : s (i) {}
-  int s;
-};
-
-int a = 0;
-S *b = 0;
-
-S *
-S::fn2 (int i)
-{
-  a++;
-  if (a == 1)
-    return b;
-  if (a > 3)
-    __builtin_abort ();
-  b = this;
-  return new S (i + s);
-}
-
-S *
-S::fn4 (int i)
-{
-  b = new S (i);
-  return b;
-}
-
-void
-S::fn3 ()
-{
-  delete this;
-}
-
-void
-foo ()
-{
-  S *c = S::fn4 (20);
-  for (int i = 0; i < 2;)
-    {
-      S *d = c->fn2 (c->fn1 () + 10);
-      if (c != d)
-        {
-          c->fn3 ();
-          c = d;
-          ++i;
-        }
-    }
-  c->fn3 ();
-}
-
-int
-main ()
-{
-  foo ();
-}
[PATCH][Revisedx2] Fix PR63750
The attached revised patch eliminates the compilation error... error: use of undeclared identifier 'do_not_use_toupper_with_safe_ctype' on x86_64-apple-darwin14 when bootstrapping using the Clang 6.0 compiler by moving the include for strings earlier. Okay for gcc trunk? Jack PR63750_v3.patch Description: Binary data
[PATCH][Revised] fix PR63699
The attached patch eliminates the compilation error... error: use of undeclared identifier 'do_not_use_toupper_with_safe_ctype' on x86_64-apple-darwin14 when bootstrapping using the Clang 6.0 compiler by moving the include of string earlier. Okay for gcc trunk? Jack PR63699_v2.patch Description: Binary data
[PR c/52952] More precise locations within format strings
This patch allows format warnings to point within the format string for simple strings. There are a few limitations:

* It does not handle 'const char *' because the location of the initializer is not available. The result is the same before and after this patch.

* It does not handle non-concatenated tokens, since the preprocessor does not keep their location yet. The result after the patch is that we point to some arbitrary place between the first " and the first newline. This is slightly worse than the current behavior (which points to the first "), but I could not figure out a way to detect this case and not generate an offset. It is a matter of implementing this idea: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952#c13 as a follow-up. The line pointed at is the same before and after the patch, only the column number is affected.

* It does not handle macros, but the behavior before and after this patch is the same (and there is work-in-progress on this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952#c33)

The changes to libcpp are exactly the same as in this patch: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg0.html, which is still pending review.

Bootstrapped and tested on x86_64-linux-gnu. OK?

libcpp/ChangeLog:

2014-11-07 Manuel López-Ibáñez m...@gcc.gnu.org

* include/line-map.h (linemap_position_for_loc_and_offset): Declare.
* line-map.c (linemap_position_for_loc_and_offset): New.

gcc/c-family/ChangeLog:

2014-11-07 Manuel López-Ibáñez m...@gcc.gnu.org

PR c/52952
* c-format.c (location_from_offset): New.
(check_format_info): Use it.
(check_format_arg): Likewise.
(check_format_info_main): Likewise.
(format_type_warning): Likewise.

gcc/testsuite/ChangeLog:

2014-11-07 Manuel López-Ibáñez m...@gcc.gnu.org

PR c/52952
* gcc.dg/redecl-4.c: Update column info.
* gcc.dg/format/bitfld-1.c: Likewise.
* gcc.dg/format/attr-2.c: Likewise.
* gcc.dg/format/attr-6.c: Likewise.
* gcc.dg/format/attr-7.c: Likewise.
* gcc.dg/format/asm_fprintf-1.c: Likewise.
* gcc.dg/format/attr-4.c: Likewise.
* gcc.dg/format/branch-1.c: Likewise.
* gcc.dg/format/c90-printf-1.c: Likewise.

Index: gcc/c-family/c-format.c
===
--- gcc/c-family/c-format.c (revision 217191)
+++ gcc/c-family/c-format.c (working copy)
@@ -66,10 +66,32 @@ static bool cmp_attribs (const char *tat
 static int first_target_format_type;
 static const char *format_name (int format_num);
 static int format_flags (int format_num);
 
+/* FIXME: This indicates that loc is not the location of the format
+   string, thus computing an offset is useless.  This happens, for
+   example, when the format string is a constant array.
+   Unfortunately, GCC does not keep track of the location of the
+   initializer of the array yet.  */
+static bool offset_is_invalid;
+
+/* Return a location that encodes the same location as LOC but shifted
+   by OFFSET columns.  */
+
+static location_t
+location_from_offset (location_t loc, int offset)
+{
+  gcc_checking_assert (offset >= 0);
+  if (offset_is_invalid
+      || linemap_location_from_macro_expansion_p (line_table, loc)
+      || offset < 0)
+    return loc;
+  return linemap_position_for_loc_and_offset (line_table,
+                                              loc, (unsigned int) offset);
+}
+
 /* Check that we have a pointer to a string suitable for use as a format.
    The default is to check for a char type.
    For objective-c dialects, this is extended to include references to string
    objects validated by objc_string_ref_type_p ().
    Targets may also provide a string object type that can be used within c and
@@ -378,10 +400,13 @@ typedef struct format_wanted_type
   int format_length;
   /* The actual parameter to check against the wanted type.  */
   tree param;
   /* The argument number of that parameter.  */
   int arg_num;
+  /* The offset location of this argument with respect to the format
+     string location.  */
+  unsigned int offset_loc;
   /* The next type to check for this format conversion, or NULL if none.  */
   struct format_wanted_type *next;
 } format_wanted_type;
 
 /* Convenience macro for format_length_info meaning unused.  */
@@ -1346,12 +1371,10 @@ check_format_info (function_format_info
 
   check_function_arguments_recurse (check_format_arg, &format_ctx,
                                     format_tree, arg_num);
 
   location_t loc = format_ctx.res->format_string_loc;
-  if (res.extra_arg_loc == UNKNOWN_LOCATION)
-    res.extra_arg_loc = loc;
 
   if (res.number_non_literal > 0)
     {
       /* Functions taking a va_list normally pass a non-literal format
          string.  These functions typically are declared with
@@ -1393,12 +1416,16 @@ check_format_info (function_format_info
      arguments, but was otherwise OK (either non-literal or checked OK).
      If the format is an empty string, this should be counted similarly to the
Re: [PR c/52952] More precise locations within format strings
On Fri, 7 Nov 2014, Manuel López-Ibáñez wrote:

This patch allows format warnings to point within the format string for simple strings. There are a few limitations:

* It does not handle 'const char *' because the location of the initializer is not available. The result is the same before and after this patch.

* It does not handle non-concatenated tokens, since the preprocessor does not keep their location yet. The result after the patch is that we point to some arbitrary place between the first " and the first newline. This is slightly worse than the current behavior (which points to the first "), but I could not figure out a way to detect this case and not generate an offset. It is a matter of implementing this idea: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952#c13 as a follow-up. The line pointed at is the same before and after the patch, only the column number is affected.

* It does not handle macros, but the behavior before and after this patch is the same (and there is work-in-progress on this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952#c33)

Does it also not handle escape sequences in the strings? You appear to compute offsets in terms of the number of bytes after the start of the string, then pass this to functions commented to take an offset in columns, and escape sequences before the point where the warning is being given are another case where those would differ.

Anyway, the front-end changes are OK with appropriate comments that make clear what the interface is at each point and where the (missing) conversion from a byte offset within a string to a column offset in the source file should take place, though someone else ought to review the line-map changes.

-- Joseph S. Myers jos...@codesourcery.com
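The difference between a byte offset in the decoded string and a column offset in the source can be illustrated with a small helper. This is only a sketch under stated assumptions, not code from the patch: the function name source_columns_for_byte_offset is made up here, it assumes the source spelling of the literal body (the text between the quotes) is available, that every source character occupies one column, and that only simple, octal and hexadecimal escapes occur (no universal character names, trigraphs, line continuations or multibyte characters).

```c
#include <ctype.h>
#include <stddef.h>

/* SPELLING is the body of a string literal exactly as written in the
   source file (escape sequences not yet decoded).  Return how many
   source characters spell the first BYTE_OFFSET bytes of the decoded
   string, or -1 if the literal is shorter than that.  Hypothetical
   helper for illustration only.  */
long
source_columns_for_byte_offset (const char *spelling, size_t byte_offset)
{
  const char *p = spelling;
  size_t decoded = 0;

  while (decoded < byte_offset)
    {
      if (*p == '\0')
        return -1;

      if (*p != '\\')
        p++;                    /* Ordinary character: one column, one byte.  */
      else
        {
          p++;                  /* Skip the backslash.  */
          if (*p == '\0')
            return -1;
          if (*p == 'x')
            {
              /* Hex escape: \x followed by any number of hex digits.  */
              p++;
              while (isxdigit ((unsigned char) *p))
                p++;
            }
          else if (*p >= '0' && *p <= '7')
            {
              /* Octal escape: at most three octal digits.  */
              int digits = 0;
              while (digits < 3 && *p >= '0' && *p <= '7')
                {
                  p++;
                  digits++;
                }
            }
          else
            p++;                /* Simple escape such as \n or \\.  */
        }
      decoded++;                /* Each unit decodes to one byte here.  */
    }
  return (long) (p - spelling);
}
```

For the source text a\n%d the conversion specifier starts at byte offset 2 of the decoded string but at column offset 3 of the literal body, which is the mismatch raised in the review.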
[patch] Fix handling of inlining and nested functions in dwarf2out.c
Hi, the mix of inlining and nested functions is an interesting challenge on the debug info side because it generates cycles in the debug info: if a child calls its parent and the parent is inlined but not the child, you have the (non-abstract) instance of the child nested in the abstract instance of the parent and containing a concrete inline instance of the parent which points back to the abstract instance, as per the DWARF spec. This is what GCC has been generating for a while, although this caused GDB to crash until very recently (only GDB 7.7 and later versions are OK).

Under very special circumstances[*], you can even have a cycle in the internal representation of dwarf2out.c leading to an ICE because the DIEs in the cycle have no ultimate parent. This happens if you have, in addition to the above setup, a grandparent which is both inlined in one of its callers and output as a standalone function: when gen_decl_die is invoked on the grandparent, the cgraph_function_possibly_inlined_p predicate is true so set_decl_origin_self is invoked on the grandparent; now set_decl_origin_self recurses down the entire nest of functions, so the child is also marked as originating from itself by the recursion.

Later, when gen_decl_die is invoked on the child, the cgraph_function_possibly_inlined_p predicate is false for it so dwarf2out_abstract_function is not invoked before gen_subprogram_die, which results in a DIE with DW_AT_abstract_origin pointing to itself and without parent for the child. Moreover, when gen_subprogram_die is invoked on the concrete inline instance of the parent, this DIE is retrieved and

  /* Fixup die_parent for the abstract instance of a nested inline function.  */
  if (old_die && old_die->die_parent == NULL)
    add_child_die (context_die, old_die);

attaches it to the context_die, creating the cycle since we are in the child.

I think that the source of the problem is the discrepancy between the cgraph_function_possibly_inlined_p predicate, which doesn't consider the inlining status of the child as being related to that of its parents (which is explicitly allowed by the DWARF spec) and the set_decl_origin_self recursion, which runs down the entire nest. Hence the attached patchlet, which is sufficient to get rid of the cycle and, therefore, of the ICE.

Bootstrapped and regtested on x86_64-suse-linux, OK for the mainline?

2014-11-07 Eric Botcazou ebotca...@adacore.com

* dwarf2out.c (set_block_origin_self): Skip nested functions.

[*] There is an important factor coming into play, which is the order in which the functions are sent to dwarf2out.c. That's decided by the cgraph machinery and the ICE happens only when the grandparent is sent before the child. Now, while it's very easy to have this situation without inlining, it's very hard when the functions start being inlined because the cgraph machinery sends them after their callers. We have a large testcase of several big Ada units for which the combination of the various IPA transformations (inlining, cloning, etc) and the repeated topological sorts on the callgraph lead to the ICE, but any attempt at reducing makes it disappear.

-- Eric Botcazou

Index: dwarf2out.c
===
--- dwarf2out.c (revision 217148)
+++ dwarf2out.c (working copy)
@@ -17919,8 +17919,11 @@ set_block_origin_self (tree stmt)
           for (local_decl = BLOCK_VARS (stmt);
                local_decl != NULL_TREE;
                local_decl = DECL_CHAIN (local_decl))
-            if (! DECL_EXTERNAL (local_decl))
-              set_decl_origin_self (local_decl);   /* Potential recursion.  */
+            /* Do not recurse on nested functions since the inlining status
+               of parent and child can be different as per the DWARF spec.  */
+            if (TREE_CODE (local_decl) != FUNCTION_DECL
+                && !DECL_EXTERNAL (local_decl))
+              set_decl_origin_self (local_decl);
         }
 
         {
Re: [PR c/52952] More precise locations within format strings
On 7 November 2014 18:57, Joseph Myers jos...@codesourcery.com wrote:

On Fri, 7 Nov 2014, Manuel López-Ibáñez wrote:

This patch allows format warnings to point within the format string for simple strings. There are a few limitations:

* It does not handle 'const char *' because the location of the initializer is not available. The result is the same before and after this patch.

* It does not handle non-concatenated tokens, since the preprocessor does not keep their location yet. The result after the patch is that we point to some arbitrary place between the first " and the first newline. This is slightly worse than the current behavior (which points to the first "), but I could not figure out a way to detect this case and not generate an offset. It is a matter of implementing this idea: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952#c13 as a follow-up. The line pointed at is the same before and after the patch, only the column number is affected.

* It does not handle macros, but the behavior before and after this patch is the same (and there is work-in-progress on this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52952#c33)

Does it also not handle escape sequences in the strings? You appear to compute offsets in terms of the number of bytes after the start of the string, then pass this to functions commented to take an offset in columns, and escape sequences before the point where the warning is being given are another case where those would differ.

Oops, no. For trivial escape sequences I could just count +1 when a character that results from an escape sequence is found. But without the original string around, I cannot distinguish between \n and \x012. Maybe I can open the file and re-parse the string to find the right column. Of course, this will not work when reading stdin, but in that case the behavior will be the same as currently. It will also allow me to gracefully degrade in the case of non-concatenated tokens. Do you see any other alternative?
Anyway, the front-end changes are OK with appropriate comments that make clear what the interface is at each point and where the (missing) conversion from a byte offset within a string to a column offset in the source file should take place, though someone else ought to review the line-map changes. Thanks for the review. I will try first to fix the escape sequences case. Nonetheless, the line-map changes are useful for the other patch (the Fortran parts are approved). There are 3 entries for libcpp in MAINTAINERS: libcpp Per Bothner p...@bothner.com libcpp All C and C++ front end maintainers libcpp Tom Tromey tro...@redhat.com Neither Per nor Tom are active in GCC anymore. If the FE maintainers do not feel comfortable reviewing line-map changes, could you nominate Dodji as line-map maintainer if he is willing to accept it? I think he is currently the person that understands that code best. Cheers, Manuel.
Re: [PATCH 10/27] New file: gcc/jit/libgccjit.c
On 11/05/14 12:34, David Malcolm wrote: I've added comments throughout the file. I didn't bother adding __attribute__((cold)), instead simply dropping that TODO. Fine. Attached is the current state of the file gcc/jit/libgccjit.c (on the branch) for review. OK for trunk? (conditional on all the rest being approved, and usual bootstrapregrtesting; I've merely verified a non-bootstrap compile and successful make check-jit so far). There were a few other changes relative to what you've approved, which I'll post for review shortly. Dave libgccjit.c /* Implementation of the C API; all wrappers into the internal C++ API Copyright (C) 2013-2014 Free Software Foundation, Inc. Contributed by David Malcolmdmalc...@redhat.com. This is fine. With the comments, it became a lot clearer this was just the error checking wrappers and not a whole lot else. The one thing this does make me wonder is should we add something about the error checking may change in significant ways from one release to the next, much like the ABI/API. This seems important as the error checking in many ways specifies the language for the JIT and I suspect we haven't got all the corner cases sorted out yet (and probably can't until this gets into wider distribution). jeff
Re: [PATCH] AIX: Filename-based shared library versioning for libgcc_s
First, please explicitly copy me on AIX or PowerPC patches sent to gcc-patches.

I don't have a fundamental objection to including this option, but note that Richi, Honza and I have discovered that using the AIX runtime linking option interacts badly with some GCC optimizations and can result in applications that hang in a loop.

All code on AIX is position independent (PIC) by default. Executables and shared libraries essentially are PIE. Because of this, AIX does not provide separate static libraries and one can link statically with a shared library.

Creating a library enabled for runtime linking with -G (-brtl) causes a lot of problems, including a newly recognized failure mode. Without careful control over AIX symbol export, all global calls will use glink code (equivalent to ELF PLTs). This also creates a TOC entry for every global call, possibly overflowing the TOC.

But the main problem is that GCC uses aliases and functions declared as weak to support some C++ features. Functions declared weak interact badly with shared libraries compiled for AIX runtime linking and linked statically. This can result in the static binary binding with the glink code that loads its own address from the TOC instead of the target function, causing endless looping.

Honza made some changes to GCC code generation for AIX, but there still are problems and I have disabled building libstdc++ enabled for runtime linking.

libgcc always explicitly creates a static library and uses it for static linking. All shared libraries for AIX that use this scheme (through libtool) would have to follow the same convention to create both shared and static libraries.

This new option only makes sense if it fully emulates SVR4/ELF behavior and always creates both shared .so and static .a libraries.

Thanks, David
Re: [PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
On 11/07/14 08:58, Thomas Preud'homme wrote: Trunk revision 216971 introduced LROTATE_EXPR as the canonical representation for a byte swap of a 2 bytes value, as per [1]. However, backend expects bswaphi patterns for such operation as these operation are more specific than a rotation. This led to a number of testcases starting to fail such as gcc.target/arm/builtin-bswap16-1.c and gcc.target/aarch64/builtin-bswap-2.c (these were skipped with my configuration). This patch adds a check in expmed to expand such LROTATE_EXPR into bswaphi pattern. [1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00616.html Note that this is unrelated to PR63761 (but I have diagnosed the root cause). ChangeLog entry is as follows: 2014-11-03 Thomas Preud'homme thomas.preudho...@arm.com * expmed.c (expand_shift_1): Expand 8 bit rotate of 16 bit value to bswaphi if available. Why restrict this to 8 bit rotate of a 16 bit value? Shouldn't it apply to a 16 bit rotate of a 32 bit value, or 32 bit rotate of 64 bit value? Jeff
Re: [PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
On Fri, Nov 07, 2014 at 12:54:44PM -0700, Jeff Law wrote: On 11/07/14 08:58, Thomas Preud'homme wrote: Trunk revision 216971 introduced LROTATE_EXPR as the canonical representation for a byte swap of a 2 bytes value, as per [1]. However, backend expects bswaphi patterns for such operation as these operation are more specific than a rotation. This led to a number of testcases starting to fail such as gcc.target/arm/builtin-bswap16-1.c and gcc.target/aarch64/builtin-bswap-2.c (these were skipped with my configuration). This patch adds a check in expmed to expand such LROTATE_EXPR into bswaphi pattern. [1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00616.html Note that this is unrelated to PR63761 (but I have diagnosed the root cause). ChangeLog entry is as follows: 2014-11-03 Thomas Preud'homme thomas.preudho...@arm.com * expmed.c (expand_shift_1): Expand 8 bit rotate of 16 bit value to bswaphi if available. Why restrict this to 8 bit rotate of a 16 bit value? Shouldn't it apply to a 16 bit rotate of a 32 bit value, or 32 bit rotate of 64 bit value? That isn't a byteswap, but halfword swap or wordswap. 32 bit byteswap reverses 0x01020304 byte ordering into 0x04030201, while rotate 16 is 0x03040102. Jakub
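Jakub's distinction can be checked with a few lines of portable C. The helpers below are a minimal sketch written for this illustration (rotl16, rotl32, bswap16_sketch and bswap32_sketch are made-up names, not GCC internals or builtins): an 8-bit rotate of a 16-bit value coincides with a byte swap, while a 16-bit rotate of a 32-bit value only swaps the two halfwords.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by N bits (N taken modulo 16).  */
uint16_t
rotl16 (uint16_t x, unsigned n)
{
  n &= 15;
  return (uint16_t) (((uint32_t) x << n) | ((uint32_t) x >> ((16 - n) & 15)));
}

/* Rotate a 32-bit value left by N bits (N taken modulo 32).  */
uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}

/* Reverse the two bytes of a 16-bit value.  */
uint16_t
bswap16_sketch (uint16_t x)
{
  return (uint16_t) (((uint32_t) x << 8) | ((uint32_t) x >> 8));
}

/* Reverse the four bytes of a 32-bit value.  */
uint32_t
bswap32_sketch (uint32_t x)
{
  return ((x & 0x000000ffu) << 24)
         | ((x & 0x0000ff00u) << 8)
         | ((x & 0x00ff0000u) >> 8)
         | ((x & 0xff000000u) >> 24);
}
```

With the value from the message, rotl32 (0x01020304, 16) gives 0x03040102 while bswap32_sketch (0x01020304) gives 0x04030201; for 16-bit values rotl16 (x, 8) and bswap16_sketch (x) agree, which is why only that case maps onto a bswaphi pattern.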
Re: [patch sdbout]: Fix ICE on -debug testsuite test const2.C for coff
On 11/06/14 12:37, Kai Tietz wrote:

Hi, This fixes recent fallout of debug-tests on Windows target for sdbout (coff) caused by an ICE.

ChangeLog

2014-11-06 Kai Tietz kti...@redhat.com

* sdbout.c (sdbout_symbol): Eliminate register only if decl isn't a global variable.

Is there a testcase in the suite that triggers this problem? If not, can you try to add one? Out of curiosity, what was DECL_RTL here?

This is probably OK, but I'd really like to know what kind of goofy RTL we passed to eliminate_regs that caused it to fail.

Jeff
Re: [PARCH 1/2, x86, PR63534] Fix darwin bootstrap
On 11/05/14 04:54, Eric Botcazou wrote: Now if your argument is that IRA/LRA handle this, that's fine, a pointer to that code would be appreciated so that it can be quickly audited. Certainly the old local-alloc/global-alloc had magic for setjmp/longjmp and maybe IRA/LRA does too, but it's better to be sure than just assume. See ira-lives.c:1217 and below. Thanks. So that code creates a set of conflicts which, if I'm reading correctly, will prevent the PIC value from living in a register at all. Which ought to result in it being dumped into the stack and being reloaded for each use. Which ought to be safe (modulo the liveness bug Vlad is working on right now). Does that sound right to either of you? jeff
RE: [PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Friday, November 07, 2014 8:01 PM Why restrict this to 8 bit rotate of a 16 bit value? Shouldn't it apply to a 16 bit rotate of a 32 bit value, or 32 bit rotate of 64 bit value? That isn't a byteswap, but halfword swap or wordswap. 32 bit byteswap reverses 0x01020304 byte ordering into 0x04030201, while rotate 16 is 0x03040102. If this patch gets approved as is I'll add a comment to explain this as this is the third time someone ask me this. Best regards, Thomas
Re: [PARCH 1/2, x86, PR63534] Fix darwin bootstrap
On 11/05/14 05:59, Evgeny Stupachenko wrote: On Tue, Nov 4, 2014 at 1:40 AM, Jeff Law l...@redhat.com wrote: On 11/01/14 06:39, Evgeny Stupachenko wrote: When PIC register is pseudo there is nothing special about it's value that setjmp can hurt. So if the pseudo register lives across setjmp_receiver RA should care about correct allocation (in case it is not saved/restored, it should go on stack). gcc.dg tests and specs I've tested behave like this. If the allocator picked a call-clobbered register for the PIC register, then we're obviously OK since the setjmp has to be expected to clobber the PIC register. But if the PIC register is in a call-saved register, then it's going to be assumed to not be clobbered across calls and I don't believe that is guaranteed for builtin setjmp/longjmp. Those restore SP, FP and an ARGP, but not anything else by default. I still don't see what is special for PIC register here. PIC pseudo now behave as every other pseudo register. If we assume that setjmp can change a pseudo register value we need IRA/LRA magic for each pseudo register. And that's precisely what we have as Eric pointed out. Any allocnos that are live across calls are not allocated into hard registers if we call setjmp or can receive a nonlocal goto. I believe that when we had EBX fixed, IRA/LRA don't save/restore it anywhere. Therefore we had to care about EBX value in special cases like setjmp/non-local goto. Now RA cares about PIC pseudo as well as about correct allocation for any pseudo register. Right. But what was missing was the explanation why this change is correct. With the knowledge about how IRA handles objects under these circumstances that Eric pointed to, it becomes much easier to see that the special handling is no longer desirable. ? You mean it emits a reference to the pseudo into RTL? That would indicate that the allocators never put the pseudo into a hard register?!? RTL dumps with a few pointers to key insns would help here. 
Correct, that is why Darwin crashes with ICE on non-local goto. We still have: I was referring to the generated RTL to confirm what I thought you stated, namely that we ended up with a reference to the pseudo being emitted into the insn stream after reload. The only way I could see that happening would be if the pseudo wasn't allocated a hard register at all. Which we now know makes sense after Eric pointed us the magic in ira-lives.c. In the end it all comes down to what is the behaviour of the allocator for a value that is live across calls in these kinds of functions. I think I've got enough background to properly review now ;-) Jeff
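The thread is about compiler internals, but the underlying hazard (a value kept in a register is not reliably preserved across a longjmp) has a well-known source-level analogue in ISO C: a local object modified between setjmp and longjmp only has a guaranteed value afterwards if it is volatile-qualified, which keeps it in memory. The following is a minimal sketch of that analogue using the standard setjmp/longjmp rather than the __builtin_ variants discussed above; the function names are made up for this illustration and nothing here reflects how IRA/LRA implement the rule.

```c
#include <setjmp.h>

static jmp_buf env;

/* Jump back to the setjmp call in the caller; never returns normally.  */
static void
jump_back (void)
{
  longjmp (env, 1);
}

/* Modify a local after setjmp, then come back through longjmp.
   Because COUNTER is volatile it lives in memory, so the value written
   before the longjmp is still there afterwards.  Without volatile its
   value after the jump would be indeterminate according to the C
   standard.  */
int
value_after_longjmp (void)
{
  volatile int counter = 0;

  if (setjmp (env) == 0)
    {
      counter = 42;
      jump_back ();
    }
  return counter;
}
```

Calling value_after_longjmp () returns 42. The allocator rule mentioned in the discussion (do not keep values that are live across such calls in hard registers) plays a comparable role for pseudos inside the compiler.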
Re: [PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
On 11/07/14 13:11, Thomas Preud'homme wrote: From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Friday, November 07, 2014 8:01 PM Why restrict this to 8 bit rotate of a 16 bit value? Shouldn't it apply to a 16 bit rotate of a 32 bit value, or 32 bit rotate of 64 bit value? That isn't a byteswap, but halfword swap or wordswap. 32 bit byteswap reverses 0x01020304 byte ordering into 0x04030201, while rotate 16 is 0x03040102. If this patch gets approved as is I'll add a comment to explain this as this is the third time someone ask me this. Then I don't feel like such a bloody idiot after reading Jakub's reply :-) Jeff
Re: [PARCH 1/2, x86, PR63534] Fix darwin bootstrap
On 11/06/14 06:01, Evgeny Stupachenko wrote: Now I see that equiv reload could be special for PIC register. Let's apply more conservative patch. Darwin bootstrap passed with the patch applied on r216304 (along with already committed to trunk patches from PR63618 and PR63620). 2014-11-06 Evgeny Stupachenko evstu...@gmail.com PR target/63534 * config/i386/i386.c (builtin_setjmp_receiver): Use pic_offset_table_rtx for PIC register. (nonlocal_goto_receiver): Delete. OK for the trunk. One more Darwin bug gets squashed :-) jeff
Re: [PATCH 10/27] New file: gcc/jit/libgccjit.c
On Fri, 2014-11-07 at 12:47 -0700, Jeff Law wrote: On 11/05/14 12:34, David Malcolm wrote: I've added comments throughout the file. I didn't bother adding __attribute__((cold)), instead simply dropping that TODO. Fine. Attached is the current state of the file gcc/jit/libgccjit.c (on the branch) for review. OK for trunk? (conditional on all the rest being approved, and usual bootstrapregrtesting; I've merely verified a non-bootstrap compile and successful make check-jit so far). There were a few other changes relative to what you've approved, which I'll post for review shortly. Dave libgccjit.c /* Implementation of the C API; all wrappers into the internal C++ API Copyright (C) 2013-2014 Free Software Foundation, Inc. Contributed by David Malcolmdmalc...@redhat.com. This is fine. With the comments, it became a lot clearer this was just the error checking wrappers and not a whole lot else. Thanks. That was the last of the review requests, so I believe the jit branch is now approved for merger. I plan to do this early next week, to give time to rebase against latest trunk and retest it. The one thing this does make me wonder is should we add something about the error checking may change in significant ways from one release to the next, much like the ABI/API. This seems important as the error checking in many ways specifies the language for the JIT and I suspect we haven't got all the corner cases sorted out yet (and probably can't until this gets into wider distribution). I agree that the JIT language is specified by the runtime error-checking behaviour, but it's also specified by the types in the API. We're in slightly better shape here than, say gcc's internal tree API, in that the types in the API make a rigid separation between types vs expressions, and it also captures some lvalue vs rvalue distinctions, so client code is likely to not compile if it gets those things wrong. I've also tried to name the params in such a way as to hint at restrictions, e.g. 
gcc_jit_context_zero's second param is: gcc_jit_type *numeric_type i.e. the numeric_type name of that param describes a requirement. There are probably some under-specified behaviors in the JIT language as it stands - perhaps in ordering of operations? In any case, the current disclaimer reads: Note that libgccjit is currently of “Alpha” quality; the APIs are not yet set in stone, and they shouldn’t be used in production yet. I'm sure we'll want to reword that at some point. One big potential change might be to create a stable *plugin* API, unified with the JIT API (so we'd allow client code to use the same API for both embedding GCC and being embedded within GCC). Most of the gcc_jit_* symbols would become just gcc_*. I've experimented with that, but I don't see myself getting it done in time for the close of stage1 close (am frantically trying to finish the gimple-classes work), so I'm thinking of that unified jit-and-plugin API as a GCC 6 feature. Dave
Re: [PARCH 1/2, x86, PR63534] Fix darwin bootstrap
On Thu, Nov 6, 2014 at 5:01 AM, Evgeny Stupachenko <evstu...@gmail.com> wrote:

Now I see that equiv reload could be special for PIC register. Let's apply more conservative patch. Darwin bootstrap passed with the patch applied on r216304 (along with already committed to trunk patches from PR63618 and PR63620).

2014-11-06  Evgeny Stupachenko  <evstu...@gmail.com>

    PR target/63534
    * config/i386/i386.c (builtin_setjmp_receiver): Use pic_offset_table_rtx for PIC register.
    (nonlocal_goto_receiver): Delete.

It should be config/i386/i386.md, not config/i386/i386.c.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 7b1dd79..0df66ea 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -16989,10 +16989,9 @@
   if (TARGET_MACHO)
     {
       rtx xops[3];
-      rtx picreg = gen_rtx_REG (Pmode, PIC_OFFSET_TABLE_REGNUM);
       rtx_code_label *label_rtx = gen_label_rtx ();
       emit_insn (gen_set_got_labelled (pic_offset_table_rtx, label_rtx));
-      xops[0] = xops[1] = picreg;
+      xops[0] = xops[1] = pic_offset_table_rtx;
       xops[2] = machopic_gen_offset (gen_rtx_LABEL_REF (SImode, label_rtx));
       ix86_expand_binary_operator (MINUS, SImode, xops);
     }
@@ -17002,36 +17001,6 @@
   DONE;
 })
 
-(define_insn_and_split "nonlocal_goto_receiver"
-  [(unspec_volatile [(const_int 0)] UNSPECV_NLGR)]
-  "TARGET_MACHO && !TARGET_64BIT && flag_pic"
-  "#"
-  "reload_completed"
-  [(const_int 0)]
-{
-  if (crtl->uses_pic_offset_table)
-    {
-      rtx xops[3];
-      rtx label_rtx = gen_label_rtx ();
-      rtx tmp;
-
-      /* Get a new pic base.  */
-      emit_insn (gen_set_got_labelled (pic_offset_table_rtx, label_rtx));
-      /* Correct this with the offset from the new to the old.  */
-      xops[0] = xops[1] = pic_offset_table_rtx;
-      label_rtx = gen_rtx_LABEL_REF (SImode, label_rtx);
-      tmp = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, label_rtx),
-                            UNSPEC_MACHOPIC_OFFSET);
-      xops[2] = gen_rtx_CONST (Pmode, tmp);
-      ix86_expand_binary_operator (MINUS, SImode, xops);
-    }
-  else
-    /* No pic reg restore needed.  */
-    emit_note (NOTE_INSN_DELETED);
-
-  DONE;
-})
-
 ;; Avoid redundant prefixes by splitting HImode arithmetic to SImode.
 ;; Do not split instructions with mask registers.
 (define_split

On Wed, Nov 5, 2014 at 3:59 PM, Evgeny Stupachenko <evstu...@gmail.com> wrote:
On Tue, Nov 4, 2014 at 1:40 AM, Jeff Law <l...@redhat.com> wrote:
On 11/01/14 06:39, Evgeny Stupachenko wrote:

When the PIC register is a pseudo there is nothing special about its value that setjmp can hurt. So if the pseudo register lives across setjmp_receiver, RA should care about correct allocation (in case it is not saved/restored, it should go on stack). gcc.dg tests and specs I've tested behave like this.

If the allocator picked a call-clobbered register for the PIC register, then we're obviously OK since the setjmp has to be expected to clobber the PIC register. But if the PIC register is in a call-saved register, then it's going to be assumed to not be clobbered across calls and I don't believe that is guaranteed for builtin setjmp/longjmp. Those restore SP, FP and an ARGP, but not anything else by default.

I still don't see what is special for the PIC register here. The PIC pseudo now behaves like every other pseudo register. If we assume that setjmp can change a pseudo register value we need IRA/LRA magic for each pseudo register. I believe that when we had EBX fixed, IRA/LRA didn't save/restore it anywhere. Therefore we had to care about the EBX value in special cases like setjmp/non-local goto. Now RA cares about the PIC pseudo as well as about correct allocation for any pseudo register.

So the callee might have clobbered the call-saved hard register, expecting to restore its value in its epilogue. But due to the longjmp, that epilogue never gets called and thus the call-saved register won't have the right value in the receiver. Now if your argument is that IRA/LRA handle this, that's fine, a pointer to that code would be appreciated so that it can be quickly audited. Certainly the old local-alloc/global-alloc had magic for setjmp/longjmp and maybe IRA/LRA does too, but it's better to be sure than just assume.

The initial problem comes from non-local goto as it tries to emit pseudo PIC register after reload.

? You mean it emits a reference to the pseudo into RTL? That would indicate that the allocators never put the pseudo into a hard register?!? RTL dumps with a few pointers to key insns would help here.

Correct, that is why Darwin crashes with ICE on non-local goto. We still have:

(define_insn_and_split "nonlocal_goto_receiver"
  [(unspec_volatile [(const_int 0)] UNSPECV_NLGR)]
  "TARGET_MACHO && !TARGET_64BIT && flag_pic"
  "#"
  "reload_completed"
  [(const_int 0)]
{
  if (crtl->uses_pic_offset_table)
    {
      rtx xops[3];
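The concern in the quoted exchange (a longjmp skips the epilogues of the frames it unwinds, so a value that code expected to be preserved may not be) has a well-known source-level analogue. The sketch below is not the builtin setjmp/longjmp machinery or the i386 patterns from the patch; it uses the standard <setjmp.h> functions, and the demo_* names are hypothetical. In ISO C, a non-volatile automatic variable modified between setjmp and longjmp has an indeterminate value afterwards, because it may have lived in a register that is not restored; declaring it volatile forces it into memory.

```c
#include <setjmp.h>

static jmp_buf demo_env;

/* Recurse a few frames deep, then jump straight back to the setjmp
   site.  None of the intermediate frames runs its normal return path.  */
static void
demo_callee (int depth)
{
  if (depth == 0)
    longjmp (demo_env, 1);
  demo_callee (depth - 1);
}

/* Returns counter + 1 as seen after the longjmp.  Because counter is
   volatile, the store of 41 is guaranteed to be visible in the receiver.
   Without volatile the value would be indeterminate under ISO C.  */
static int
demo_run_setjmp (void)
{
  volatile int counter = 0;

  if (setjmp (demo_env) == 0)
    {
      counter = 41;
      demo_callee (3);
      return -1;  /* Not reached: demo_callee never returns normally.  */
    }
  return counter + 1;
}
```

The compiler has the same problem one level down: any value it keeps in a register across the receiver must either be reloaded from memory or be known to survive the jump.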
Re: [PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
On 11/07/14 08:58, Thomas Preud'homme wrote:

Trunk revision 216971 introduced LROTATE_EXPR as the canonical representation for a byte swap of a 2-byte value, as per [1]. However, the backend expects bswaphi patterns for such operations, as these are more specific than a rotation. This led to a number of testcases starting to fail, such as gcc.target/arm/builtin-bswap16-1.c and gcc.target/aarch64/builtin-bswap-2.c (these were skipped with my configuration). This patch adds a check in expmed to expand such an LROTATE_EXPR into a bswaphi pattern.

[1] https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00616.html

Note that this is unrelated to PR63761 (but I have diagnosed the root cause).

ChangeLog entry is as follows:

2014-11-03  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * expmed.c (expand_shift_1): Expand 8 bit rotate of 16 bit value to bswaphi if available.

Approved. Sorry for the noise WRT wider modes.

Jeff
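The equivalence the patch relies on, that rotating a 16-bit value by 8 bits is the same operation as swapping its two bytes, can be checked exhaustively in a few lines of C. This is a minimal sketch, not code from expmed.c; the demo_* helper names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by n bits (n is reduced modulo 16).  */
static uint16_t
demo_rotl16 (uint16_t x, unsigned n)
{
  n &= 15;
  if (n == 0)
    return x;
  return (uint16_t) (((uint32_t) x << n) | ((uint32_t) x >> (16 - n)));
}

/* Swap the two bytes of a 16-bit value.  */
static uint16_t
demo_bswap16 (uint16_t x)
{
  return (uint16_t) ((x >> 8) | ((uint32_t) x << 8));
}

/* Return 1 if an 8-bit rotate equals a byte swap for every 16-bit
   value, 0 otherwise.  */
static int
demo_rotate_matches_bswap (void)
{
  for (uint32_t v = 0; v <= 0xFFFFu; v++)
    if (demo_rotl16 ((uint16_t) v, 8) != demo_bswap16 ((uint16_t) v))
      return 0;
  return 1;
}
```

Note that this only holds for a rotate count of 8 on a 16-bit value; for wider modes a rotate by 8 is not a byte swap, which is why the check is specific to this case.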
RE: [PATCH] Fix bswap regression: expand 8bit rotations of 16bit values into bswaphi patterns
From: Jeff Law [mailto:l...@redhat.com]
Sent: Friday, November 07, 2014 8:48 PM

ChangeLog entry is as follows:

2014-11-03  Thomas Preud'homme  <thomas.preudho...@arm.com>

    * expmed.c (expand_shift_1): Expand 8 bit rotate of 16 bit value to bswaphi if available.

Approved. Sorry for the noise WRT wider modes.

Not at all, it's valuable feedback about the need for a comment. Especially since I was slow to realize this.

Best regards,
Thomas
Re: [PARCH 1/2, x86, PR63534] Fix darwin bootstrap
Thanks. Committed to trunk with that fix:

Author: kyukhin
Date: Fri Nov 7 20:42:36 2014
New Revision: 217237

URL: https://gcc.gnu.org/viewcvs?rev=217237&root=gcc&view=rev
Log:
PR target/63534
gcc/
    * config/i386/i386.md (builtin_setjmp_receiver): Use pic_offset_table_rtx for PIC register.
    (nonlocal_goto_receiver): Delete.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.md

On Fri, Nov 7, 2014 at 11:33 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
On Thu, Nov 6, 2014 at 5:01 AM, Evgeny Stupachenko <evstu...@gmail.com> wrote:

Now I see that equiv reload could be special for PIC register. Let's apply more conservative patch. Darwin bootstrap passed with the patch applied on r216304 (along with already committed to trunk patches from PR63618 and PR63620).

2014-11-06  Evgeny Stupachenko  <evstu...@gmail.com>

    PR target/63534
    * config/i386/i386.c (builtin_setjmp_receiver): Use pic_offset_table_rtx for PIC register.
    (nonlocal_goto_receiver): Delete.

It should be config/i386/i386.md, not config/i386/i386.c.
Re: Fix ICE with thunks taking argument passed by reference
On 11/06/14 03:36, Jan Hubicka wrote:

Hi,
PR63573 is about an ICE when expanding a thunk call for a function taking as a parameter a structure passed by reference. This structure in fact contains only one integer and thus it is promoted to a register by argument setup in function.c (as an optimization). This is a sensible optimization, but when expanding the tailcall we need a memory location to store the value into, to pass it again by reference. Because we lost the original memory location we ICE, because we have the addressable flag set on a decl whose DECL_RTL is a register.

This patch fixes it up by reverting to the original memory location in calls.c. This is of course not safe in general because the register value may be changed, but the path is executed only when the THUNK flag of the call statement is set, and that is set only in thunks where the values are not updated. Ugly hack, but I can not think of a better way to fix the ugly hacks already in there to make thunk expansion happen in tailcall.

This fixed bootstrap on ppc64-linux and the set of testsuite failures matches the ones before the bug was introduced. OK?

    PR bootstrap/63573
    * calls.c (initialize_argument_information): When emitting thunk call use original memory placement of the argument.

Index: calls.c
===================================================================
--- calls.c (revision 216942)
+++ calls.c (working copy)
@@ -1210,6 +1211,15 @@ initialize_argument_information (int num
                   && TREE_CODE (base) != SSA_NAME
                   && (!DECL_P (base) || MEM_P (DECL_RTL (base)))))
             {
+              /* Argument setup code may have copied the value to register.  We
+                 that optimization now because the tail call code must use
+                 the original location.  */

Second sentence in comment doesn't parse :-) Figure you meant "We undo" or "We revert".

OK with comment fixed. And yes, this is clearly a hack.

Jeff