Re: [patch] support for multiarch systems
On 09/05/2012 19:19, Matthias Klose wrote: these are referenced from the http://wiki.debian.org/Multiarch/Tuples https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout http://err.no/debian/amd64-multiarch-3 http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases for multiarch, and why Debian thinks that the existing approaches are not sufficient (having name collisions for different architectures or ad hoc names for new architectures like libx32). That may be contentious within the Linux community, but I would like to avoid this kind of discussion here. I don't care about contentiousness, I just would like this to be documented somewhere (for example in the internals manual where MULTILIB_* is documented too). Paolo
Re: [C Patch]: pr52543
On 30/03/2012 12:08, Richard Sandiford wrote: + There are two useful preprocessor defines for use by maintainers: + + #define LOG_COSTS + + if you wish to see the actual cost estimates that are being used + for each mode wider than word mode and the cost estimates for zero + extension and the shifts. This can be useful when port maintainers + are tuning insn rtx costs. + + #define FORCE_LOWERING + + if you wish to test the pass with all the transformation forced on. + This can be useful for finding bugs in the transformations. Must admit I'm not keen on these kinds of macro, but it's Ian's call. Indeed, LOG_COSTS should be (dump_flags & TDF_DETAILS) != 0, and perhaps FORCE_LOWERING should be a -f flag (like -flower-all-subregs) or a --param. Paolo
Re: [C Patch]: pr52543
On 10/05/2012 08:45, Paolo Bonzini wrote: On 30/03/2012 12:08, Richard Sandiford wrote: + There are two useful preprocessor defines for use by maintainers: + + #define LOG_COSTS + + if you wish to see the actual cost estimates that are being used + for each mode wider than word mode and the cost estimates for zero + extension and the shifts. This can be useful when port maintainers + are tuning insn rtx costs. + + #define FORCE_LOWERING + + if you wish to test the pass with all the transformation forced on. + This can be useful for finding bugs in the transformations. Must admit I'm not keen on these kinds of macro, but it's Ian's call. Indeed, LOG_COSTS should be (dump_flags & TDF_DETAILS) != 0, and perhaps FORCE_LOWERING should be a -f flag (like -flower-all-subregs) or a --param. Not sure how this got sent a month after I wrote it (and decided not to send it). :) Paolo
Re: [PATCH] Optimize byte_from_pos, pos_from_bit
On Wed, 9 May 2012, Eric Botcazou wrote:

This optimizes byte_from_pos and pos_from_bit by noting that we operate on sizes whose computations have no intermediate (or final) overflow. This is the single patch necessary to get Ada to bootstrap and test with TYPE_IS_SIZETYPE removed. Rather than amending size_binop (my original plan) I chose to optimize the above two commonly used accessors. Conveniently normalize_offset can be re-written to use pos_from_bit instead of inlining it. I also took the liberty to document the functions (sic).

Nice, thanks. Could you add a blurb, in the head comment of the first function in which you operate under the no-overflow assumption, stating this fact and why this is necessary (an explicit mention of Ada isn't forbidden ;-), as well as a cross-reference to it in the head comment of the other function(s).

Like this?

Thanks,
Richard.

2012-05-10 Richard Guenther rguent...@suse.de

* stor-layout.c (byte_from_pos): Amend comment.

Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c (revision 187362)
+++ gcc/stor-layout.c (working copy)
@@ -798,7 +798,11 @@ bit_from_pos (tree offset, tree bitpos)
 }

 /* Return the combined truncated byte position for the byte offset OFFSET and
- the bit position BITPOS. */
+ the bit position BITPOS.
+ These functions operate on byte and bit positions as present in FIELD_DECLs
+ and it assumes that expressions result in no (intermediate) overflow.
+ This assumption is necessary to optimize these values as much as possible,
+ especially to make Ada happy. */
 tree
 byte_from_pos (tree offset, tree bitpos)
Re: [patch] Fix LTO regression in Ada
On Wed, May 9, 2012 at 10:38 PM, Eric Botcazou ebotca...@adacore.com wrote: Hi, this is a regression present on mainline and 4.7 branch. On the attached testcase, the compiler aborts in LTO mode with: eric@atlantis:~/build/gcc/native32 gcc/xgcc -Bgcc -S lto11.adb -O -flto +===GNAT BUG DETECTED==+ | 4.8.0 20120506 (experimental) [trunk revision 187216] (i586-suse-linux) | tree code 'call_expr' is not supported in LTO streams The problem is that the Ada compiler started to use DECL_ORIGINAL_TYPE in 4.7.x and the type in this field can have arbitrary expressions as TYPE_SIZE, for example expressions with CALL_EXPRs. Now the type is both not gimplified and streamed in LTO mode, so the CALL_EXPRs are sent to the streamer as-is. The immediate solution would be not to stream DECL_ORIGINAL_TYPE (and clear it in free_lang_data_in_decl), but this yields a regression in C++ with -flto -g (ICE in splice_child_die). Therefore, the patch implements the alternate solution of gimplifying DECL_ORIGINAL_TYPE. Bootstrapped/regtested on x86_64-suse-linux, OK for mainline and 4.7 branch? Hmm, but we will not possibly refer to the sizes therein, so emitting stmts for them looks pointless ... (and in fact the debug information would be odd, too ... what does dwarf2out.c do with these CALL_EXPRs when generating debug information without LTO?) Anyway, the idea is reasonable, but I'm not sure ending up with calls in those sizes makes sense (don't we make sure to inline them all at some point?) Thanks, Richard. 2012-05-09 Eric Botcazou ebotca...@adacore.com * gimplify.c (gimplify_decl_expr): For a TYPE_DECL, gimplify the DECL_ORIGINAL_TYPE if it is present. 2012-05-09 Eric Botcazou ebotca...@adacore.com * gnat.dg/lto11.ad[sb]: New test. -- Eric Botcazou
Re: [PATCH] Remove TYPE_IS_SIZETYPE
On Wed, 9 May 2012, Eric Botcazou wrote:

This removes the TYPE_IS_SIZETYPE macro and all its uses (by assuming it returns zero and applying trivial folding). Sizes and bitsizes can still be treated specially by means of knowing what the values represent and by means of using helper functions that assume you are dealing with sizes (in particular size_binop and friends and bit_from_pos, byte_from_pos or pos_from_bit).

Fine with me, if you add the blurb I talked about in the other reply.

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages including Ada with the patch optimizing byte_from_pos and pos_from_bit

Results on our internal testsuite are clean on x86-64 and almost clean on x86, an exception being:

package t is
  type x (m : natural) is record
    s : string (1 .. m);
    r : natural;
    b : boolean;
  end record;
  for x'alignment use 4;
  pragma Pack (x);
end t;

Without the patches, compiling the package with -gnatR3 yields:

Representation information for unit t (spec)
for x'Object_Size use 17179869248;
for x'Value_Size use ((#1 + 8) * 8) ;
for x'Alignment use 4;
for x use record
  m at 0 range 0 .. 30;
  s at 4 range 0 .. ((#1 * 8)) - 1;
  r at bit offset (((#1 + 4) * 8)) size in bits = 31
  b at bit offset ((((#1 + 7) * 8) + 7)) size in bits = 1
end record;

With the patches, this yields:

Representation information for unit t (spec)
for x'Object_Size use 17179869248;
for x'Value_Size use (((#1 + 7) + 1) * 8) ;
for x'Alignment use 4;
for x use record
  m at 0 range 0 .. 30;
  s at 4 range 0 .. ((#1 * 8)) - 1;
  r at bit offset (((#1 + 4) * 8)) size in bits = 31
  b at bit offset ((((#1 + 7) * 8) + 7)) size in bits = 1
end record;

so we have lost a simple folding for x'Value_Size (TYPE_ADA_SIZE field).

That's interesting. It is always safe to fold (x + 7) + 1 to (x + 8), independent of whether overflow is defined or not. So this looks like a genuine missed folding (I think that the combiner in tree-ssa-forwprop.c catches this). Or is the above not showing casts in the expression? Folding would not be valid for (unsigned)(signed X + 7) + 1.

2012-05-08 Richard Guenther rguent...@suse.de

ada/
* gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use.

OK, modulo the formatting:

Adjusted and applied.

Thanks,
Richard.

Index: trunk/gcc/ada/gcc-interface/cuintp.c
===
*** trunk.orig/gcc/ada/gcc-interface/cuintp.c 2011-04-11 17:01:30.0 +0200
--- trunk/gcc/ada/gcc-interface/cuintp.c 2012-05-07 16:43:43.497218058 +0200
*** UI_From_gnu (tree Input)
*** 178,186
  if (host_integerp (Input, 0))
    return UI_From_Int (TREE_INT_CST_LOW (Input));
  else if (TREE_INT_CST_HIGH (Input) < 0
! && TYPE_UNSIGNED (gnu_type)
! && !(TREE_CODE (gnu_type) == INTEGER_TYPE
! && TYPE_IS_SIZETYPE (gnu_type)))
    return No_Uint;
  #endif
--- 178,184
  if (host_integerp (Input, 0))
    return UI_From_Int (TREE_INT_CST_LOW (Input));
  else if (TREE_INT_CST_HIGH (Input) < 0
! && TYPE_UNSIGNED (gnu_type))
    return No_Uint;
  #endif

&& TYPE_UNSIGNED (gnu_type)) on the same line.
Re: [Patch / RFC] Improving more locations for binary expressions
On 10 May 2012 07:55, Miles Bader mi...@gnu.org wrote: Paolo Carlini paolo.carl...@oracle.com writes: in case my message ends up garbled, the carets do not point to (column 13), two times point to b (column 20), which is obviously wrong. In other terms, all the columns are 20, all wrong. The new caret support does seem to have revealed a bunch of places where the column info in error messages was pretty screwy... I guess nobody paid much attention to it before... :] Should these get reported as bugzilla bugs...? In principle, yes. In practice, there are already so many known issues that adding more would just waste contributors' time doing bugzilla administration. So help would be very much appreciated. In particular, the C FE and the preprocessor are in much worse shape in terms of locations than the C++ FE. Some issues may be hard but many of them are a matter of setting a breakpoint at the error, going up the frame, and figuring out where the correct location could be got from. Then passing it down to the error so it can use the correct location. If you can figure that out but can't/won't write a patch, then please open a PR. Cheers, Manuel.
Re: [PATCH libcpp]: Avoid crash in interpret_float_suffix
On May 8, 2012, at 5:39 PM, Tom Tromey wrote: Tristan == Tristan Gingold ging...@adacore.com writes: Tristan 2012-05-04 Tristan Gingold ging...@adacore.com Tristan * expr.c (interpret_float_suffix): Add a guard. Ok. Thanks, now committed.
Re: Bug 53289 - unnecessary repetition of caret diagnostics
On Wed, May 9, 2012 at 11:02 PM, Manuel López-Ibáñez lopeziba...@gmail.com wrote:

Simple enough. Bootstrapped and regression tested. The output for the example in the PR is now:

/home/manuel/caret-overload.C:6:6: error: no matching function for call to ‘g(int)’
 g(1);
 ^
/home/manuel/caret-overload.C:6:6: note: candidate is:
/home/manuel/caret-overload.C:2:18: note: template<class T> typename T::type g(T)
 typename T::type g(T);
 ^

Does it make sense to print a caret here? We are dumping a function decl(?), thus already constraining what we print to exactly what is important. So - maybe simply never emit a caret for %D locations?

/home/manuel/caret-overload.C:2:18: note: template argument deduction/substitution failed:
/home/manuel/caret-overload.C: In substitution of ‘template<class T> typename T::type g(T) [with T = int]’:
/home/manuel/caret-overload.C:6:6: required from here
/home/manuel/caret-overload.C:2:18: error: ‘int’ is not a class, struct, or union type

OK?

2012-05-09 Manuel López-Ibáñez m...@gcc.gnu.org

PR c++/53289
gcc/
* diagnostic.h (diagnostic_context): Add last_location.
* diagnostic.c (diagnostic_initialize): Initialize it.
(diagnostic_show_locus): Use it.
Re: [C++ Patch] fix semi-random template specialization ICE
Alexandre Oliva aol...@redhat.com wrote:

[...] Anyway, the problem is that, for some unsuitable candidate template specializations, tsubst returns error_mark_node, which tsubst_decl stores in argvec, and later on register_specialization gets this error_mark_node and tries to access it as a tree_vec. The trivial patch that avoids the misbehavior is returning error_mark_node as soon as we get that for argvec. Bootstrapped on i686-pc-linux-gnu and x86_64-linux-gnu, regstrapped on the latter. Ok to install?

FYI, this has been reported as http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53209, so you might add a reference to that bug in the ChangeLog. Other than that, I cannot approve or reject this patch, but FWIW, it looks fine to me. Let's CC jason.

for gcc/cp/ChangeLog
from Alexandre Oliva aol...@redhat.com

* pt.c (tsubst_decl): Bail out if argvec is error_mark_node.

Index: gcc/cp/pt.c
===
--- gcc/cp/pt.c.orig 2012-04-30 15:34:44.018432544 -0300
+++ gcc/cp/pt.c 2012-04-30 15:34:47.988375071 -0300
@@ -10626,6 +10626,8 @@ tsubst_decl (tree t, tree args, tsubst_f
 tmpl = DECL_TI_TEMPLATE (t);
 gen_tmpl = most_general_template (tmpl);
 argvec = tsubst (DECL_TI_ARGS (t), args, complain, in_decl);
+ if (argvec == error_mark_node)
+ RETURN (error_mark_node);
 hash = hash_tmpl_and_args (gen_tmpl, argvec);
 spec = retrieve_specialization (gen_tmpl, argvec, hash);
 }

Thanks.

-- Dodji
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com wrote: Bummer. I was thinking to reserve '=' for selective dumping: -fdump-tree-pre=func_list_regexp I guess this can be achieved via @ -fdump-tree-pre@func_list -fdump-tree-pre=file_name@func_list Another issue -- I don't think the current precedence rule is correct. Consider that -fopt-info=2 will be mapped to -fdump-tree-all-transform-verbose2=stderr -fdump-rtl-all-transform-verbose2=stderr then the current precedence rule will cause surprise when the following is used -fopt-info -fdump-tree-pre The PRE dump will be emitted to stderr which is not what user wants. In short, special streams should be treated as 'weak' the same way as your previous implementation. Hm, this raises a similar concern I have with the -fvectorizer-verbose flag. With -fopt-info -fdump-tree-pre I do not want some information to be present only on stderr or in the dump file! I want it in _both_ places! (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less information :() Thus, the information where dumping goes has to be done differently (which is why I asked for some re-org originally, so that passes no longer explicitly reference dump_file - dump_file may be different for different kind of information it dumps!). Passes should, instead of fprintf (dump_file, ..., ...) do dump_printf (TDF_scev, ..., ...) thus, specify the kind of information they dump (would be mostly TDF_details vs. 0 today I guess). The dump_printf routine would then properly direct to one or more places to dump at. I realize this needs some more dispatchers for dumping expressions and statements (but it should not be too many). Dumping to dump_file would in any case dump to the pass's private dump file only (unqualified stuff would never be useful for -fopt-info). The perfect candidate to convert to this kind of scheme is obviously the vectorizer with its existing -fvectorizer-verbose. 
If the patch doesn't work towards this kind of end-result I'd rather not have it. Thanks, Richard. thanks, David On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote: Thanks for your suggestions/comments. I have updated the patch and documentation. It supports the following usage: gcc -fdump-tree-all=tree.dump -fdump-tree-pre=stdout -fdump-rtl-ira=ira.dump Here all tree dumps except the PRE are output into tree.dump, PRE dump goes to stdout and the IRA dump goes to ira.dump. Thanks, Sharad 2012-05-09 Sharad Singhai sing...@google.com * doc/invoke.texi: Add documentation for the new option. * tree-dump.c (dump_get_standard_stream): New function. (dump_files): Update for new field. (dump_switch_p_1): Handle dump filenames. (dump_begin): Likewise. (get_dump_file_name): Likewise. (dump_end): Remove attribute. (dump_enable_all): Add new parameter FILENAME. All callers updated. (enable_rtl_dump_file): * tree-pass.h (enum tree_dump_index): Add new constant. (struct dump_file_info): Add new field FILENAME. * testsuite/g++.dg/other/dump-filename-1.C: New test. Index: doc/invoke.texi === --- doc/invoke.texi (revision 187265) +++ doc/invoke.texi (working copy) @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio @item -d@var{letters} @itemx -fdump-rtl-@var{pass} +@itemx -fdump-rtl-@var{pass}=@var{filename} @opindex d Says to make debugging dumps during compilation at times specified by @var{letters}. This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the @var{dumpname}, and the files are -created in the directory of the output file. Note that the pass -number is computed statically as passes get registered into the pass -manager. Thus the numbering is not related to the dynamic order of -execution of passes. In particular, a pass installed by a plugin -could have a number over 200 even if it executed quite early. 
-@var{dumpname} is generated from the name of the output file, if -explicitly specified and it is not an executable, otherwise it is the -basename of the source file. These switches may have different effects -when @option{-E} is used for preprocessing. +created in the directory of the output file. If the +@option{=@var{filename}} is appended to the longer form of the dump +option then the dump is done on that file instead of numbered +files. Note that the pass number is computed statically as passes get +registered into the pass manager. Thus the numbering is not related +to the dynamic order of execution of passes. In particular, a pass +installed by a plugin could have a number over 200 even if it executed +quite early. @var{dumpname} is generated from the name
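The dump_printf idea proposed in the reply above (passes name the kind of information, and a dispatcher sends it to every interested destination) could look roughly like the following sketch. Everything here is hypothetical: `sketch_dump_printf`, `add_sink`, and the `SK_*` kinds are made-up names for illustration, not GCC's API, and the real design may well differ.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical kinds of dump information, loosely modelled on the
   TDF_* flags mentioned in the thread.  Not GCC's real definitions.  */
enum { SK_DETAILS = 1 << 0, SK_SCEV = 1 << 1 };

#define MAX_SINKS 4

/* One destination plus the set of kinds it wants to receive.  */
struct sink
{
  FILE *stream;
  int kinds;
};

static struct sink sinks[MAX_SINKS];
static int n_sinks;

/* Register STREAM as a destination for the given KINDS.
   Returns 0 on success, -1 if the table is full or STREAM is NULL.  */
static int
add_sink (FILE *stream, int kinds)
{
  if (n_sinks == MAX_SINKS || stream == NULL)
    return -1;
  sinks[n_sinks].stream = stream;
  sinks[n_sinks].kinds = kinds;
  n_sinks++;
  return 0;
}

/* Print the message to every sink interested in KIND, so the same
   information can land in both a dump file and stderr.
   Returns the number of sinks written to.  */
static int
sketch_dump_printf (int kind, const char *fmt, ...)
{
  int written = 0;
  for (int i = 0; i < n_sinks; i++)
    if (sinks[i].kinds & kind)
      {
        va_list ap;
        va_start (ap, fmt);
        vfprintf (sinks[i].stream, fmt, ap);
        va_end (ap);
        written++;
      }
  return written;
}
```

With this shape a pass never touches a `FILE *` directly, which is the property the reply asks for: the routing decision lives in one place.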
Re: [PATCH] Add option for dumping to stderr (issue6190057)
Okay, I have restored the original behavior where standard streams were considered weak. Thus in case of a conflict, the standard streams have lower precedence. For example, gcc -O2 -fdump-tree-pre=stdout -fdump-tree-pre ... does the PRE dump in auto numbered file since stdout has lower precedence. Also this works as expected, gcc -O2 -fdump-tree-pre=pre.txt -fdump-tree-all=stderr ... It outputs PRE dump to pre.txt while the remaining tree dumps are output on to stderr. Does it look okay? Thanks, Sharad 2012-05-09 Sharad Singhai sing...@google.com * doc/invoke.texi: Add documentation for the new option. * tree-dump.c (dump_get_standard_stream): New function. (dump_files): Update for new field. (dump_switch_p_1): Handle dump filenames. (dump_begin): Likewise. (get_dump_file_name): Likewise. (dump_end): Remove attribute. (dump_enable_all): Add new parameter FILENAME. All callers updated. * tree-pass.h (enum tree_dump_index): Add new constant. (struct dump_file_info): Add new field FILENAME. * testsuite/g++.dg/other/dump-filename-1.C: New test. Index: doc/invoke.texi === --- doc/invoke.texi (revision 187265) +++ doc/invoke.texi (working copy) @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio @item -d@var{letters} @itemx -fdump-rtl-@var{pass} +@itemx -fdump-rtl-@var{pass}=@var{filename} @opindex d Says to make debugging dumps during compilation at times specified by @var{letters}. This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the @var{dumpname}, and the files are -created in the directory of the output file. Note that the pass -number is computed statically as passes get registered into the pass -manager. Thus the numbering is not related to the dynamic order of -execution of passes. In particular, a pass installed by a plugin -could have a number over 200 even if it executed quite early. 
-@var{dumpname} is generated from the name of the output file, if -explicitly specified and it is not an executable, otherwise it is the -basename of the source file. These switches may have different effects -when @option{-E} is used for preprocessing. +created in the directory of the output file. If the +@option{=@var{filename}} is appended to the longer form of the dump +option then the dump is done on that file instead of numbered +files. Note that the pass number is computed statically as passes get +registered into the pass manager. Thus the numbering is not related +to the dynamic order of execution of passes. In particular, a pass +installed by a plugin could have a number over 200 even if it executed +quite early. @var{dumpname} is generated from the name of the output +file, if explicitly specified and it is not an executable, otherwise +it is the basename of the source file. These switches may have +different effects when @option{-E} is used for preprocessing. Debug dumps can be enabled with a @option{-fdump-rtl} switch or some @option{-d} option @var{letters}. Here are the possible @@ -5719,15 +5722,18 @@ counters for each function compiled. @item -fdump-tree-@var{switch} @itemx -fdump-tree-@var{switch}-@var{options} +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename} @opindex fdump-tree Control the dumping at various stages of processing the intermediate language tree to a file. The file name is generated by appending a switch specific suffix to the source file name, and the file is -created in the same directory as the output file. If the -@samp{-@var{options}} form is used, @var{options} is a list of -@samp{-} separated options which control the details of the dump. Not -all options are applicable to all dumps; those that are not -meaningful are ignored. The following options are available +created in the same directory as the output file. In case of +@option{=@var{filename}} option, the dump is output on the given file +name instead. 
If the @samp{-@var{options}} form is used, +@var{options} is a list of @samp{-} separated options which control +the details or location of the dump. Not all options are applicable +to all dumps; those that are not meaningful are ignored. The +following options are available @table @samp @item address @@ -5765,9 +5771,49 @@ Enable showing the tree dump for each statement. Enable showing the EH region number holding each statement. @item scev Enable showing scalar evolution analysis details. +@item slim +Inhibit dumping of members of a scope or body of a function merely +because that scope has been reached. Only dump such items when they +are directly reachable by some other path. When dumping pretty-printed +trees, this option inhibits dumping the bodies of control structures. +@item
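The "weak standard streams" precedence described at the top of this message can be sketched as below. This is only an illustration of the rule as stated in the two examples, not the code in tree-dump.c; the names `apply_request`, `dump_dest`, and `DEST_*` are hypothetical, and the choice that the last non-stream request wins is an assumption the message does not spell out.

```c
#include <string.h>

/* Where a single dump ends up.  Hypothetical names for illustration.  */
enum dest_kind { DEST_NONE, DEST_STREAM, DEST_NUMBERED, DEST_FILE };

struct dump_dest
{
  enum dest_kind kind;
  const char *name;   /* "stdout"/"stderr", a file name, or NULL */
};

static int
is_standard_stream (const char *name)
{
  return name != NULL
         && (strcmp (name, "stdout") == 0 || strcmp (name, "stderr") == 0);
}

/* Apply one dump request to D.  FILENAME is the text after '=' or
   NULL for a plain option such as -fdump-tree-pre (auto-numbered file).
   Standard streams are weak: they never override a file destination.
   Assumption: among non-stream requests, the last one wins.  */
static void
apply_request (struct dump_dest *d, const char *filename)
{
  if (is_standard_stream (filename))
    {
      if (d->kind == DEST_NONE || d->kind == DEST_STREAM)
        {
          d->kind = DEST_STREAM;
          d->name = filename;
        }
      return;
    }
  d->kind = filename != NULL ? DEST_FILE : DEST_NUMBERED;
  d->name = filename;
}
```

Applying "stdout" and then a plain request leaves the dump in the auto-numbered file, and applying "pre.txt" followed by a blanket "stderr" keeps pre.txt, matching the two command lines given in the message.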
[MIPS] Fix misspelled macro in t-vxworks
Hello,

This patch fixes the misspelled macro in t-vxworks. Is it OK?

2012-05-10 Mingjie Xing mingjie.x...@gmail.com

* config/mips/t-vxworks: Change MUTLILIB_EXTRA_OPTS to MULTILIB_EXTRA_OPTS.

Index: config/mips/t-vxworks
===
--- config/mips/t-vxworks (revision 187364)
+++ config/mips/t-vxworks (working copy)
@@ -32,4 +32,4 @@ MULTILIB_EXCEPTIONS = mips3* mabi=o64 fP
 $(addprefix mabi=o64/, EL* msoft-float* mrtp* fPIC*) \
 $(addsuffix /fPIC, *mabi=o64 *mips3 *EL *msoft-float)

-MUTLILIB_EXTRA_OPTS = -G 0 -mno-branch-likely
+MULTILIB_EXTRA_OPTS = -G 0 -mno-branch-likely

Thanks,
Mingjie
Re: [libgcc] Use i386-cpuinfo.c on all i386 targets
Paolo Bonzini bonz...@gnu.org writes: 2012-04-26 Rainer Orth r...@cebitec.uni-bielefeld.de libgcc: * config.host (i[34567]86-*-linux*, x86_64-*-linux*) (i[34567]86-*-kfreebsd*-gnu, x86_64-*-kfreebsd*-gnu) (i[34567]86-*-knetbsd*-gnu, i[34567]86-*-gnu*): Move i386/t-cpuinfo ... (i[34567]86-*-*, x86_64-*-*): ... here. * config/i386/libgcc-bsd.ver (GCC_4.8.0): New version. * config/i386/libgcc-sol2.ver (GCC_4.8.0): New version. * config/i386/i386-cpuinfo.c: Rename to ... * config/i386/cpuinfo.c: ... this. * config/i386/t-cpuinfo (LIB2ADD): Reflect this. * configure.ac (AC_CONFIG_HEADER): Call for auto-target.h. (libgcc_cv_init_priority): New test. * configure: Regenerate. * config.in: New file. * Makefile.in (clean): Rename config.h to auto-target.h. (config.h): Likewise. (stamp-h): Likewise. * config/i386/cpuinfo.c (auto-target.h): Include. (CONSTRUCTOR_PRIORITY): Define. (__cpu_indicator_init): Use it. gcc * config/i386/i386.c: Update comments for i386-cpuinfo.c name change. Looks good. Given that there were no further comments, I've committed the patch with the following doc snippet added, after bootstrapping on i386-pc-solaris2.10 with as and x86_64-unknown-linux-gnu. Thanks. Rainer * doc/extend.texi (X86 Built-in Functions, __builtin_cpu_init): Document requirement to call in constructors. diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -9432,8 +9432,9 @@ executed before any constructors are cal automatically executed in a very high priority constructor. For example, this function has to be used in @code{ifunc} resolvers which -check for CPU type using the builtins, @code{__builtin_cpu_is} -and @code{__builtin_cpu_supports}. +check for CPU type using the builtins @code{__builtin_cpu_is} +and @code{__builtin_cpu_supports}, or in constructors on targets which +don't support constructor priority. 
@smallexample static void (*resolve_memcpy (void)) (void) -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[libatomic] Always compile atomic builtin tests with $XCFLAGS (PR other/53284)
As described in the PR, several 32-bit libatomic tests FAIL on Solaris/x86 with infinite recursion e.g. in __atomic_compare_exchange_8. It turns out that this happens because, unlike on glibc targets, the atomic builtin configure tests are run as compile tests, but are currently not compiled with $XCFLAGS, unlike the real code. The following patch fixes this, tested on i386-pc-solaris2.10 and x86_64-unknown-linux-gnu, approved by rth in the PR, installed on mainline.

Rainer

2012-05-09 Rainer Orth r...@cebitec.uni-bielefeld.de

PR other/53284
* acinclude.m4 (LIBAT_TEST_ATOMIC_BUILTIN): Add -O0 -S to CFLAGS instead of overriding.
* configure: Regenerate.

# HG changeset patch
# Parent 6c6136d0a9792bfd3fe9600f7867d5edf5c9b114
Always compile atomic builtin tests with $XCFLAGS

diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -67,7 +67,7 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[
 else
 old_CFLAGS=$CFLAGS
 # Compile unoptimized.
- CFLAGS='-O0 -S'
+ CFLAGS="$CFLAGS -O0 -S"
 if AC_TRY_EVAL(ac_compile); then
 if grep __atomic_ conftest.s >/dev/null 2>&1 ; then
 eval $2=no

-- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Remove TYPE_IS_SIZETYPE
On Thu, 10 May 2012, Richard Guenther wrote:

On Wed, 9 May 2012, Eric Botcazou wrote:

This removes the TYPE_IS_SIZETYPE macro and all its uses (by assuming it returns zero and applying trivial folding). Sizes and bitsizes can still be treated specially by means of knowing what the values represent and by means of using helper functions that assume you are dealing with sizes (in particular size_binop and friends and bit_from_pos, byte_from_pos or pos_from_bit).

Fine with me, if you add the blurb I talked about in the other reply.

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages including Ada with the patch optimizing byte_from_pos and pos_from_bit

Results on our internal testsuite are clean on x86-64 and almost clean on x86, an exception being:

package t is
  type x (m : natural) is record
    s : string (1 .. m);
    r : natural;
    b : boolean;
  end record;
  for x'alignment use 4;
  pragma Pack (x);
end t;

Without the patches, compiling the package with -gnatR3 yields:

Representation information for unit t (spec)
for x'Object_Size use 17179869248;
for x'Value_Size use ((#1 + 8) * 8) ;
for x'Alignment use 4;
for x use record
  m at 0 range 0 .. 30;
  s at 4 range 0 .. ((#1 * 8)) - 1;
  r at bit offset (((#1 + 4) * 8)) size in bits = 31
  b at bit offset ((((#1 + 7) * 8) + 7)) size in bits = 1
end record;

With the patches, this yields:

Representation information for unit t (spec)
for x'Object_Size use 17179869248;
for x'Value_Size use (((#1 + 7) + 1) * 8) ;
for x'Alignment use 4;
for x use record
  m at 0 range 0 .. 30;
  s at 4 range 0 .. ((#1 * 8)) - 1;
  r at bit offset (((#1 + 4) * 8)) size in bits = 31
  b at bit offset ((((#1 + 7) * 8) + 7)) size in bits = 1
end record;

so we have lost a simple folding for x'Value_Size (TYPE_ADA_SIZE field).

That's interesting. It is always safe to fold (x + 7) + 1 to (x + 8), independent of whether overflow is defined or not. So this looks like a genuine missed folding (I think that the combiner in tree-ssa-forwprop.c catches this). Or is the above not showing casts in the expression? Folding would not be valid for (unsigned)(signed X + 7) + 1.

As far as I can see this happens when we fold

(bitsizetype) (#1 + 7) * 8 + 7 PLUS_EXPR 1

which we fold to

((bitsizetype) (#1 + 7) + 1) * 8

The #1 + 7 expression is computed in sizetype (which is now unsigned and thus has defined overflow - thus we cannot optimize the widening to bitsizetype). Equivalent C testcase:

unsigned long long foo (unsigned int x)
{
  return ((unsigned long long)(x + 7)) + 1;
}

As I previously suggested we can put in special knowledge into size_binop, or maybe better, provide abstraction for conversion of sizetype to bitsizetype that would associate the type conversions. The original plan was of course to at some point have PLUSNV_EXPR so we can explicitly mark #1 + 7 as not overflowing. It might be that introducing those just for size expressions right now (and then dropping them down to regular PLUS_EXPRs during gimplification) might be something to explore for 4.8.

Richard.

2012-05-08 Richard Guenther rguent...@suse.de

ada/
* gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use.

OK, modulo the formatting:

Adjusted and applied.

Thanks,
Richard.

Index: trunk/gcc/ada/gcc-interface/cuintp.c
===
*** trunk.orig/gcc/ada/gcc-interface/cuintp.c 2011-04-11 17:01:30.0 +0200
--- trunk/gcc/ada/gcc-interface/cuintp.c 2012-05-07 16:43:43.497218058 +0200
*** UI_From_gnu (tree Input)
*** 178,186
  if (host_integerp (Input, 0))
    return UI_From_Int (TREE_INT_CST_LOW (Input));
  else if (TREE_INT_CST_HIGH (Input) < 0
! && TYPE_UNSIGNED (gnu_type)
! && !(TREE_CODE (gnu_type) == INTEGER_TYPE
! && TYPE_IS_SIZETYPE (gnu_type)))
    return No_Uint;
  #endif
--- 178,184
  if (host_integerp (Input, 0))
    return UI_From_Int (TREE_INT_CST_LOW (Input));
  else if (TREE_INT_CST_HIGH (Input) < 0
! && TYPE_UNSIGNED (gnu_type))
    return No_Uint;
  #endif

&& TYPE_UNSIGNED (gnu_type)) on the same line.

-- Richard Guenther rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
Re: [PATCH] Optimize byte_from_pos, pos_from_bit
Like this? Let's be a bit more factual. :-) /* Return the combined truncated byte position for the byte offset OFFSET and the bit position BITPOS. These functions operate on byte and bit positions present in FIELD_DECLs and assume that these expressions result in no (intermediate) overflow. This assumption is necessary to fold the expressions as much as possible, so as to avoid creating artificially variable-sized types in languages supporting variable-sized types like Ada. */ -- Eric Botcazou
Re: [patch] Fix LTO regression in Ada
Hmm, but we will not possibly refer to the sizes therein, so emitting stmts for them looks pointless ... (and in fact the debug information would be odd, too ... what does dwarf2out.c do with these CALL_EXPRs when generating debug information without LTO?) All variable-sized types currently get size -1, but this would change with: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html Anyway, the idea is reasonable, but I'm not sure ending up with calls in those sizes makes sense (don't we make sure to inline them all at some point?) These types are like any other types in Ada. Either DECL_ORIGINAL_TYPE is purely for debug info and then we shouldn't stream it for LTO or it isn't and it needs to be gimplified. My personal inclination would be for the former, but this is apparently problematic for C++. I can add a ??? comment though. -- Eric Botcazou
Re: [PATCH] Optimize byte_from_pos, pos_from_bit
On Thu, 10 May 2012, Eric Botcazou wrote: Like this? Let's be a bit more factual. :-) /* Return the combined truncated byte position for the byte offset OFFSET and the bit position BITPOS. These functions operate on byte and bit positions present in FIELD_DECLs and assume that these expressions result in no (intermediate) overflow. This assumption is necessary to fold the expressions as much as possible, so as to avoid creating artificially variable-sized types in languages supporting variable-sized types like Ada. */ Works for me. Applied that way. Thanks, Richard.
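As a rough illustration of what the comment describes, the two position helpers can be modelled on plain integers. This is a sketch only: the real functions operate on trees through size_binop, and the `_sketch` names and the 8-bit unit are assumptions made for the example. Like the real functions, it assumes the inputs cause no intermediate overflow.

```c
/* Assumed number of bits per byte for this sketch.  */
#define BITS_PER_UNIT_SKETCH 8ULL

/* Combined bit position: bit position plus byte offset scaled to bits.  */
static unsigned long long
bit_from_pos_sketch(unsigned long long offset, unsigned long long bitpos)
{
    return bitpos + offset * BITS_PER_UNIT_SKETCH;
}

/* Combined truncated byte position: byte offset plus the whole bytes
   contained in the bit position (the remainder bits are dropped).  */
static unsigned long long
byte_from_pos_sketch(unsigned long long offset, unsigned long long bitpos)
{
    return offset + bitpos / BITS_PER_UNIT_SKETCH;
}
```

With a byte offset of 4 and a bit position of 19, the combined bit position is 51 and the truncated byte position is 6.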
Re: [PATCH] Remove TYPE_IS_SIZETYPE
As far as I can see this happens when we fold

(bitsizetype) (#1 + 7) * 8 + 7 PLUS_EXPR 1

which we fold to

((bitsizetype) (#1 + 7) + 1) * 8

The #1 + 7 expression is computed in sizetype (which is now unsigned and thus has defined overflow - thus we cannot optimize the widening to bitsizetype).

I see, thanks for the investigation.

As I previously suggested we can put in special knowledge into size_binop, or maybe better, provide abstraction for conversion of sizetype to bitsizetype that would associate the type conversions. The original plan was of course to at some point have PLUSNV_EXPR so we can explicitly mark #1 + 7 as not overflowing. It might be that introducing those just for size expressions right now (and then dropping them down to regular PLUS_EXPRs during gimplification) might be something to explore for 4.8.

OK, I'll think about it. No objections by me to going ahead with the patches.

-- Eric Botcazou
Re: [patch] Fix LTO regression in Ada
On Thu, May 10, 2012 at 12:12 PM, Eric Botcazou ebotca...@adacore.com wrote: Hmm, but we will not possibly refer to the sizes therein, so emitting stmts for them looks pointless ... (and in fact the debug information would be odd, too ... what does dwarf2out.c do with these CALL_EXPRs when generating debug information without LTO?) All variable-sized types currently get size -1, but this would change with: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html Anyway, the idea is reasonable, but I'm not sure ending up with calls in those sizes makes sense (don't we make sure to inline them all at some point?) These types are like any other types in Ada. Either DECL_ORIGINAL_TYPE is purely for debug info and then we shouldn't stream it for LTO or it isn't and it needs to be gimplified. My personal inclination would be for the former, but this is apparently problematic for C++. I can add a ??? comment though. Well, we need to stream it for LTO because at the moment LTO is still responsible for emitting debug information ... (that would change with the early debug info plan). Ok with a ??? comment. Thanks, Richard. -- Eric Botcazou
Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin
On 9 May 2012 11:18, Christophe Lyon christophe.l...@st.com wrote: Hello, On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins currently fails to load the second vector element. Thanks for the patch but this is not acceptable as it stands today. You need to set the length attributes in this case to 8 for the appropriate alternative at the very least. You also don't mention how this patch was tested. Alternatively it might be worth splitting the vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg) 0 ) followed by a subreg to subreg move which should end up having the same effect . That splitting would allow for better instruction scheduling. In addition it would be nice to have a testcase in gcc.target/arm . As a follow up patch I'd like these patterns merged with the vdup_n patterns in neon.md (allowing them to grow a memory operand variant) which should then allow merging of (I think) scalarval = scalar_load () vreg = vdup ( scalarval) into vreg = vld1_dup_n ( scalar_address). Thanks, Ramana
Missing guard in ira-color.c ?
Hi,

I am getting a segfault in ira-color.c:2945 on the trunk:

Program received signal SIGSEGV, Segmentation fault.
0x00a79f37 in move_spill_restore () at ../../src/gcc/ira-color.c:2945
2945                  || ira_reg_equiv_const[regno] != NULL_RTX
(gdb) l
2940                  /* don't do the optimization because it can create
2941                     copies and the reload pass can spill the allocno set
2942                     by copy although the allocno will not get memory
2943                     slot.  */
2944                  || ira_reg_equiv_invariant_p[regno]
2945                  || ira_reg_equiv_const[regno] != NULL_RTX
2946                  || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
2947                continue;
2948              mode = ALLOCNO_MODE (a);
2949              rclass = ALLOCNO_CLASS (a);

while building gcc (gnatcmd.adb file) for ia64-vms using a cross compiler (target=ia64-vms, host=x86_64-linux).

The reason looks to be an out of bounds access:

(gdb) print regno
$10 = 18476
(gdb) print ira_reg_equiv_len
$11 = 17984

(I suppose this setup is not easy at all to reproduce, but I can provide any files, if necessary).

Wild guess, as I don't know IRA at all: looks like in this file most accesses to ira_reg_equiv_* are guarded. Is it expected that they aren't at this point ?

[I am currently trying with the following chunk:

--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -2941,8 +2941,9 @@ move_spill_restore (void)
                 copies and the reload pass can spill the allocno set
                 by copy although the allocno will not get memory
                 slot.  */
-             || ira_reg_equiv_invariant_p[regno]
-             || ira_reg_equiv_const[regno] != NULL_RTX
+             || (regno < ira_reg_equiv_len
+                 && (ira_reg_equiv_invariant_p[regno]
+                     || ira_reg_equiv_const[regno] != NULL_RTX))
              || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
            continue;
          mode = ALLOCNO_MODE (a);
]

Thanks for any comment,
Tristan.
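The kind of guard being tried in the chunk above can be sketched outside of GCC as a bounds-checked lookup. Everything here is hypothetical and only mirrors the shape of the report: `struct equiv_table` and `has_equivalence` are illustration names, not IRA data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-register equivalence arrays.
   Both arrays hold LEN entries.  */
struct equiv_table {
    int len;
    const bool *invariant_p;
    const char *const *constant;
};

/* Return true if REGNO has a recorded equivalence.  The index is checked
   against the table length first, so a register number allocated after the
   table was built is treated as "no equivalence" instead of being read
   out of bounds.  */
static bool has_equivalence(const struct equiv_table *table, int regno)
{
    return regno >= 0
           && regno < table->len
           && (table->invariant_p[regno] || table->constant[regno] != NULL);
}
```

With a table of length 17984, a query for register 18476 (the values from the gdb session) would return false here rather than fault. Whether skipping such registers is the right behaviour for IRA is exactly the question the message asks.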
Re: [C++ Patch] fix semi-random template specialization ICE
OK. Jason
Re: [Patch / RFC] Improving more locations for binary expressions
Looks good. Jason
Re: [C++ Patch] PR 53158 (EXPR_LOC_OR_HERE version)
On 05/09/2012 02:47 PM, Paolo Carlini wrote: + error_at(loc, cannot bind bitfield %qE to %qT, Missing a space. OK with that change. Jason
Re: [C++ Patch] PR 53301
Hi, On 05/09/2012 07:12 PM, Paolo Carlini wrote: shame on me. I think the patch almost qualifies as obvious. I think it does. OK. Good, later today I'll commit it (branch too). Was thinking: would it make sense to have a predicate for 'any' pointer type? I see tens of such || around and I bet I would not have typoed it here... If you agree, please pick a name and I will do the work ;) Paolo
Re: [C++ Patch] PR 53301
On 05/10/2012 10:52 AM, Paolo Carlini wrote: Was thinking: would it make sense to have a predicate for 'any' pointer type? Something like TYPE_PTR_OR_PTRMEM_P would be fine. Hmm, I see that TYPE_PTRMEM_P only means pointer to data member, that's unfortunate; the name doesn't make that clear. Jason
Re: [C++ Patch] PR 53301
Hi, On 05/10/2012 10:52 AM, Paolo Carlini wrote: Was thinking: would it make sense to have a predicate for 'any' pointer type? Something like TYPE_PTR_OR_PTRMEM_P would be fine. Good. Hmm, I see that TYPE_PTRMEM_P only means pointer to data member, that's unfortunate; the name doesn't make that clear. So, let's have a plan about the names of such predicates and I'll implement it as soon as possible. Paolo
Re: [h8300] increase dwarf address size
On 05/09/2012 06:27 PM, DJ Delorie wrote: H8/300 cpus have a larger-than-64k address space, despite 16-bit pointers. OK to apply? Ok for 4.7 branch? See also http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48231 * config/h8300/h8300.h (DWARF2_ADDR_SIZE): Define as 4 bytes. My recollection was that the H8/300 only had a 64k address space and that the larger address spaces showed up in later processors (H8/300H). Regardless, shouldn't DWARF2_ADDR_SIZE be POINTER_SIZE / BITS_PER_UNIT? That'll give the larger DWARF2_ADDR_SIZE on the modern widgets, but still do the right thing for the ancient H8/300. My other relevant recollection was that we don't support C++ on the H8/300 series. jeff
Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin
On 10.05.2012 13:41, Ramana Radhakrishnan wrote: On 9 May 2012 11:18, Christophe Lyonchristophe.l...@st.com wrote: Hello, On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins currently fails to load the second vector element. Thanks for the patch but this is not acceptable as it stands today. You need to set the length attributes in this case to 8 for the appropriate alternative at the very least. OK I'll look at this. You also don't mention how this patch was tested. I used the testsuite I developed some time ago to test all the Neon builtins, which I posted last year on the qemu mailing-list. With the current GCCs, this bug is the only remaining one I could detect. Alternatively it might be worth splitting the vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg) 0 ) followed by a subreg to subreg move which should end up having the same effect . That splitting would allow for better instruction scheduling. Are you aware of examples of similar cases I could use as a model? In addition it would be nice to have a testcase in gcc.target/arm . Well. Prior to sending my patch I did look at that directory, but I supposed that such a test ought to belong to the neon/ subdir where the tests are described as autogenerated. Any doc on how to do that? Thanks, Christophe.
Re: [C++ Patch] PR 53301
On 10 May 2012 17:02, Paolo Carlini pcarl...@gmail.com wrote: Hi, On 05/10/2012 10:52 AM, Paolo Carlini wrote: Was thinking: would it make sense to have a predicate for 'any' pointer type? Something like TYPE_PTR_OR_PTRMEM_P would be fine. Good. Hmm, I see that TYPE_PTRMEM_P only means pointer to data member, that's unfortunate; the name doesn't make that clear. So, let's have a plan about the names of such predicates and I'll implement it as soon as possible. Yes, please. It feels as if the names are based more on the underlying implementation of the macro than on anything else. Also, short names are nice, but using MEM instead of MEMBER is a bit too short. The same for OB for object and others. PTR_OR_PTRMEM sounds to me like pointer or pointer to member, which sounds redundant since a pointer to member is a pointer already. And there is also TYPE_PTRMEM_P and TYPE_PTR_TO_MEMBER_P. From the names it is not clear what is the difference. This could be TYPE_PTR_TO_DATA_MEMBER and TYPE_PTR_TO_ANY_MEMBER. The few extra chars help a lot to clarify the meaning. Also tree.h already has POINTER_TYPE_P, what is the difference? There are a few other such accessors where the names seem to match with other accessors from cp-tree.h, but the implementations are a bit different. And both forms are used in cp/. Quite a mess... Cheers, Manuel.
Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin
On Thu, 10 May 2012 17:31:43 +0200 Christophe Lyon christophe.l...@st.com wrote: On 10.05.2012 13:41, Ramana Radhakrishnan wrote: On 9 May 2012 11:18, Christophe Lyonchristophe.l...@st.com wrote: Hello, On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins currently fails to load the second vector element. Thanks for the patch but this is not acceptable as it stands today. You need to set the length attributes in this case to 8 for the appropriate alternative at the very least. OK I'll look at this. You also don't mention how this patch was tested. I used the testsuite I developed some time ago to test all the Neon builtins, which I posted last year on the qemu mailing-list. With the current GCCs, this bug is the only remaining one I could detect. Alternatively it might be worth splitting the vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg) 0 ) followed by a subreg to subreg move which should end up having the same effect . That splitting would allow for better instruction scheduling. Are you aware of examples of similar cases I could use as a model? In addition it would be nice to have a testcase in gcc.target/arm . Well. Prior to sending my patch I did look at that directory, but I supposed that such a test ought to belong to the neon/ subdir where the tests are described as autogenerated. Any doc on how to do that? I'd recommend not to autogenerate such a test, FWIW -- the autogenerated neon tests aren't very good. I think a manually-written execute test would be better in this case. If you do try autogenerating tests, look at Disassembles_as in neon.ml, and neon-testgen.ml. Julian
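For a manually written execute test along the lines suggested above, the expected behaviour of the builtin can be stated with a portable reference model. This is only a sketch in plain C, not NEON code: `int64x2_model_t`, `vld1q_dup_s64_model` and `lanes_match` are hypothetical names, and a real test in gcc.target/arm would call the actual intrinsic from arm_neon.h and compare its lanes against such a model.

```c
#include <stdint.h>

/* Plain C model of a 128-bit vector holding two 64-bit lanes.  */
typedef struct {
    int64_t lane[2];
} int64x2_model_t;

/* Reference behaviour of a "load one element and duplicate" operation:
   both lanes must receive the value read from PTR.  */
static int64x2_model_t vld1q_dup_s64_model(const int64_t *ptr)
{
    int64x2_model_t v;
    v.lane[0] = *ptr;
    v.lane[1] = *ptr;
    return v;
}

/* Check that both lanes hold EXPECTED; the reported bug would show up as a
   mismatch in the second lane.  */
static int lanes_match(int64x2_model_t v, int64_t expected)
{
    return v.lane[0] == expected && v.lane[1] == expected;
}
```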
[Dwarf Patch] Improve pubnames and pubtypes generation. (issue6197069)
The enclosed patch fixes many issues with pubnames and pubtypes. It generates them for many more variables and with mostly correct and canonical DWARF names. This patch should not affect any target that does not use pubnames.

The exceptions to the canonical names are addressed in a separate patch to the front end, under review at http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00512.html.

Tested with bootstrap and running the test_pubnames_and_indices.py script recently contributed to the GDB project.

OK for mainline?

Sterling

2012-05-10 Sterling Augustine saugust...@google.com

* dwarf2out.c (DEBUG_PUBNAMES_SECTION_LABEL, DEBUG_PUBTYPES_SECTION_LABEL): New macros.
(debug_pubnames_section_label, debug_pubtypes_section_label): New globals.
(is_cu_die, is_namespace_die, is_class_die, add_AT_pubnames, add_enumerator_pubname): New functions.
(add_pubname): Rework logic. Call is_class_die, is_cu_die and is_namespace_die. Fix minor style violation.
(add_pubtype): Rework logic for calculating type name. Call is_namespace_die.
(output_pubnames): Move conditional logic deciding when to produce the section from dwarf2out_finish. Output debug_pubnames_section_label and debug_pubtypes_section_label.
(base_type_die): Call add_pubtype.
(gen_enumeration_type_die): Unconditionally call add_pubtype.
(gen_namespace_die): Call add_pubname_string.
(dwarf2out_init): Generate debug_pubnames_section_label and debug_pubtypes_section_label from DEBUG_PUBNAMES_SECTION_LABEL and DEBUG_PUBTYPES_SECTION_LABEL respectively.
(dwarf2out_finish): Call add_AT_pubnames; Move logic on when to produce pubnames and pubtypes sections to output_pubnames.
Index: gcc/dwarf2out.c === --- gcc/dwarf2out.c (revision 187271) +++ gcc/dwarf2out.c (working copy) @@ -3007,6 +3007,7 @@ static void output_comp_unit (dw_die_ref, int); static void output_comdat_type_unit (comdat_type_node *); static const char *dwarf2_name (tree, int); static void add_pubname (tree, dw_die_ref); +static void add_enumerator_pubname (const char *, dw_die_ref); static void add_pubname_string (const char *, dw_die_ref); static void add_pubtype (tree, dw_die_ref); static void output_pubnames (VEC (pubname_entry,gc) *); @@ -3210,6 +3211,12 @@ static void gen_scheduled_generic_parms_dies (void #ifndef COLD_TEXT_SECTION_LABEL #define COLD_TEXT_SECTION_LABEL Ltext_cold #endif +#ifndef DEBUG_PUBNAMES_SECTION_LABEL +#define DEBUG_PUBNAMES_SECTION_LABEL Ldebug_pubnames +#endif +#ifndef DEBUG_PUBTYPES_SECTION_LABEL +#define DEBUG_PUBTYPES_SECTION_LABEL Ldebug_pubtypes +#endif #ifndef DEBUG_LINE_SECTION_LABEL #define DEBUG_LINE_SECTION_LABEL Ldebug_line #endif @@ -3246,6 +3253,8 @@ static char cold_end_label[MAX_ARTIFICIAL_LABEL_BY static char abbrev_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; static char debug_info_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; static char debug_line_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; +static char debug_pubnames_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; +static char debug_pubtypes_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; static char macinfo_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; static char loc_section_label[MAX_ARTIFICIAL_LABEL_BYTES]; static char ranges_section_label[2 * MAX_ARTIFICIAL_LABEL_BYTES]; @@ -5966,6 +5975,22 @@ is_cu_die (dw_die_ref c) return c c-die_tag == DW_TAG_compile_unit; } +/* Returns true iff C is a namespace DIE. */ + +static inline bool +is_namespace_die (dw_die_ref c) +{ + return c c-die_tag == DW_TAG_namespace; +} + +/* Returns true iff C is a class DIE. 
*/ + +static inline bool +is_class_die (dw_die_ref c) +{ + return c c-die_tag == DW_TAG_class_type; +} + static char * gen_internal_sym (const char *prefix) { @@ -8033,6 +8058,20 @@ output_comp_unit (dw_die_ref die, int output_if_em } } +/* Add the DW_AT_GNU_pubnames and DW_AT_GNU_pubtypes attributes. */ + +static void +add_AT_pubnames (dw_die_ref die) +{ + if (targetm.want_debug_pub_sections) +{ + /* FIXME: Should use add_AT_pubnamesptr. This works because most targets + don't care what the base section is. */ + add_AT_lineptr (die, DW_AT_GNU_pubnames, debug_pubnames_section_label); + add_AT_lineptr (die, DW_AT_GNU_pubtypes, debug_pubtypes_section_label); +} +} + /* Output a comdat type unit DIE and its children. */ static void @@ -8116,14 +8155,32 @@ add_pubname_string (const char *str, dw_die_ref di static void add_pubname (tree decl, dw_die_ref die) { - if (targetm.want_debug_pub_sections TREE_PUBLIC (decl)) + if (!targetm.want_debug_pub_sections) +return; + + if ((TREE_PUBLIC (decl) !is_class_die (die-die_parent)) + || is_cu_die (die-die_parent) || is_namespace_die (die-die_parent)) {
Re: [PATCH] Remove TYPE_IS_SIZETYPE
For example

Index: stor-layout.c
===
--- stor-layout.c (revision 187364)
+++ stor-layout.c (working copy)
@@ -791,6 +791,10 @@ start_record_layout (tree t)
 tree
 bit_from_pos (tree offset, tree bitpos)
 {
+  if (TREE_CODE (offset) == PLUS_EXPR)
+    offset = size_binop (PLUS_EXPR,
+                         fold_convert (bitsizetype, TREE_OPERAND (offset, 0)),
+                         fold_convert (bitsizetype, TREE_OPERAND (offset, 1)));
   return size_binop (PLUS_EXPR, bitpos,
                      size_binop (MULT_EXPR,
                                  fold_convert (bitsizetype, offset),

fixes the specific testcase you provided.

I get a bootstrap failure on x86 (verify_flow_info failed) with it. Let's drop it for now, we'll revisit this later.

I suppose if stor-layout.c would more carefully handle advancing offset/bitpos, avoiding repeated translations between them, those issues would not exist. Of course the mere existence of DECL_OFFSET_ALIGN complicates matters for no good reasons (well, at least I did not find a good use of it until now ...).

Maybe it's also obsolete by now.

-- Eric Botcazou
Re: [PATCH] Add option for dumping to stderr (issue6190057)
I like your suggestion and support the end goal you have. I don't like the -fopt-info behavior to interfere with regular -fdump-xxx options either. I think we should stage the changes in multiple steps as originally planned. Is Sharad's change good to be checked in for the first stage? After this one is checked in, the new dump interfaces will be worked on (and to allow multiple streams). Most of the remaining changes will be massive text replacement. thanks, David On Thu, May 10, 2012 at 1:18 AM, Richard Guenther richard.guent...@gmail.com wrote: On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com wrote: Bummer. I was thinking to reserve '=' for selective dumping: -fdump-tree-pre=func_list_regexp I guess this can be achieved via @ -fdump-tree-pre@func_list -fdump-tree-pre=file_name@func_list Another issue -- I don't think the current precedence rule is correct. Consider that -fopt-info=2 will be mapped to -fdump-tree-all-transform-verbose2=stderr -fdump-rtl-all-transform-verbose2=stderr then the current precedence rule will cause surprise when the following is used -fopt-info -fdump-tree-pre The PRE dump will be emitted to stderr which is not what user wants. In short, special streams should be treated as 'weak' the same way as your previous implementation. Hm, this raises a similar concern I have with the -fvectorizer-verbose flag. With -fopt-info -fdump-tree-pre I do not want some information to be present only on stderr or in the dump file! I want it in _both_ places! (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less information :() Thus, the information where dumping goes has to be done differently (which is why I asked for some re-org originally, so that passes no longer explicitely reference dump_file - dump_file may be different for different kind of information it dumps!). Passes should, instead of fprintf (dump_file, ..., ...) do dump_printf (TDF_scev, ..., ...) 
thus, specify the kind of information they dump (would be mostly TDF_details vs. 0 today I guess). The dump_printf routine would then properly direct to one or more places to dump at. I realize this needs some more dispatchers for dumping expressions and statements (but it should not be too many). Dumping to dump_file would in any case dump to the passes private dump file only (unqualified stuff would never be useful for -fopt-info). The perfect candidate to convert to this kind of scheme is obviously the vectorizer with its existing -fvectorizer-verbose. If the patch doesn't work towards this kind of end-result I'd rather not have it. Thanks, Richard. thanks, David On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote: Thanks for your suggestions/comments. I have updated the patch and documentation. It supports the following usage: gcc -fdump-tree-all=tree.dump -fdump-tree-pre=stdout -fdump-rtl-ira=ira.dump Here all tree dumps except the PRE are output into tree.dump, PRE dump goes to stdout and the IRA dump goes to ira.dump. Thanks, Sharad 2012-05-09 Sharad Singhai sing...@google.com * doc/invoke.texi: Add documentation for the new option. * tree-dump.c (dump_get_standard_stream): New function. (dump_files): Update for new field. (dump_switch_p_1): Handle dump filenames. (dump_begin): Likewise. (get_dump_file_name): Likewise. (dump_end): Remove attribute. (dump_enable_all): Add new parameter FILENAME. All callers updated. (enable_rtl_dump_file): * tree-pass.h (enum tree_dump_index): Add new constant. (struct dump_file_info): Add new field FILENAME. * testsuite/g++.dg/other/dump-filename-1.C: New test. 
Index: doc/invoke.texi === --- doc/invoke.texi (revision 187265) +++ doc/invoke.texi (working copy) @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio @item -d@var{letters} @itemx -fdump-rtl-@var{pass} +@itemx -fdump-rtl-@var{pass}=@var{filename} @opindex d Says to make debugging dumps during compilation at times specified by @var{letters}. This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the @var{dumpname}, and the files are -created in the directory of the output file. Note that the pass -number is computed statically as passes get registered into the pass -manager. Thus the numbering is not related to the dynamic order of -execution of passes. In particular, a pass installed by a plugin -could have a number over 200 even if it executed quite early. -@var{dumpname} is generated from the name of the output file, if -explicitly specified and it is not an executable, otherwise it is the -basename of the source file. These switches may have different
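The dump_printf idea quoted earlier in this message (passes state the kind of information they emit, and a dispatcher decides which streams receive it) could look roughly like the sketch below. It is an assumption-laden illustration, not the interface that was later implemented: `dump_printf_sketch`, `struct dump_sink` and the `DUMP_KIND_*` flags are hypothetical names.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical kinds of dump information.  */
enum {
    DUMP_KIND_DETAILS  = 1u << 0,
    DUMP_KIND_OPT_INFO = 1u << 1
};

/* One destination plus the kinds of information it wants to receive.  */
struct dump_sink {
    FILE *stream;
    unsigned kinds;
};

/* Print the message to every sink subscribed to KIND and return how many
   sinks received it.  A pass never names a stream directly, so the same
   message can reach both a pass dump file and stderr.  */
static int dump_printf_sketch(const struct dump_sink *sinks, size_t n_sinks,
                              unsigned kind, const char *fmt, ...)
{
    int written = 0;
    for (size_t i = 0; i < n_sinks; i++) {
        if (sinks[i].stream == NULL || (sinks[i].kinds & kind) == 0)
            continue;
        va_list ap;
        va_start(ap, fmt);          /* restart the argument list per sink */
        vfprintf(sinks[i].stream, fmt, ap);
        va_end(ap);
        written++;
    }
    return written;
}
```

With one sink subscribed to both kinds (standing in for a pass dump file) and another subscribed only to `DUMP_KIND_OPT_INFO` (standing in for stderr), an optimization-info message reaches both places, which is the behaviour asked for in the reply about -fopt-info combined with -fdump-tree-pre.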
[PATCH, 4.7] Backport fix to [un]signed_type_for
Backporting this patch to 4.7 fixes a problem building Fedora 17. Bootstrapped and regression tested on powerpc64-unknown-linux-gnu. Is the backport OK? Thanks, Bill 2012-05-10 Bill Schmidt wschm...@vnet.linux.ibm.com Backport from trunk: 2012-03-12 Richard Guenther rguent...@suse.de * tree.c (signed_or_unsigned_type_for): Use build_nonstandard_integer_type. (signed_type_for): Adjust documentation. (unsigned_type_for): Likewise. * tree-pretty-print.c (dump_generic_node): Use standard names for non-standard integer types if available. Index: gcc/tree-pretty-print.c === --- gcc/tree-pretty-print.c (revision 187368) +++ gcc/tree-pretty-print.c (working copy) @@ -723,11 +723,41 @@ dump_generic_node (pretty_printer *buffer, tree no } else if (TREE_CODE (node) == INTEGER_TYPE) { - pp_string (buffer, (TYPE_UNSIGNED (node) - ? unnamed-unsigned: - : unnamed-signed:)); - pp_decimal_int (buffer, TYPE_PRECISION (node)); - pp_string (buffer, ); + if (TYPE_PRECISION (node) == CHAR_TYPE_SIZE) + pp_string (buffer, (TYPE_UNSIGNED (node) + ? unsigned char + : signed char)); + else if (TYPE_PRECISION (node) == SHORT_TYPE_SIZE) + pp_string (buffer, (TYPE_UNSIGNED (node) + ? unsigned short + : signed short)); + else if (TYPE_PRECISION (node) == INT_TYPE_SIZE) + pp_string (buffer, (TYPE_UNSIGNED (node) + ? unsigned int + : signed int)); + else if (TYPE_PRECISION (node) == LONG_TYPE_SIZE) + pp_string (buffer, (TYPE_UNSIGNED (node) + ? unsigned long + : signed long)); + else if (TYPE_PRECISION (node) == LONG_LONG_TYPE_SIZE) + pp_string (buffer, (TYPE_UNSIGNED (node) + ? unsigned long long + : signed long long)); + else if (TYPE_PRECISION (node) = CHAR_TYPE_SIZE + exact_log2 (TYPE_PRECISION (node))) + { + pp_string (buffer, (TYPE_UNSIGNED (node) ? uint : int)); + pp_decimal_int (buffer, TYPE_PRECISION (node)); + pp_string (buffer, _t); + } + else + { + pp_string (buffer, (TYPE_UNSIGNED (node) + ? 
unnamed-unsigned: + : unnamed-signed:)); + pp_decimal_int (buffer, TYPE_PRECISION (node)); + pp_string (buffer, ); + } } else if (TREE_CODE (node) == COMPLEX_TYPE) { Index: gcc/tree.c === --- gcc/tree.c (revision 187368) +++ gcc/tree.c (working copy) @@ -10162,32 +10162,26 @@ widest_int_cst_value (const_tree x) return val; } -/* If TYPE is an integral type, return an equivalent type which is -unsigned iff UNSIGNEDP is true. If TYPE is not an integral type, -return TYPE itself. */ +/* If TYPE is an integral or pointer type, return an integer type with + the same precision which is unsigned iff UNSIGNEDP is true, or itself + if TYPE is already an integer type of signedness UNSIGNEDP. */ tree signed_or_unsigned_type_for (int unsignedp, tree type) { - tree t = type; - if (POINTER_TYPE_P (type)) -{ - /* If the pointer points to the normal address space, use the -size_type_node. Otherwise use an appropriate size for the pointer -based on the named address space it points to. */ - if (!TYPE_ADDR_SPACE (TREE_TYPE (t))) - t = size_type_node; - else - return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp); -} + if (TREE_CODE (type) == INTEGER_TYPE TYPE_UNSIGNED (type) == unsignedp) +return type; - if (!INTEGRAL_TYPE_P (t) || TYPE_UNSIGNED (t) == unsignedp) -return t; + if (!INTEGRAL_TYPE_P (type) + !POINTER_TYPE_P (type)) +return NULL_TREE; - return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp); + return build_nonstandard_integer_type (TYPE_PRECISION (type), unsignedp); } -/* Returns unsigned variant of TYPE. */ +/* If TYPE is an integral or pointer type, return an integer type with + the same precision which is unsigned, or itself if TYPE is already an + unsigned integer type. */ tree unsigned_type_for
Re: [PATCH, 4.7] Backport fix to [un]signed_type_for
On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote: Backporting this patch to 4.7 fixes a problem building Fedora 17. Bootstrapped and regression tested on powerpc64-unknown-linux-gnu. Is the backport OK? For 4.7 I'd very much prefer a less intrusive change (i.e. change the java langhook) instead, but I'll defer to Richard if he prefers this over that. 2012-05-10 Bill Schmidt wschm...@vnet.linux.ibm.com Backport from trunk: 2012-03-12 Richard Guenther rguent...@suse.de * tree.c (signed_or_unsigned_type_for): Use build_nonstandard_integer_type. (signed_type_for): Adjust documentation. (unsigned_type_for): Likewise. * tree-pretty-print.c (dump_generic_node): Use standard names for non-standard integer types if available. Jakub
Re: [PATCH, 4.7] Backport fix to [un]signed_type_for
On Thu, 2012-05-10 at 18:49 +0200, Jakub Jelinek wrote:

On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote:

Backporting this patch to 4.7 fixes a problem building Fedora 17. Bootstrapped and regression tested on powerpc64-unknown-linux-gnu. Is the backport OK?

For 4.7 I'd very much prefer a less intrusive change (i.e. change the java langhook) instead, but I'll defer to Richard if he prefers this over that.

OK. If that's desired, this is the possible change to the langhook:

Index: gcc/java/typeck.c
===
--- gcc/java/typeck.c (revision 187158)
+++ gcc/java/typeck.c (working copy)
@@ -189,6 +189,12 @@ java_type_for_size (unsigned bits, int unsignedp)
     return unsignedp ? unsigned_int_type_node : int_type_node;
   if (bits <= TYPE_PRECISION (long_type_node))
     return unsignedp ? unsigned_long_type_node : long_type_node;
+  /* A 64-bit target with TImode requires 128-bit type definitions
+     for bitsizetype.  */
+  if (int128_integer_type_node
+      && bits == TYPE_PRECISION (int128_integer_type_node))
+    return (unsignedp ? int128_unsigned_type_node
+            : int128_integer_type_node);
   return 0;
 }

which also fixed the problem and bootstraps without regressions. Whichever you guys prefer is fine with me.

Thanks,
Bill

2012-05-10 Bill Schmidt wschm...@vnet.linux.ibm.com

Backport from trunk:
2012-03-12 Richard Guenther rguent...@suse.de

* tree.c (signed_or_unsigned_type_for): Use build_nonstandard_integer_type.
(signed_type_for): Adjust documentation.
(unsigned_type_for): Likewise.
* tree-pretty-print.c (dump_generic_node): Use standard names for non-standard integer types if available.

Jakub
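The shape of the langhook change above, a ladder of "smallest type with at least this many bits" checks followed by an exact 128-bit match, can be sketched on type names instead of tree nodes. The function name, the returned strings and the fixed 8/16/32/64-bit precisions are assumptions for illustration only; the real hook compares against TYPE_PRECISION of the front end's type nodes.

```c
#include <stddef.h>

/* Return a name for an integer type able to hold BITS bits, or NULL if
   none is known.  Precisions are assumed to be 8, 16, 32 and 64 bits.  */
static const char *type_name_for_size_sketch(unsigned bits, int unsignedp)
{
    if (bits <= 8)
        return unsignedp ? "unsigned byte" : "byte";
    if (bits <= 16)
        return unsignedp ? "unsigned short" : "short";
    if (bits <= 32)
        return unsignedp ? "unsigned int" : "int";
    if (bits <= 64)
        return unsignedp ? "unsigned long" : "long";
    /* The added case: only an exact 128-bit request is satisfied, which is
       what a 128-bit bitsizetype needs.  */
    if (bits == 128)
        return unsignedp ? "unsigned int128" : "int128";
    return NULL;
}
```

Without the last check, a request for a 128-bit type falls through to the "no such type" result; the sketch returns NULL where the hook returns 0.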
Re: [h8300] increase dwarf address size
Regardless, shouldn't DWARF2_ADDR_SIZE be POINTER_SIZE / BITS_PER_UNIT? That's the default. It doesn't work because pointers are still 16 bits.
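The arithmetic behind this exchange can be made concrete with a small sketch. The macro values below are assumptions chosen to match the discussion (16-bit pointers, 8-bit units); `DEFAULT_ADDR_SIZE_SKETCH`, `WIDENED_ADDR_SIZE_SKETCH` and `address_fits` are hypothetical names, not GCC definitions.

```c
/* Assumed target parameters: 16-bit pointers, 8 bits per addressable unit.  */
#define BITS_PER_UNIT_SKETCH 8
#define POINTER_SIZE_SKETCH  16

/* What the default computation yields: 16 / 8 = 2 bytes.  */
#define DEFAULT_ADDR_SIZE_SKETCH (POINTER_SIZE_SKETCH / BITS_PER_UNIT_SKETCH)

/* The size proposed in the patch.  */
#define WIDENED_ADDR_SIZE_SKETCH 4

/* Return 1 if ADDRESS can be stored in SIZE_IN_BYTES bytes, 0 otherwise.  */
static int address_fits(unsigned long long address, unsigned size_in_bytes)
{
    if (size_in_bytes >= sizeof address)
        return 1;
    return (address >> (8 * size_in_bytes)) == 0;
}
```

An address such as 0x12345, which lies above 64k, does not fit in the 2-byte default but does fit in 4 bytes, which is why the default is not enough when the address space is larger than the pointer width suggests.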
Re: PR 43772 Errant -Wlogical-op warning when testing limits
Sorry for my late reply.

Manuel López-Ibáñez lopeziba...@gmail.com writes:

This patch fixes almost all false positives in PR43772. The case not fixed is:

intmax_t i = (whatever);
if (INT_MAX < i && i <= LONG_MAX)
  print ("i is in 'long' but not 'int' range")

where we warn if INT_MAX = LONG_MAX < INTMAX_MAX.

FWIW, I'd be inclined to warn in that case, unless someone comes with a reasonable scenario that argues for the usefulness of not warning here. Maybe I am missing something.

Perhaps with the macro location code, we could now tell that the constants INT_MAX and LONG_MAX come from different macro expansions in system headers, and avoid warning in this specific case, but that would be better done in a follow-up patch.

Hmmh. Dodji, is that possible? how could it be done?

It might be possible, even if I doubt that the value of doing that really offsets the cost and the perceived weirdness of the approach.

Assuming the tokens resulting from INT_MAX and LONG_MAX haven't been folded, you could get the line maps of their locations by using linemap_lookup (line_table, location_of_token). Then, if linemap_macro_expansion_map_p is true on the two line maps, it means the tokens for INT_MAX and LONG_MAX come from macro expansions. Then if the maps are different, it means the macro expansions are different. To know if they (the macros) come from system headers, you can use the predicate LINEMAP_SYSP on them.

But if INT_MAX/LONG_MAX are folded into a constant, then the information about their macro-ness is lost, unfortunately.

-- Dodji
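The remaining case from the PR can be written as a small self-contained function, which makes it easier to see why the warning depends on the target. This is a sketch; the function name is made up for the example.

```c
#include <limits.h>
#include <stdint.h>

/* Return nonzero if I is representable as 'long' but not as 'int'.
   On targets where INT_MAX == LONG_MAX the two comparisons can never both
   hold, so the condition is always false there; that is the situation in
   which the warning discussed above is still issued.  */
static int in_long_but_not_int(intmax_t i)
{
    return INT_MAX < i && i <= LONG_MAX;
}
```

On an LP64 target the function returns nonzero for values between INT_MAX + 1 and LONG_MAX; on a target where int and long have the same width it returns 0 for every input, even though the source is reasonable portable code.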
Re: User directed Function Multiversioning via Function Overloading (issue5752064)
On Wed, May 9, 2012 at 12:01 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, Attached new patch with more bug fixes. I will fix the dispatching method to use priority of attributes in the next iteration. Patch also available for review here: http://codereview.appspot.com/5752064 The patch looks OK to me. Since the testcase depends on the dispatching method, I'd like to see the whole patch with the updated dispatching method. Thanks. -- H.J.
Re: [MIPS] Fix misspelled macro in t-vxworks
Mingjie Xing mingjie.x...@gmail.com writes: This patch fixes the misspelled macro in t-vxworks. Is it OK? 2012-05-10 Mingjie Xing mingjie.x...@gmail.com * config/mips/t-vxworks: Change MUTLILIB_EXTRA_OPTS to MULTILIB_EXTRA_OPTS. OK, thanks. Richard
[PATCH, i386]: Further move insn modes cleanup.
Hello! Introduce handling of TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL and TARGET_SSE_TYPELESS_STORES flags to movoi, movti and movtf move patterns. Also introduce ssePSmode attribute to determine PSmode at compile time. 2012-05-10 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (*movoi_internal_avx): Handle TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL and TARGET_SSE_TYPELESS_STORES. (*movti_internal_rex64): Handle TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. (*movti_internal_sse): Ditto. (*movtf_internal): Ditto. * config/i386/sse.md (ssePSmode): New mode attribute. (*movemode_internal): Use ssePSmode. (*sse_movussemodesuffixavxsizesuffix): Ditto. (*sse2_movdquavxsizesuffix): Ditto. * config/i386/i386.c (standard_sse_constant_opcode): Do not handle TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL here. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: config/i386/sse.md === --- config/i386/sse.md (revision 187354) +++ config/i386/sse.md (working copy) @@ -337,6 +337,16 @@ (V8SF V4SF) (V4DF V2DF) (V4SF V2SF)]) +;; Mapping of vector modes ti packed single mode of the same size +(define_mode_attr ssePSmode + [(V32QI V8SF) (V16QI V4SF) + (V16HI V8SF) (V8HI V4SF) + (V8SI V8SF) (V4SI V4SF) + (V4DI V8SF) (V2DI V4SF) + (V2TI V8SF) (V1TI V4SF) + (V8SF V8SF) (V4SF V4SF) + (V4DF V8SF) (V2DF V4SF)]) + ;; Mapping of vector modes back to the scalar modes (define_mode_attr ssescalarmode [(V32QI QI) (V16HI HI) (V8SI SI) (V4DI DI) @@ -420,7 +430,7 @@ }) (define_insn *movmode_internal - [(set (match_operand:V16 0 nonimmediate_operand =x,x ,m) + [(set (match_operand:V16 0 nonimmediate_operand =x,x ,m) (match_operand:V16 1 nonimmediate_or_sse_const_operand C ,xm,x))] TARGET_SSE (register_operand (operands[0], MODEmode) @@ -471,21 +481,18 @@ [(set_attr type sselog1,ssemov,ssemov) (set_attr prefix maybe_vex) (set (attr mode) - (cond [(and (eq_attr alternative 1,2) - (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)) -(if_then_else - (match_test GET_MODE_SIZE (MODEmode) 16) - 
(const_string V8SF) - (const_string V4SF)) + (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) +(const_string ssePSmode) + (and (eq_attr alternative 2) + (match_test TARGET_SSE_TYPELESS_STORES)) +(const_string ssePSmode) (match_test TARGET_AVX) (const_string sseinsnmode) - (ior (and (eq_attr alternative 1,2) -(match_test optimize_function_for_size_p (cfun))) - (and (eq_attr alternative 2) -(match_test TARGET_SSE_TYPELESS_STORES))) + (ior (not (match_test TARGET_SSE2)) + (match_test optimize_function_for_size_p (cfun))) (const_string V4SF) ] - (const_string sseinsnmode)))]) + (const_string sseinsnmode)))]) (define_insn sse2_movq128 [(set (match_operand:V2DI 0 register_operand =x) @@ -610,18 +617,16 @@ (set_attr prefix maybe_vex) (set (attr mode) (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) -(if_then_else - (match_test GET_MODE_SIZE (MODEmode) 16) - (const_string V8SF) - (const_string V4SF)) +(const_string ssePSmode) + (and (eq_attr alternative 1) + (match_test TARGET_SSE_TYPELESS_STORES)) +(const_string ssePSmode) (match_test TARGET_AVX) (const_string MODE) - (ior (match_test optimize_function_for_size_p (cfun)) - (and (eq_attr alternative 1) -(match_test TARGET_SSE_TYPELESS_STORES))) -(const_string V4SF) + (match_test optimize_function_for_size_p (cfun)) +(const_string V4SF) ] - (const_string MODE)))]) + (const_string MODE)))]) (define_expand sse2_movdquavxsizesuffix [(set (match_operand:VI1 0 nonimmediate_operand) @@ -658,18 +663,16 @@ (set_attr prefix maybe_vex) (set (attr mode) (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) -(if_then_else - (match_test GET_MODE_SIZE (MODEmode) 16) - (const_string V8SF) - (const_string V4SF)) +(const_string ssePSmode) + (and (eq_attr alternative 1) + (match_test TARGET_SSE_TYPELESS_STORES)) +(const_string ssePSmode) (match_test TARGET_AVX) (const_string sseinsnmode) - (ior (match_test
patch for PR53125
The following patch is for PR53125. The PR is described on http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53125. The patch improves the compilation speed by 35% for the case. The patch was successfully bootstrapped on x86-64. Committed as rev. 187373. 2012-05-10 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/53125 * ira.c (ira): Call find_moveable_pseudos and move_unallocated_pseudos if only ira_conflicts_p is true.
[i386] New testcase (was: [rtl, patch] combine concat+shuffle)
Hello, could an i386 maintainer take a look at the following testcase? gcc/testsuite/ChangeLog 2012-05-08 Marc Glisse marc.gli...@inria.fr * gcc.target/i386/shuf-concat.c: New test. --- gcc.target/i386/shuf-concat.c (revision 0) +++ gcc.target/i386/shuf-concat.c (revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O -msse2 -mfpmath=sse } */ + +typedef double v2df __attribute__ ((__vector_size__ (16))); + +v2df f(double d,double e){ + v2df x={-d,d}; + v2df y={-e,e}; + return __builtin_ia32_shufpd(x,y,1); +} + +/* { dg-final { scan-assembler-not \tv?shufpd\t } } */ +/* { dg-final { scan-assembler-times \tv?unpcklpd\t 1 } } */ The conversation on this patch started at http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html On Tue, 8 May 2012, Marc Glisse wrote: On Tue, 8 May 2012, Richard Sandiford wrote: Marc Glisse marc.gli...@inria.fr writes: Here is a new version. gcc/ChangeLog 2012-05-08 Marc Glisse marc.gli...@inria.fr * simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle of concatenations. OK, thanks. I'll leave an x86 maintainer to review the testcase, but it looks like it'll need some markup to ensure an SSE target. Oups, I'd thought about that, then completely forgot. For 64 bits, it always works. For 32 bits, it requires -msse2 -mfpmath=sse (without -mfpmath=sse we can still test for shufpd, but apparently not unpcklpd, I could remove that second test if people prefer, as it isn't important). Since this is a compile-only test, I think this would be enough: /* { dg-options -O -msse2 -mfpmath=sse } */ Note to self: if you want to grep for shuf in the asm, don't put shuf in the name of the file... Yeah :-) For MIPS tests I tend to add \t to the beginning of the regexp. (And to the end if possible.) Good idea. I was trying to make the check as wide as possible, but that's not so useful. Attached a new version of the testcase. -- Marc Glisse
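As a rough illustration of why the testcase expects no shufpd in the output, here is a scalar model of the selection that shufpd performs on two-element vectors. The names vec2d, model_shufpd and model_f are made up for this sketch and are not part of GCC or of the testcase. With immediate 1, the result is simply {d, -e}, which can be built directly from the two scalars; that is presumably what the simplify-rtx.c change exploits.

```c
/* Scalar stand-in for a two-lane double vector (v2df in the testcase). */
typedef struct { double lane[2]; } vec2d;

/* Model of the shufpd selection rule: bit 0 of IMM picks which lane of X
   becomes result lane 0, bit 1 picks which lane of Y becomes lane 1. */
static vec2d model_shufpd(vec2d x, vec2d y, int imm)
{
    vec2d r;
    r.lane[0] = x.lane[imm & 1];
    r.lane[1] = y.lane[(imm >> 1) & 1];
    return r;
}

/* The function from the testcase, written against the model. */
static vec2d model_f(double d, double e)
{
    vec2d x = { { -d, d } };
    vec2d y = { { -e, e } };
    return model_shufpd(x, y, 1);
}
```

Calling model_f (2.0, 3.0) yields lanes 2.0 and -3.0: each result lane is one of the original scalars, so no real shuffle of the concatenated vectors is needed.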
Re: Symbol table 20/many: cleanup of cgraph_remove_unreachable_nodes
Hi, after some thought, the changes to omp-low are not as obviously harmless as I originally thought, so I decided to handle this in a separate patch. This patch simply makes cgraph not release the bodies of artificial functions, which papers over the problem in an easier way. Bootstrapped/regtested x86_64-linux, committed. Honza * cgraph.h (cgraph_remove_unreachable_nodes): Rename to ... (symtab_remove_unreachable_nodes): ... this one. * ipa-cp.c (ipcp_driver): Do not remove unreachable nodes. * cgraphunit.c (ipa_passes): Update. * cgraphclones.c (cgraph_materialize_all_clones): Update. * cgraph.c (cgraph_release_function_body): Only turn initial into error mark when initial was previously set. * ipa-inline.c (ipa_inline): Update. * ipa.c: Include ipa-inline.h (enqueue_cgraph_node, enqueue_varpool_node): Remove. (enqueue_node): New function. (process_references): Update. (symtab_remove_unreachable_nodes): Cleanup. * passes.c (execute_todo, execute_one_pass): Update. Index: cgraph.c === *** cgraph.c(revision 187335) --- cgraph.c(working copy) *** cgraph_release_function_body (struct cgr *** 1162,1168 /* If the node is abstract and needed, then do not clear DECL_INITIAL of its associated function function declaration because it's needed to emit debug info later. */ ! if (!node->abstract_and_needed) DECL_INITIAL (node->symbol.decl) = error_mark_node; } --- 1162,1168 /* If the node is abstract and needed, then do not clear DECL_INITIAL of its associated function function declaration because it's needed to emit debug info later. */ ! if (!node->abstract_and_needed && DECL_INITIAL (node->symbol.decl)) DECL_INITIAL (node->symbol.decl) = error_mark_node; } Index: cgraph.h === *** cgraph.h(revision 187335) --- cgraph.h(working copy) *** int compute_call_stmt_bb_frequency (tree *** 637,643 void record_references_in_initializer (tree, bool); /* In ipa.c */ !
bool cgraph_remove_unreachable_nodes (bool, FILE *); cgraph_node_set cgraph_node_set_new (void); cgraph_node_set_iterator cgraph_node_set_find (cgraph_node_set, struct cgraph_node *); --- 637,643 void record_references_in_initializer (tree, bool); /* In ipa.c */ ! bool symtab_remove_unreachable_nodes (bool, FILE *); cgraph_node_set cgraph_node_set_new (void); cgraph_node_set_iterator cgraph_node_set_find (cgraph_node_set, struct cgraph_node *); Index: ipa-cp.c === *** ipa-cp.c(revision 187335) --- ipa-cp.c(working copy) *** ipcp_driver (void) *** 2445,2451 struct cgraph_2edge_hook_list *edge_duplication_hook_holder; struct topo_info topo; - cgraph_remove_unreachable_nodes (true,dump_file); ipa_check_create_node_params (); ipa_check_create_edge_args (); grow_next_edge_clone_vector (); --- 2445,2450 Index: cgraphunit.c === *** cgraphunit.c(revision 187335) --- cgraphunit.c(working copy) *** ipa_passes (void) *** 1836,1842 because TODO is run before the subpasses. It is important to remove the unreachable functions to save works at IPA level and to get LTO symbol tables right. */ ! cgraph_remove_unreachable_nodes (true, cgraph_dump_file); /* If pass_all_early_optimizations was not scheduled, the state of the cgraph will not be properly updated. Update it now. */ --- 1836,1842 because TODO is run before the subpasses. It is important to remove the unreachable functions to save works at IPA level and to get LTO symbol tables right. */ ! symtab_remove_unreachable_nodes (true, cgraph_dump_file); /* If pass_all_early_optimizations was not scheduled, the state of the cgraph will not be properly updated. Update it now. */ *** compile (void) *** 1962,1968 /* This pass remove bodies of extern inline functions we never inlined. Do this later so other IPA passes see what is really going on. */ ! 
cgraph_remove_unreachable_nodes (false, dump_file); cgraph_global_info_ready = true; if (cgraph_dump_file) { --- 1962,1968 /* This pass remove bodies of extern inline functions we never inlined. Do this later so other IPA passes see what is really going on. */ ! symtab_remove_unreachable_nodes (false, dump_file); cgraph_global_info_ready = true; if (cgraph_dump_file) { *** compile (void) *** 1987,1993
Re: [PATCH, i386] V4DF __builtin_shuffle
Any comment? On Mon, 30 Apr 2012, Marc Glisse wrote: Ping? http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01034.html Since then, I've run a c,c++ bootstrap and: make -k check RUNTESTFLAGS=--target_board=my-sde-sim where my-sde-sim is the dejagnu board posted by H.J. Lu to run tests inside Intel's simulator, no difference between before and after my patch. (If I understand correctly, the testsuite always compiles the AVX and AVX2 tests, and uses cpuid (which I expect the simulator must fake) to determine if it should run them, so I don't need to pass any extra flag in RUNTESTFLAGS. If I am wrong, please tell me.) Adding in Cc: the 2 people who kindly commented on the other shuffle patch (the one that isn't finished). On Tue, 17 Apr 2012, Marc Glisse wrote: Hello, this patch expands __builtin_shuffle for V4DF mode in at most 3 insn. It is simple and works really well, often generates only 2 insn. It is not very generic, because other modes don't have an instruction equivalent to vshufpd. For V8SF (and likely V4DI and V8SI with AVX2, but I still need to do that), my patch default case in PR 52607 seems more interesting. I tried calling this new function after expand_vec_perm_vperm2f128_vblend (instead of before as in the patch), but it generated more instructions for some permutations, and never less. That function is still useful for V8SF though. I bootstrapped gcc on a non-avx platform, compiled a program that tests all 4096 shuffles with -mavx/-mavx2, and ran the result using Intel's emulator (SDE). There are still a few V4DF permutations that don't generate an optimal sequence (3 insn instead of 2), but not that many I think. Of course, I am assuming a constant cost of 1 per insn, which is completely false, but seems like a sensible first approximation. (note that I can't commit) 2012-04-17 Marc Glisse marc.gli...@inria.fr PR target/502607 * config/i386/i386.c (ix86_expand_vec_perm_const): Move code to ... (canonicalize_perm): ... new function. 
(expand_vec_perm_2vperm2f128_vshuf): New function. (ix86_expand_vec_perm_const_1): Call it. -- Marc Glisse
Re: Missing guard in ira-color.c ?
On 05/10/2012 09:10 AM, Tristan Gingold wrote: Hi, I am getting a segfault in ira-color.c:2945 on the trunk: Program received signal SIGSEGV, Segmentation fault. 0x00a79f37 in move_spill_restore () at ../../src/gcc/ira-color.c:2945 2945 || ira_reg_equiv_const[regno] != NULL_RTX (gdb) l 2940 /* don't do the optimization because it can create 2941 copies and the reload pass can spill the allocno set 2942 by copy although the allocno will not get memory 2943 slot. */ 2944 || ira_reg_equiv_invariant_p[regno] 2945 || ira_reg_equiv_const[regno] != NULL_RTX 2946 || !bitmap_bit_p (loop_node-border_allocnos, ALLOCNO_NUM (a))) 2947continue; 2948 mode = ALLOCNO_MODE (a); 2949 rclass = ALLOCNO_CLASS (a); while building gcc (gnatcmd.adb file) for ia64-vms using a cross compiler (target=ia64-vms, host=x86_64-linux). The reason looks to be an out of bounds access: (gdb) print regno $10 = 18476 (gdb) print ira_reg_equiv_len $11 = 17984 (I suppose this setup is not easy at all to reproduce, but I can provide any files, if necessary). Tristan, thanks for reporting this. Wild guess, as I don't know IRA at all: looks like in this file most accesses to ira_reg_equiv_* are guarded. Is it expected that they aren't at this point ? Yes, I guess. It is possible to have the pseudos which are out of range ira_reg_equiv_const. It should be hard to reproduce such error because they are generated when we need to break circular dependence (e.g. when hard register 1 should be moved to hard register 2 and hard register 2 to hard register 1). Your solution is perfectly fine. So you can commit the patch into the trunk as pre-approved. Thanks again. [I am currently trying with the following chunk: --- a/gcc/ira-color.c +++ b/gcc/ira-color.c @@ -2941,8 +2941,9 @@ move_spill_restore (void) copies and the reload pass can spill the allocno set by copy although the allocno will not get memory slot. 
*/ - || ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX + || (regno < ira_reg_equiv_len + && (ira_reg_equiv_invariant_p[regno] + || ira_reg_equiv_const[regno] != NULL_RTX)) || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a))) continue; mode = ALLOCNO_MODE (a); ]
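The guard discussed in this message can be sketched in isolation as follows. This is only a minimal model: struct equiv_tables and has_equiv are hypothetical names standing in for IRA's ira_reg_equiv_* arrays and are not taken from ira-color.c. The point is that the length check must come first, so that pseudos created after the tables were sized are never used as an index.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-pseudo equivalence tables. */
struct equiv_tables {
    int len;                      /* number of valid entries */
    const bool *invariant_p;      /* indexed by regno */
    const void *const *constant;  /* indexed by regno, NULL if none */
};

/* Return true if REGNO has a known equivalence.  A regno at or beyond
   LEN is treated as having none, and the arrays are not touched. */
static bool has_equiv(const struct equiv_tables *t, int regno)
{
    return regno < t->len
           && (t->invariant_p[regno] || t->constant[regno] != NULL);
}
```

With the values from the gdb session (regno 18476, table length 17984), such a check returns false without reading past the end of either array.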
Re: [ping] 3 pending patches
On 05/08/2012 01:29 AM, Eric Botcazou wrote: Fix debug info of nested inline functions: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00161.html I'll leave this one for Jason. Emit variable as size attribute in debug info: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html Implement static stack checking on IA-64: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00452.html Both ok. r~
Re: [C++ Patch] PR 53301
Hi, Yes, please. It feels as if the names are based more on the underlying implementation of the macro than on anything else. Also, short names are nice, but using MEM instead of MEMBER is a bit too short. The same for OB for object and others. PTR_OR_PTRMEM sounds to me like pointer or pointer to member, which sounds redundant since a pointer to member is a pointer already. And there is also TYPE_PTRMEM_P and TYPE_PTR_TO_MEMBER_P. From the names it is not clear what is the difference. This could be TYPE_PTR_TO_DATA_MEMBER and TYPE_PTR_TO_ANY_MEMBER. The few extra chars help a lot to clarify the meaning. Let's see if we can do something *now* ;) My concrete proposal would be: TYPE_PTRMEM_P rename to TYPE_PTRDATAMEM_P (consistent with TYPE_PTRMEMFUNC_P) TYPE_PTR_TO_MEMBER_P rename to TYPE_PTRMEM_P and then finally #define TYPE_PTR_OR_PTRMEM_P(NODE) \ (TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE)) and use it everywhere. Sounds like an improvement? Additionally, we could maybe rename PTRMEM_OK_P to PTRMEMFUNC_OK_P Thanks, Paolo.
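To make the proposed naming concrete, here is a toy model of how the renamed predicates would nest. Everything prefixed TOY_ is invented for this sketch; the real macros operate on tree nodes in the C++ front end, and their definitions are not reproduced here.

```c
/* Toy type kinds; the real front end distinguishes these via tree codes. */
enum toy_kind { TOY_INT, TOY_PTR, TOY_PTR_DATA_MEM, TOY_PTR_MEM_FUNC };

struct toy_type { enum toy_kind kind; };

/* Ordinary pointer.  */
#define TOY_PTR_P(NODE)        ((NODE)->kind == TOY_PTR)
/* Pointer to data member (the proposed TYPE_PTRDATAMEM_P).  */
#define TOY_PTRDATAMEM_P(NODE) ((NODE)->kind == TOY_PTR_DATA_MEM)
/* Pointer to member function.  */
#define TOY_PTRMEMFUNC_P(NODE) ((NODE)->kind == TOY_PTR_MEM_FUNC)
/* Either kind of pointer to member (the proposed TYPE_PTRMEM_P).  */
#define TOY_PTRMEM_P(NODE) \
  (TOY_PTRDATAMEM_P (NODE) || TOY_PTRMEMFUNC_P (NODE))
/* Ordinary pointer or any pointer to member.  */
#define TOY_PTR_OR_PTRMEM_P(NODE) \
  (TOY_PTR_P (NODE) || TOY_PTRMEM_P (NODE))
```

Under this layering a pointer to member function satisfies TOY_PTRMEM_P and TOY_PTR_OR_PTRMEM_P but not TOY_PTRDATAMEM_P, which is the distinction the new names are meant to make visible.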
Fix Mozilla LTO build
Hi, the Mozilla LTO build broke because symtab_remove_unreachable_nodes incorrectly removes origins of clones in some special cases. Bootstrapped/regtested x86_64-linux and committed. Index: ipa.c === --- ipa.c (revision 187375) +++ ipa.c (working copy) @@ -310,12 +310,12 @@ symtab_remove_unreachable_nodes (bool be /* For non-inline clones, force their origins to the boundary and ensure that body is not removed. */ - while (cnode->clone_of && !cnode->clone_of->symbol.aux + while (cnode->clone_of && !gimple_has_body_p (cnode->symbol.decl)) { bool noninline = cnode->clone_of->symbol.decl != cnode->symbol.decl; cnode = cnode->clone_of; - if (noninline && !cnode->symbol.aux) + if (noninline) { pointer_set_insert (body_needed_for_clonning, cnode->symbol.decl); enqueue_node ((symtab_node)cnode, first, reachable);
Speed up inliner
Hi, this patch cuts 10 minutes of Mozilla compilation time that is spent by updating keys. After Richi's removal of overall growth from the cost functions, we no longer need to update that much. Bootstrapped/regtested x86_64-linux and tested on Mozilla build. Comitted. Honza * ipa-inline.c (update_all_callee_keys): Remove. (inline_small_functions): Simplify priority updating. Index: ipa-inline.c === --- ipa-inline.c(revision 187375) +++ ipa-inline.c(working copy) @@ -1097,45 +1097,6 @@ update_callee_keys (fibheap_t heap, stru } } -/* Recompute heap nodes for each of caller edges of each of callees. - Walk recursively into all inline clones. */ - -static void -update_all_callee_keys (fibheap_t heap, struct cgraph_node *node, - bitmap updated_nodes) -{ - struct cgraph_edge *e = node-callees; - if (!e) -return; - while (true) -if (!e-inline_failed e-callee-callees) - e = e-callee-callees; -else - { - struct cgraph_node *callee = cgraph_function_or_thunk_node (e-callee, - NULL); - - /* We inlined and thus callees might have different number of calls. - Reset their caches */ -reset_node_growth_cache (callee); - if (e-inline_failed) - update_caller_keys (heap, callee, updated_nodes, e); - if (e-next_callee) - e = e-next_callee; - else - { - do - { - if (e-caller == node) - return; - e = e-caller-callers; - } - while (!e-next_callee); - e = e-next_callee; - } - } -} - /* Enqueue all recursive calls from NODE into priority queue depending on how likely we want to recursively inline the call. */ @@ -1488,7 +1449,7 @@ inline_small_functions (void) at once. Consequently we need to update all callee keys. */ if (flag_indirect_inlining) add_new_edges_to_heap (heap, new_indirect_edges); - update_all_callee_keys (heap, where, updated_nodes); + update_callee_keys (heap, where, updated_nodes); } else { @@ -1527,18 +1488,7 @@ inline_small_functions (void) reset_edge_caches (edge-callee); reset_node_growth_cache (callee); - /* We inlined last offline copy to the body. 
This might lead -to callees of function having fewer call sites and thus they -may need updating. - -FIXME: the callee size could also shrink because more information -is propagated from caller. We don't track when this happen and -thus we need to recompute everything all the time. Once this is -solved, || 1 should go away. */ - if (callee-global.inlined_to || 1) - update_all_callee_keys (heap, callee, updated_nodes); - else - update_callee_keys (heap, edge-callee, updated_nodes); + update_callee_keys (heap, edge-callee, updated_nodes); } where = edge-caller; if (where-global.inlined_to) @@ -1551,11 +1501,6 @@ inline_small_functions (void) called by function we inlined (since number of it inlinable callers might change). */ update_caller_keys (heap, where, updated_nodes, NULL); - - /* We removed one call of the function we just inlined. If offline -copy is still needed, be sure to update the keys. */ - if (callee != where !callee-global.inlined_to) -update_caller_keys (heap, callee, updated_nodes, NULL); bitmap_clear (updated_nodes); if (dump_file)
[PATCH, i386]: Avoid movaps size optimizations for TARGET_AVX
Hello! There is no point to emit vmovaps instead of vmovapd or vmovdqa, these instructions have same sizes. Attached patch fixes this oversight for TARGET_AVX. 2012-05-11 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (*movti_internal_rex64): Avoid MOVAPS size optimization for TARGET_AVX. (*movti_internal_sse): Ditto. (*movdi_internal_rex64): Handle TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL. (*movdi_internal): Ditto. (*movsi_internal): Ditto. (*movtf_internal): Avoid MOVAPS size optimization for TARGET_AVX. (*movdf_internal_rex64): Ditto. (*movfd_internal): Ditto. (*movsf_internal): Ditto. * config/i386/sse.md (movmode): Handle TARGET_SSE_LOAD0_BY_PXOR. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: i386.md === --- i386.md (revision 187372) +++ i386.md (working copy) @@ -1890,12 +1890,15 @@ (set (attr mode) (cond [(eq_attr alternative 0,1) (const_string DI) - (ior (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) - (match_test optimize_function_for_size_p (cfun))) + (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) (const_string V4SF) (and (eq_attr alternative 4) (match_test TARGET_SSE_TYPELESS_STORES)) (const_string V4SF) + (match_test TARGET_AVX) +(const_string TI) + (match_test optimize_function_for_size_p (cfun)) +(const_string V4SF) ] (const_string TI)))]) @@ -1943,13 +1946,15 @@ [(set_attr type sselog1,ssemov,ssemov) (set_attr prefix maybe_vex) (set (attr mode) - (cond [(ior (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) - (match_test optimize_function_for_size_p (cfun))) + (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) (const_string V4SF) (and (eq_attr alternative 2) (match_test TARGET_SSE_TYPELESS_STORES)) (const_string V4SF) - (not (match_test TARGET_SSE2)) + (match_test TARGET_AVX) +(const_string TI) + (ior (not (match_test TARGET_SSE2)) + (match_test optimize_function_for_size_p (cfun))) (const_string V4SF) ] (const_string TI)))]) @@ -1970,8 +1975,11 @@ return movdq2q\t{%1, %0|%0, %1}; case TYPE_SSEMOV: - 
if (get_attr_mode (insn) == MODE_TI) + if (get_attr_mode (insn) == MODE_V4SF) + return %vmovaps\t{%1, %0|%0, %1}; + else if (get_attr_mode (insn) == MODE_TI) return %vmovdqa\t{%1, %0|%0, %1}; + /* Handle broken assemblers that require movd instead of movq. */ if (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])) return %vmovd\t{%1, %0|%0, %1}; @@ -2048,7 +2056,20 @@ (if_then_else (eq_attr alternative 10,11,12,13,14,15) (const_string maybe_vex) (const_string orig))) - (set_attr mode SI,DI,DI,DI,SI,DI,DI,DI,DI,DI,TI,DI,TI,DI,DI,DI,DI,DI)]) + (set (attr mode) + (cond [(eq_attr alternative 0,4) + (const_string SI) + (eq_attr alternative 10,12) + (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) + (const_string V4SF) +(match_test TARGET_AVX) + (const_string TI) +(match_test optimize_function_for_size_p (cfun)) + (const_string V4SF) + ] + (const_string TI)) + ] + (const_string DI)))]) ;; Reload patterns to support multi-word load/store ;; with non-offsetable address. @@ -2142,7 +2163,7 @@ case MODE_DI: return %vmovq\t{%1, %0|%0, %1}; case MODE_V4SF: - return movaps\t{%1, %0|%0, %1}; + return %vmovaps\t{%1, %0|%0, %1}; case MODE_V2SF: return movlps\t{%1, %0|%0, %1}; default: @@ -2189,7 +2210,22 @@ (if_then_else (eq_attr alternative 5,6,7,8) (const_string maybe_vex) (const_string orig))) - (set_attr mode DI,DI,DI,DI,DI,TI,DI,TI,DI,V4SF,V2SF,V4SF,V2SF,DI,DI)]) + (set (attr mode) + (cond [(eq_attr alternative 9,11) + (const_string V4SF) + (eq_attr alternative 10,12) + (const_string V2SF) + (eq_attr alternative 5,7) + (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) + (const_string V4SF) +(match_test TARGET_AVX) + (const_string TI) +(match_test optimize_function_for_size_p (cfun)) +
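The priority order that the new mode attribute for *movti_internal_sse encodes can be restated as plain C. This is only a sketch of the decision cascade as it reads in the patch: struct tune_flags, pick_ti_move_mode and the SKETCH_MODE_* values are invented names, and treating alternative 2 as "the store alternative" is an assumption made here for readability.

```c
#include <stdbool.h>

enum sketch_mode { SKETCH_MODE_V4SF, SKETCH_MODE_TI };

/* Hypothetical bundle of the target and tuning tests used by the cond. */
struct tune_flags {
    bool packed_single_insn_optimal; /* TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL */
    bool typeless_stores;            /* TARGET_SSE_TYPELESS_STORES */
    bool avx;                        /* TARGET_AVX */
    bool sse2;                       /* TARGET_SSE2 */
    bool optimize_size;              /* optimize_function_for_size_p (cfun) */
};

/* Pick the insn mode; earlier tests win, mirroring the order of the cond. */
static enum sketch_mode
pick_ti_move_mode(const struct tune_flags *f, bool is_store)
{
    if (f->packed_single_insn_optimal)
        return SKETCH_MODE_V4SF;
    if (is_store && f->typeless_stores)
        return SKETCH_MODE_V4SF;
    if (f->avx)
        return SKETCH_MODE_TI;   /* VEX encodings have the same size */
    if (!f->sse2 || f->optimize_size)
        return SKETCH_MODE_V4SF; /* movaps is shorter without VEX */
    return SKETCH_MODE_TI;
}
```

The effect of the patch shows up in the third test: with AVX enabled, optimizing for size no longer selects V4SF, because the size saving only exists for the legacy encodings.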
[C++ Patch] PR 53305
Hi, an ICE on invalid (per Daniel's analysis): when r is NULL_TREE the next DECL_CONTEXT (r) can only crash. Plus a garbled error message because pp_cxx_simple_type_specifier doesn't handle BOUND_TEMPLATE_TEMPLATE_PARM. Tested x86_64-linux. Thanks, Paolo. /// /cp 2012-05-11 Paolo Carlini paolo.carl...@oracle.com PR c++/53305 * pt.c (tsubst_copy: case PARM_DECL): Return error_mark_node if tsubst_decl returns NULL_TREE. * cxx-pretty-print.c (pp_cxx_simple_type_specifier): Handle BOUND_TEMPLATE_TEMPLATE_PARM. /testsuite 2012-05-11 Paolo Carlini paolo.carl...@oracle.com PR c++/53305 * g++.dg/cpp0x/variadic132.C: New. Index: testsuite/g++.dg/cpp0x/variadic132.C === --- testsuite/g++.dg/cpp0x/variadic132.C(revision 0) +++ testsuite/g++.dg/cpp0x/variadic132.C(revision 0) @@ -0,0 +1,27 @@ +// PR c++/53305 +// { dg-do compile { target c++11 } } + +templateclass... Ts struct tuple { }; + +struct funct +{ + templateclass... argTs + int operator()(argTs...); +}; + +templateclass... class test; + +templatetemplate class... class tp, +class... arg1Ts, class... arg2Ts +class testtparg1Ts..., tparg2Ts... +{ + templateclass func, class...arg3Ts +auto test2(func fun, arg1Ts... arg1s, arg3Ts... arg3s) +- decltype(fun(arg1s..., arg3s...)); +}; + +int main() +{ + testtuple, tuplechar,int t2; + t2.test2(funct(), 'a', 2); // { dg-error no matching function } +} Index: cp/cxx-pretty-print.c === --- cp/cxx-pretty-print.c (revision 187376) +++ cp/cxx-pretty-print.c (working copy) @@ -1261,6 +1261,7 @@ pp_cxx_simple_type_specifier (cxx_pretty_printer * case TEMPLATE_TYPE_PARM: case TEMPLATE_TEMPLATE_PARM: case TEMPLATE_PARM_INDEX: +case BOUND_TEMPLATE_TEMPLATE_PARM: pp_cxx_unqualified_id (pp, t); break; Index: cp/pt.c === --- cp/pt.c (revision 187376) +++ cp/pt.c (working copy) @@ -12084,6 +12084,8 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com not the following PARM_DECLs that are chained to T. 
*/ c = copy_node (t); r = tsubst_decl (c, args, complain); + if (!r) + return error_mark_node; /* Give it the template pattern as its context; its true context hasn't been instantiated yet and this is good enough for mangling. */
[PATCH, alpha]: Fix ICE in alpha_emit_conditional_move, at config/alpha/alpha.c:2649
Hello! Recently testsuite/gcc.c-torture/execute/ieee/pr50310.c started to ICE when compiled with -O3 -mieee on alphaev68-pc-linux-gnu: $ ~/gcc-build-alpha/gcc/cc1 -O3 -mieee -quiet pr50310.c pr50310.c: In function ‘foo’: pr50310.c:31:20: internal compiler error: in alpha_emit_conditional_move, at config/alpha/alpha.c:2649 s3[10 * 4 + i] = __builtin_isunordered (s1[i], s2[i]) ? -1.0 : 0.0; ^ Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. It turned out that UNORDERED and ORDERED RTX codes are not handled in alpha_emit_conditional_move. Attached patch fixes this oversight. 2012-05-11 Uros Bizjak ubiz...@gmail.com * config/alpha/alpha.c (alpha_emit_conditional_branch): Handle ORDERED and UNORDERED conditions. Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu. OK for mainline SVN and release branches? Uros. Index: config/alpha/alpha.c === --- config/alpha/alpha.c(revision 187371) +++ config/alpha/alpha.c(working copy) @@ -2335,7 +2335,7 @@ alpha_emit_conditional_branch (rtx operands[], enu { case EQ: case LE: case LT: case LEU: case LTU: case UNORDERED: - /* We have these compares: */ + /* We have these compares. */ cmp_code = code, branch_code = NE; break; @@ -2572,13 +2572,15 @@ alpha_emit_conditional_move (rtx cmp, enum machine switch (code) { case EQ: case LE: case LT: case LEU: case LTU: + case UNORDERED: /* We have these compares. */ cmp_code = code, code = NE; break; case NE: - /* This must be reversed. */ - cmp_code = EQ, code = EQ; + case ORDERED: + /* These must be reversed. */ + cmp_code = reverse_condition (code), code = EQ; break; case GE: case GT: case GEU: case GTU: @@ -2627,11 +2629,13 @@ alpha_emit_conditional_move (rtx cmp, enum machine switch (code) { case EQ: case LE: case LT: case LEU: case LTU: +case UNORDERED: /* We have these compares: */ break; case NE: - /* This must be reversed. */ +case ORDERED: + /* These must be reversed. 
*/ code = reverse_condition (code); cmov_code = EQ; break;
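The handling added for ORDERED and UNORDERED follows the same pattern as the existing EQ and NE cases, which a small stand-alone model may make easier to see. The enum and function names below are invented for illustration and only cover four condition codes; the real code works on RTX codes and uses reverse_condition.

```c
enum sketch_cond { SK_EQ, SK_NE, SK_ORDERED, SK_UNORDERED };

/* Toy equivalent of reversing a condition for the four codes modelled. */
static enum sketch_cond sketch_reverse(enum sketch_cond c)
{
    switch (c) {
    case SK_EQ:        return SK_NE;
    case SK_NE:        return SK_EQ;
    case SK_ORDERED:   return SK_UNORDERED;
    case SK_UNORDERED: return SK_ORDERED;
    }
    return c;
}

/* Which compare to emit, and which test the conditional move applies
   to the compare result. */
struct sketch_plan {
    enum sketch_cond cmp_code;
    enum sketch_cond cmov_code;
};

static struct sketch_plan sketch_plan_cmov(enum sketch_cond code)
{
    /* EQ and UNORDERED exist as compares: emit them, move on "result != 0". */
    struct sketch_plan p = { code, SK_NE };

    /* NE and ORDERED do not: emit the reversed compare, move on "== 0". */
    if (code == SK_NE || code == SK_ORDERED) {
        p.cmp_code = sketch_reverse(code);
        p.cmov_code = SK_EQ;
    }
    return p;
}
```

For ORDERED this yields an UNORDERED compare followed by a move on equality with zero, which matches the "These must be reversed" branch in the patch.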
Re: [MIPS] Fix misspelled macro in t-vxworks
2012/5/11 Richard Sandiford rdsandif...@googlemail.com: Mingjie Xing mingjie.x...@gmail.com writes: This patch fixes the misspelled macro in t-vxworks. Is it OK? 2012-05-10 Mingjie Xing mingjie.x...@gmail.com * config/mips/t-vxworks: Change MUTLILIB_EXTRA_OPTS to MULTILIB_EXTRA_OPTS. OK, thanks. Richard Committed revision 187392. Regards, Mingjie
Re: [C++ Patch] PR 53305
On Thu, May 10, 2012 at 6:40 PM, Paolo Carlini paolo.carl...@oracle.com wrote: Hi, an ICE on invalid (per Daniel's analysis): when r is NULL_TREE the next DECL_CONTEXT (r) can only crash. Plus a garbled error message because pp_cxx_simple_type_specifier doesn't handle BOUND_TEMPLATE_TEMPLATE_PARM. Tested x86_64-linux. Thanks, Paolo. /// Stylistically, I would write if (r == NULL) or if (r == NULL_TREE) Patch OK with that change. -- Gaby
PATCH: Add RTM support to -march=native
Hi, This patch adds RTM support to -march=native. Tested on Linux/x86-64. OK for trunk? Thanks. H.J. --- 2012-05-10 H.J. Lu hongjiu...@intel.com * config/i386/driver-i386.c (host_detect_local_cpu): Support RTM. diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c index 8fe7ab8..e93e8d9 100644 --- a/gcc/config/i386/driver-i386.c +++ b/gcc/config/i386/driver-i386.c @@ -397,7 +397,7 @@ const char *host_detect_local_cpu (int argc, const char **argv) unsigned int has_pclmul = 0, has_abm = 0, has_lwp = 0; unsigned int has_fma = 0, has_fma4 = 0, has_xop = 0; unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0; - unsigned int has_hle = 0; + unsigned int has_hle = 0, has_rtm = 0; bool arch; @@ -458,6 +458,7 @@ const char *host_detect_local_cpu (int argc, const char **argv) has_bmi = ebx & bit_BMI; has_hle = ebx & bit_HLE; + has_rtm = ebx & bit_RTM; has_avx2 = ebx & bit_AVX2; has_bmi2 = ebx & bit_BMI2; } @@ -731,10 +732,11 @@ const char *host_detect_local_cpu (int argc, const char **argv) const char *sse4_1 = has_sse4_1 ? " -msse4.1" : " -mno-sse4.1"; const char *lzcnt = has_lzcnt ? " -mlzcnt" : " -mno-lzcnt"; const char *hle = has_hle ? " -mhle" : " -mno-hle"; + const char *rtm = has_rtm ? " -mrtm" : " -mno-rtm"; options = concat (options, cx16, sahf, movbe, ase, pclmul, popcnt, abm, lwp, fma, fma4, xop, bmi, bmi2, - tbm, avx, avx2, sse4_2, sse4_1, lzcnt, + tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm, hle, NULL); }
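For readers unfamiliar with the detection step, the mapping from a CPUID result to a driver option can be sketched as below. The SKETCH_BIT_* macros and the two helper functions are names chosen for this example; the bit positions are those of the HLE and RTM flags in EBX of CPUID leaf 7, subleaf 0. The sketch takes the EBX value as a parameter rather than executing cpuid, so it is independent of the host CPU, and it omits the leading space that the driver puts in front of each option when concatenating.

```c
/* Feature bits in EBX for CPUID leaf 7, subleaf 0. */
#define SKETCH_BIT_HLE (1u << 4)
#define SKETCH_BIT_RTM (1u << 11)

/* Map the RTM feature bit to the option passed on to the compiler. */
static const char *rtm_option(unsigned int ebx)
{
    return (ebx & SKETCH_BIT_RTM) ? "-mrtm" : "-mno-rtm";
}

/* Same for HLE, which the driver already handled before this patch. */
static const char *hle_option(unsigned int ebx)
{
    return (ebx & SKETCH_BIT_HLE) ? "-mhle" : "-mno-hle";
}
```

A host that reports both bits would get -mrtm and -mhle; one that reports neither would get the explicit -mno- forms, so the result does not depend on compiler defaults.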
Re: [h8300] increase dwarf address size
On 05/10/2012 11:21 AM, DJ Delorie wrote: Regardless, shouldn't DWARF2_ADDR_SIZE be POINTER_SIZE / BITS_PER_UNIT? That's the default. It doesn't work because pointers are still 16 bits. Something's still not right then. The H8/300 has 16 bit pointers and a 64k address space and all the processors in the family still support that mode. Jeff
Re: [h8300] increase dwarf address size
That's the default. It doesn't work because pointers are still 16 bits. Something's still not right then. The H8/300 has 16 bit pointers and a 64k address space and all the processors in the family still support that mode. The problem is when a single object is more than 64k for the models that have 16 bit pointers. No, they won't work on hardware, but we can't even *compile* them because of the dwarf limitation, which means all the other models can't support such large objects. I.e. the 2-byte dwarf addresses for H8/300 prevent us from supporting C++ on the larger H8/300SX processors.
Re: [h8300] increase dwarf address size
On 05/10/2012 09:55 PM, DJ Delorie wrote: That's the default. It doesn't work because pointers are still 16 bits. Something's still not right then. The H8/300 has 16 bit pointers and a 64k address space and all the processors in the family still support that mode. The problem is when a single object is more than 64k for the models that have 16 bit pointers. No, they won't work on hardware, but we can't even *compile* them because of the dwarf limitation, which means all the other models can't support such large objects. I.e. the 2-byte dwarf addresses for H8/300 prevent us from supporting C++ on the larger H8/300SX processors. Right, so ISTM the way to fix this is to not build the C++ runtime for normal mode, rather than hack up the dwarf address size. Now I know there's currently no way to do that, but that seems to me to be the correct fix. Unfortunately it's going to probably look a whole lot like multilib exceptions... E. Jeff
Re: [h8300] increase dwarf address size
Whereas making dwarf addresses always 32 bits only affects debugging info size (not rom image size) on the oldest and smallest H8/300 variant, where real-world code would have a limited amount of debug information anyway.
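The size limit at issue in this thread is simple arithmetic, shown here as a small sketch. The helper names are hypothetical and this models only the range of an address field, not anything DWARF-specific: a 2-byte address can describe offsets up to 65535, so an object larger than 64k cannot be covered, while a 4-byte address can.

```c
/* Largest value representable in an address field of ADDR_SIZE bytes
   (ADDR_SIZE is expected to be between 1 and 8). */
static unsigned long long max_address(unsigned int addr_size)
{
    if (addr_size >= 8)
        return ~0ULL;
    return (1ULL << (8 * addr_size)) - 1;
}

/* Nonzero if every byte of an object of OBJECT_SIZE bytes placed at
   address 0 can be named with an ADDR_SIZE-byte address. */
static int fits_in_address(unsigned long long object_size,
                           unsigned int addr_size)
{
    return object_size == 0 || object_size - 1 <= max_address(addr_size);
}
```

For example, a 70000-byte object does not fit with 2-byte addresses but does with 4-byte ones, at the cost of larger debug sections only.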
Re: [C++ Patch] PR 53301
On 05/10/2012 05:31 PM, Paolo Carlini wrote: Let's see if we can do something *now* ;) My concrete proposal would be: TYPE_PTRMEM_P rename to TYPE_PTRDATAMEM_P (consistent with TYPE_PTRMEMFUNC_P) TYPE_PTR_TO_MEMBER_P rename to TYPE_PTRMEM_P and then finally #define TYPE_PTR_OR_PTRMEM_P(NODE) \ (TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE)) and use it everywhere. Sounds like an improvement? Sounds pretty good. But I suspect a lot of places want to check TYPE_PTR_P || TYPE_PTRDATAMEM_P (because you can't just use TREE_TYPE to get the function type of a PMF), so this new macro doesn't help with your desire to avoid writing ||. Additionally, we could maybe rename PTRMEM_OK_P to PTRMEMFUNC_OK_P No, that flag applies to both varieties of members. Jason
Re: [patch] support for multiarch systems
On 10.05.2012 08:42, Paolo Bonzini wrote:
> On 09/05/2012 19:19, Matthias Klose wrote:
>> these are referenced from the http://wiki.debian.org/Multiarch/Tuples
>> https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout
>> http://err.no/debian/amd64-multiarch-3
>>
>> http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases
>> for multiarch, and why Debian thinks that the existing approaches are not
>> sufficient (having name collisions for different architectures or ad hoc
>> names for new architectures like libx32). That may be contentious within
>> the Linux community, but I would like to avoid this kind of discussion here.
>
> I don't care about contentiousness, I just would like this to be documented
> somewhere (for example in the internals manual where MULTILIB_* is
> documented too).

OK, I clarified it in the existing documentation of MULTIARCH_DIRNAME in fragments.texi, detailing the search order for the files. Should the search order be mentioned in some user documentation as well? If so, where?

Matthias

Index: doc/fragments.texi
===
--- doc/fragments.texi (revision 187337)
+++ doc/fragments.texi (working copy)
@@ -152,6 +152,52 @@
 of options to be used for all builds. If you set this, you should
 probably set @code{CRTSTUFF_T_CFLAGS} to a dash followed by it.
 
+@findex MULTILIB_OSDIRNAMES
+@item MULTILIB_OSDIRNAMES
+If @code{MULTILIB_OPTIONS} is used, this variable specifies the list
+of OS subdirectory names. The format is either the same as of
+@code{MULTILIB_DIRNAMES}, or a set of mappings. When it is the same
+as @code{MULTILIB_DIRNAMES}, it describes the multilib directories
+using OS conventions, rather than GCC conventions. When it is a set
+of mappings of the form @var{gccdir}=@var{osdir}, the left side gives
+the GCC convention and the right gives the equivalent OS defined
+location. If the @var{osdir} part begins with a @samp{!}, the os
+directory names are used exclusively. Use the mapping when there is
+no one-to-one equivalence between GCC levels and the OS.
+
+For multilib enabled configurations (see @code{MULTIARCH_DIRNAME})
+below), the multilib name is appended to each directory name, separated
+by a colon (e.g. @samp{../lib:x86_64-linux-gnu}).
+
+@findex MULTIARCH_DIRNAME
+@item MULTIARCH_DIRNAME
+If @code{MULTIARCH_DIRNAME} is used, this variable specifies the
+multiarch name for this configuration. For multiarch enabled
+configurations it is used to search libraries, crt files and system
+header files in additional locations.
+
+Libraries and crt files are searched first in
+@var{prefix}/@var{multiarch} before @var{prefix} for each @var{prefix}
+added by @code{add_prefix} or @code{add_sysrooted_prefix}.
+System header files are searched first in
+@code{LOCAL_INCLUDE_DIR}/@var{multiarch} before
+@code{LOCAL_INCLUDE_DIR}, and in
+@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch} before
+@code{NATIVE_SYSTEM_HEADER_DIR}.
+
+E.g. for a multiarch enabled system compiler
+@file{/lib/@var{multiarch}} is searched before @file{/lib} and
+@file{/usr/lib/@var{multiarch}} before @file{/usr/lib}, and system
+header files are searched in @file{/usr/local/include/@var{multiarch}}
+before @file{/usr/local/include} and in
+@file{/usr/include/@var{multiarch}} before @file{/usr/include}.
+
+@code{MULTIARCH_DIRNAME} is not used for multilib enabled
+configurations, but encoded in @code{MULTILIB_OSDIRNAMES} instead.
+
+The multiarch tuples are defined
+in @uref{http://wiki.debian.org/Multiarch/Tuples}.
+
 @findex SPECS
 @item SPECS
 Unfortunately, setting @code{MULTILIB_EXTRA_OPTS} is not enough, since
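To illustrate the search order and the mapping format described in the patch above, here is a minimal Python sketch. It is not GCC's implementation: the function names `multiarch_search_order` and `parse_osdirname` are hypothetical, the example prefixes are taken from the "E.g." paragraph of the patch, and the parsing assumes an entry has the shape `gccdir=osdir[:multiarch]` with an optional leading `!` on `osdir`, as the documentation text suggests.

```python
def multiarch_search_order(prefixes, multiarch):
    """Return directories in the order the patch text describes:
    prefix/multiarch is tried before prefix, for each prefix in turn.
    (Hypothetical helper for illustration only.)"""
    order = []
    for prefix in prefixes:
        base = prefix.rstrip("/")
        if multiarch:
            # The multiarch subdirectory comes first ...
            order.append(f"{base}/{multiarch}")
        # ... followed by the plain prefix.
        order.append(base)
    return order


def parse_osdirname(entry):
    """Split one mapping of the assumed form gccdir=osdir[:multiarch].
    A leading '!' on osdir marks the OS directory names as exclusive."""
    gccdir, _, rest = entry.partition("=")
    exclusive = rest.startswith("!")
    if exclusive:
        rest = rest[1:]
    osdir, _, multiarch = rest.partition(":")
    return {
        "gccdir": gccdir,
        "osdir": osdir,
        "multiarch": multiarch or None,
        "exclusive": exclusive,
    }


if __name__ == "__main__":
    for d in multiarch_search_order(["/lib", "/usr/lib"], "x86_64-linux-gnu"):
        print(d)
    print(parse_osdirname("m64=../lib:x86_64-linux-gnu"))
```

With the prefixes `/lib` and `/usr/lib`, the first function lists `/lib/x86_64-linux-gnu` ahead of `/lib` and `/usr/lib/x86_64-linux-gnu` ahead of `/usr/lib`, matching the example in the documentation text. The `m64` key in the usage line is an assumed multilib name, not one taken from the patch.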
Go patch committed: Remove incorrect ChangeLog entry
As described in gcc/go/README.gcc, the files in gcc/go/gofrontend are copied from http://code.google.com/p/gofrontend . Changes should be committed there first and then mirrored to the GCC repository. Also, changes committed to that repository are not listed in gcc/go/ChangeLog.

A recent patch updated gogo-tree.cc for the change in the name of cgraph_finalize_compilation_unit. I have applied that patch to the gofrontend repository, and I have applied this patch to the GCC repository to remove the incorrect ChangeLog entry. Note that changes to ChangeLog files do not themselves get ChangeLog entries.

Ian

Index: gcc/go/ChangeLog
===
--- gcc/go/ChangeLog (revision 187393)
+++ gcc/go/ChangeLog (working copy)
@@ -14,10 +14,6 @@
 	* gccgo.texi (Invoking gccgo): Document -fgo-pkgpath. Update the
 	docs for -fgo-prefix.
 
-2012-04-30 Jan Hubicka j...@suse.cz
-
-	* gogo-tree.cc (Gogo::write_globals): Use finalize_compilation_unit.
-
 2012-04-23 Ian Lance Taylor i...@google.com
 
 	* go-lang.c (go_langhook_init): Set MPFR precision to 256.