Re: [patch] support for multiarch systems

2012-05-10 Thread Paolo Bonzini
On 09/05/2012 19:19, Matthias Klose wrote:
 these are referenced from the http://wiki.debian.org/Multiarch/Tuples
 https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout
 http://err.no/debian/amd64-multiarch-3
 
 http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases for
 multiarch, and why Debian thinks that the existing approaches are not
 sufficient (having name collisions for different architectures or ad hoc
 names for new architectures like libx32).  That may be contentious within
 the Linux community, but I would like to avoid this kind of discussion here.

I don't care about contentiousness, I just would like this to be
documented somewhere (for example in the internals manual where
MULTILIB_* is documented too).

Paolo


Re: [C Patch]: pr52543

2012-05-10 Thread Paolo Bonzini
On 30/03/2012 12:08, Richard Sandiford wrote:
  +   There are two useful preprocessor defines for use by maintainers:  
  +
  +   #define LOG_COSTS
  +
  +   if you wish to see the actual cost estimates that are being used
  +   for each mode wider than word mode and the cost estimates for zero
  +   extension and the shifts.   This can be useful when port maintainers 
  +   are tuning insn rtx costs.
  +
  +   #define FORCE_LOWERING
  +
  +   if you wish to test the pass with all the transformation forced on.
  +   This can be useful for finding bugs in the transformations.
 Must admit I'm not keen on these kinds of macro, but it's Ian's call.

Indeed, LOG_COSTS should be (dump_flags & TDF_DETAILS) != 0, and perhaps
FORCE_LOWERING should be a -f flag (like -flower-all-subregs) or a --param.
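
A minimal sketch of the suggestion above, assuming a pass-local dump stream and flag word. The `sketch_` names and the bit value used for the details flag are illustrative only; they stand in for GCC's dump_file, dump_flags and TDF_DETAILS and are not their real definitions:

```c
#include <stdio.h>

/* Hypothetical stand-ins for the pass's dump state.  The bit value
   below is made up for this sketch.  */
#define SKETCH_TDF_DETAILS (1u << 3)

static FILE *sketch_dump_file;
static unsigned int sketch_dump_flags;

/* Nonzero if cost logging is wanted.  This is decided at run time from
   the dump flags rather than by rebuilding with LOG_COSTS defined.  */
int
sketch_log_costs_p (void)
{
  return sketch_dump_file != NULL
         && (sketch_dump_flags & SKETCH_TDF_DETAILS) != 0;
}

/* Log one cost estimate if detailed dumping is enabled.  Returns 1 if
   a line was written, 0 otherwise.  */
int
sketch_log_cost (const char *what, int cost)
{
  if (!sketch_log_costs_p ())
    return 0;
  fprintf (sketch_dump_file, "%s cost: %d\n", what, cost);
  return 1;
}
```

With this shape, a port maintainer would only need to ask for a detailed dump of the pass to see the cost estimates; no recompilation of the compiler is involved.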

Paolo


Re: [C Patch]: pr52543

2012-05-10 Thread Paolo Bonzini
On 10/05/2012 08:45, Paolo Bonzini wrote:
 On 30/03/2012 12:08, Richard Sandiford wrote:
 +   There are two useful preprocessor defines for use by maintainers:  
 +
 +   #define LOG_COSTS
 +
 +   if you wish to see the actual cost estimates that are being used
 +   for each mode wider than word mode and the cost estimates for zero
 +   extension and the shifts.   This can be useful when port maintainers 
 +   are tuning insn rtx costs.
 +
 +   #define FORCE_LOWERING
 +
 +   if you wish to test the pass with all the transformation forced on.
 +   This can be useful for finding bugs in the transformations.
 Must admit I'm not keen on these kinds of macro, but it's Ian's call.
 
 Indeed, LOG_COSTS should be (dump_flags & TDF_DETAILS) != 0, and perhaps
 FORCE_LOWERING should be a -f flag (like -flower-all-subregs) or a --param.

Not sure how this got sent a month after I wrote it (and decided not to
send it). :)

Paolo


Re: [PATCH] Optimize byte_from_pos, pos_from_bit

2012-05-10 Thread Richard Guenther
On Wed, 9 May 2012, Eric Botcazou wrote:

  This optimizes byte_from_pos and pos_from_bit by noting that we operate
  on sizes whose computations have no intermediate (or final) overflow.
  This is the single patch necessary to get Ada to bootstrap and test
  with TYPE_IS_SIZETYPE removed.  Rather than amending size_binop
  (my original plan) I chose to optimize the above two commonly used
  accessors.
 
  Conveniently normalize_offset can be re-written to use pos_from_bit
  instead of inlining it.  I also took the liberty to document the
  functions (sic).
 
 Nice, thanks.  Could you add a blurb, in the head comment of the first
 function in which you operate under the no-overflow assumption, stating
 this fact and why this is necessary (an explicit mention of Ada isn't
 forbidden ;-), as well as a cross-reference to it in the head comment of
 the other function(s).

Like this?

Thanks,
Richard.

2012-05-10  Richard Guenther  rguent...@suse.de

* stor-layout.c (byte_from_pos): Amend comment.

Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c   (revision 187362)
+++ gcc/stor-layout.c   (working copy)
@@ -798,7 +798,11 @@ bit_from_pos (tree offset, tree bitpos)
 }
 
 /* Return the combined truncated byte position for the byte offset OFFSET and
-   the bit position BITPOS.  */
+   the bit position BITPOS.
+   These functions operate on byte and bit positions as present in FIELD_DECLs
+   and it assumes that expressions result in no (intermediate) overflow.
+   This assumption is necessary to optimize these values as much as possible,
+   especially to make Ada happy.  */
 
 tree
 byte_from_pos (tree offset, tree bitpos)


Re: [patch] Fix LTO regression in Ada

2012-05-10 Thread Richard Guenther
On Wed, May 9, 2012 at 10:38 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Hi,

 this is a regression present on mainline and 4.7 branch.  On the attached
 testcase, the compiler aborts in LTO mode with:

 eric@atlantis:~/build/gcc/native32 gcc/xgcc -Bgcc -S lto11.adb -O -flto
 +===GNAT BUG DETECTED==+
 | 4.8.0 20120506 (experimental) [trunk revision 187216] (i586-suse-linux)
 | tree code 'call_expr' is not supported in LTO streams

 The problem is that the Ada compiler started to use DECL_ORIGINAL_TYPE in
 4.7.x and the type in this field can have arbitrary expressions as TYPE_SIZE,
 for example expressions with CALL_EXPRs.  Now the type is both not gimplified
 and streamed in LTO mode, so the CALL_EXPRs are sent to the streamer as-is.

 The immediate solution would be not to stream DECL_ORIGINAL_TYPE (and clear it
 in free_lang_data_in_decl), but this yields a regression in C++ with -flto -g
 (ICE in splice_child_die).  Therefore, the patch implements the alternate
 solution of gimplifying DECL_ORIGINAL_TYPE.

 Bootstrapped/regtested on x86_64-suse-linux, OK for mainline and 4.7 branch?

Hmm, but we will not possibly refer to the sizes therein, so emitting
stmts for them looks pointless ... (and in fact the debug information
would be odd, too ... what does dwarf2out.c do with these CALL_EXPRs when
generating debug information without LTO?)

Anyway, the idea is reasonable, but I'm not sure ending up with calls in those
sizes makes sense (don't we make sure to inline them all at some point?)

Thanks,
Richard.


 2012-05-09  Eric Botcazou  ebotca...@adacore.com

        * gimplify.c (gimplify_decl_expr): For a TYPE_DECL, gimplify the
        DECL_ORIGINAL_TYPE if it is present.


 2012-05-09  Eric Botcazou  ebotca...@adacore.com

        * gnat.dg/lto11.ad[sb]: New test.


 --
 Eric Botcazou


Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-10 Thread Richard Guenther
On Wed, 9 May 2012, Eric Botcazou wrote:

  This removes the TYPE_IS_SIZETYPE macro and all its uses (by
  assuming it returns zero and applying trivial folding).  Sizes
  and bitsizes can still be treated specially by means of knowing
  what the values represent and by means of using helper functions
  that assume you are dealing with sizes (in particular size_binop
  and friends and bit_from_pos, byte_from_pos or pos_from_bit).
 
 Fine with me, if you add the blurb I talked about in the other reply.
 
  Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
  including Ada with the patch optimizing byte_from_pos and pos_from_bit
 
 Results on our internal testsuite are clean on x86-64 and almost clean on
 x86, an exception being:
 
 package t is
 type x (m : natural) is record
 s : string (1 .. m);
 r : natural;
 b : boolean;
 end record;
 for x'alignment use 4;
 
 pragma Pack (x);
 end t;
 
 Without the patches, compiling the package with -gnatR3 yields:
 
 Representation information for unit t (spec)
 
 
 for x'Object_Size use 17179869248;
 for x'Value_Size use  ((#1 + 8) * 8) ;
 for x'Alignment use 4;
 for x use record
m at  0 range  0 .. 30;
s at  4 range  0 ..  ((#1 * 8))  - 1;
r at bit offset (((#1 + 4) * 8))  size in bits = 31
b at bit offset ((((#1 + 7) * 8) + 7))  size in bits = 1
 end record;
 
 With the patches, this yields:
 
 Representation information for unit t (spec)
 
 
 for x'Object_Size use 17179869248;
 for x'Value_Size use  (((#1 + 7) + 1) * 8) ;
 for x'Alignment use 4;
 for x use record
m at  0 range  0 .. 30;
s at  4 range  0 ..  ((#1 * 8))  - 1;
r at bit offset (((#1 + 4) * 8))  size in bits = 31
b at bit offset ((((#1 + 7) * 8) + 7))  size in bits = 1
 end record;
 
 so we have lost a simple folding for x'Value_Size (TYPE_ADA_SIZE field).

That's interesting.  It is always safe to fold (x + 7) + 1 to
(x + 8), independent of whether overflow is defined or not.  So this
looks like a genuine missed folding (I think that the combiner
in tree-ssa-forwprop.c catches this).  Or is the above not showing
casts in the expression?  Folding would not be valid for
(unsigned)(signed X + 7) + 1.

  2012-05-08  Richard Guenther  rguent...@suse.de
 
  ada/
  * gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use.
 
 OK, modulo the formatting:

Adjusted and applied.

Thanks,
Richard.

  Index: trunk/gcc/ada/gcc-interface/cuintp.c
  ===
  *** trunk.orig/gcc/ada/gcc-interface/cuintp.c   2011-04-11 
  17:01:30.0
  +0200 --- trunk/gcc/ada/gcc-interface/cuintp.c  2012-05-07
  16:43:43.497218058 +0200 *** UI_From_gnu (tree Input)
  *** 178,186 
  if (host_integerp (Input, 0))
return UI_From_Int (TREE_INT_CST_LOW (Input));
  else if (TREE_INT_CST_HIGH (Input) < 0
  !   && TYPE_UNSIGNED (gnu_type)
  !   && !(TREE_CODE (gnu_type) == INTEGER_TYPE
  !        && TYPE_IS_SIZETYPE (gnu_type)))
return No_Uint;
#endif
 
  --- 178,184 
  if (host_integerp (Input, 0))
return UI_From_Int (TREE_INT_CST_LOW (Input));
  else if (TREE_INT_CST_HIGH (Input) < 0
  !   && TYPE_UNSIGNED (gnu_type))
return No_Uint;
#endif
 
  TYPE_UNSIGNED (gnu_type)) on the same line.


Re: [Patch / RFC] Improving more locations for binary expressions

2012-05-10 Thread Manuel López-Ibáñez
On 10 May 2012 07:55, Miles Bader mi...@gnu.org wrote:
 Paolo Carlini paolo.carl...@oracle.com writes:
 in case my message ends up garbled, the carets do not point to 
 (column 13), two times point to b (column 20), which is obviously
 wrong. In other terms, all the columns are 20, all wrong.

 The new caret support does seem to have revealed a bunch of places
 where the column info in error messages was pretty screwy... I guess
 nobody paid much attention to it before... :]

 Should these get reported as bugzilla bugs...?

In principle, yes. In practice, there are already so many known issues
that adding more would just waste contributors' time on bugzilla
administration.

So help would be very much appreciated. In particular, the C FE and the
preprocessor are in much worse shape in terms of locations than the
C++ FE.

Some issues may be hard, but many of them are a matter of setting a
breakpoint at the error, going up the frame, and figuring out where
the correct location could be obtained from, then passing it down to the
error so it can use the correct location. If you can figure that out
but can't/won't write a patch, then please open a PR.

Cheers,

Manuel.


Re: [PATCH libcpp]: Avoid crash in interpret_float_suffix

2012-05-10 Thread Tristan Gingold

On May 8, 2012, at 5:39 PM, Tom Tromey wrote:

 Tristan == Tristan Gingold ging...@adacore.com writes:
 
 Tristan 2012-05-04  Tristan Gingold  ging...@adacore.com
 Tristan  * expr.c (interpret_float_suffix): Add a guard.
 
 Ok.

Thanks, now committed.



Re: Bug 53289 - unnecessary repetition of caret diagnostics

2012-05-10 Thread Richard Guenther
On Wed, May 9, 2012 at 11:02 PM, Manuel López-Ibáñez
lopeziba...@gmail.com wrote:
 Simple enough. Bootstrapped and regression tested.

 The output for the example in the PR is now:

 /home/manuel/caret-overload.C:6:6: error: no matching function for
 call to ‘g(int)’
   g(1);
      ^
 /home/manuel/caret-overload.C:6:6: note: candidate is:
 /home/manuel/caret-overload.C:2:18: note: template<class T> typename
 T::type g(T)
  typename T::type g(T);
                  ^

Does it make sense to print a caret here?  We are dumping a function
decl(?), thus already constraining what we print to exactly what is
important.  So - maybe simply never emit a caret for %D locations?

 /home/manuel/caret-overload.C:2:18: note:   template argument
 deduction/substitution failed:
 /home/manuel/caret-overload.C: In substitution of ‘template<class T>
 typename T::type g(T) [with T = int]’:
 /home/manuel/caret-overload.C:6:6:   required from here
 /home/manuel/caret-overload.C:2:18: error: ‘int’ is not a class,
 struct, or union type

 OK?

 2012-05-09  Manuel López-Ibáñez  m...@gcc.gnu.org

        PR c++/53289
 gcc/
        * diagnostic.h (diagnostic_context): Add last_location.
        * diagnostic.c (diagnostic_initialize): Initialize it.
        (diagnostic_show_locus): Use it.


Re: [C++ Patch] fix semi-random template specialization ICE

2012-05-10 Thread Dodji Seketeli
Alexandre Oliva aol...@redhat.com wrote:

[...]

 Anyway, the problem is that, for some unsuitable candidate template
 specializations, tsubst returns error_mark_node, which tsubst_decl
 stores in argvec, and later on register_specialization gets this
 error_mark_node and tries to access it as a tree_vec.

 The trivial patch that avoids the misbehavior is returning
 error_mark_node as soon as we get that for argvec.  Bootstrapped on
 i686-pc-linux-gnu and x86_64-linux-gnu, regstrapped on the latter.

 Ok to install?

FYI, this has been reported as
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53209, so you might add a
reference to that bug in the ChangeLog.

Other than that, I cannot approve or reject this patch, but FWIW, it
looks fine to me.

Let's CC jason.

 for  gcc/cp/ChangeLog
 from  Alexandre Oliva  aol...@redhat.com

   * pt.c (tsubst_decl): Bail out if argvec is error_mark_node.

 Index: gcc/cp/pt.c
 ===
 --- gcc/cp/pt.c.orig  2012-04-30 15:34:44.018432544 -0300
 +++ gcc/cp/pt.c   2012-04-30 15:34:47.988375071 -0300
 @@ -10626,6 +10626,8 @@ tsubst_decl (tree t, tree args, tsubst_f
   tmpl = DECL_TI_TEMPLATE (t);
   gen_tmpl = most_general_template (tmpl);
   argvec = tsubst (DECL_TI_ARGS (t), args, complain, in_decl);
 + if (argvec == error_mark_node)
 +   RETURN (error_mark_node);
   hash = hash_tmpl_and_args (gen_tmpl, argvec);
   spec = retrieve_specialization (gen_tmpl, argvec, hash);
 }

Thanks.

-- 
Dodji


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-10 Thread Richard Guenther
On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com wrote:
 Bummer.  I was thinking to reserve '=' for selective  dumping:

 -fdump-tree-pre=func_list_regexp

 I guess this can be achieved via @

 -fdump-tree-pre@func_list

 -fdump-tree-pre=file_name@func_list


 Another issue -- I don't think the current precedence rule is correct.
 Consider that -fopt-info=2 will be mapped to

 -fdump-tree-all-transform-verbose2=stderr
 -fdump-rtl-all-transform-verbose2=stderr

 then

 the current precedence rule will cause surprise when the following is used

 -fopt-info -fdump-tree-pre

 The PRE dump will be emitted to stderr which is not what user wants.
 In short, special streams should be treated as 'weak' the same way as
 your previous implementation.

Hm, this raises a similar concern I have with the -fvectorizer-verbose flag.
With -fopt-info -fdump-tree-pre I do not want some information to be
present only on stderr or in the dump file!  I want it in _both_ places!
(-fvectorizer-verbose makes the -fdump-tree-vect dump contain less
information :()

Thus, deciding where a dump goes has to be handled differently
(which is why I asked for some re-org originally, so that passes no
longer explicitly reference dump_file - dump_file may be different
for different kinds of information it dumps!).  Passes should, instead of

  fprintf (dump_file, ..., ...)

do

 dump_printf (TDF_scev, ..., ...)

thus, specify the kind of information they dump (would be mostly
TDF_details vs. 0 today I guess).  The dump_printf routine would
then properly direct to one or more places to dump at.
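
To make the proposed dispatch concrete, here is a rough sketch under stated assumptions. Every `sketch_` name, the message-kind constants and the fixed-size sink table are invented for illustration; this is not the interface GCC has or was going to get, only one way a kind-tagged dump_printf could send the same message to more than one destination:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message kinds, loosely modelled on the TDF_* idea.  */
enum
{
  SKETCH_MSG_DETAILS = 1 << 0,
  SKETCH_MSG_OPT_INFO = 1 << 1
};

struct sketch_sink
{
  FILE *stream;        /* Where to write.  */
  unsigned int kinds;  /* Mask of message kinds this sink accepts.  */
};

#define SKETCH_MAX_SINKS 4
static struct sketch_sink sketch_sinks[SKETCH_MAX_SINKS];
static int sketch_n_sinks;

/* Register STREAM as a destination for messages matching KINDS.
   Returns 0 on success, -1 if the table is full.  */
int
sketch_add_sink (FILE *stream, unsigned int kinds)
{
  if (sketch_n_sinks == SKETCH_MAX_SINKS)
    return -1;
  sketch_sinks[sketch_n_sinks].stream = stream;
  sketch_sinks[sketch_n_sinks].kinds = kinds;
  sketch_n_sinks++;
  return 0;
}

/* Print a message of kind KIND to every sink that accepts it.
   Returns the number of sinks written to.  */
int
sketch_dump_printf (unsigned int kind, const char *fmt, ...)
{
  int i, written = 0;

  for (i = 0; i < sketch_n_sinks; i++)
    if (sketch_sinks[i].kinds & kind)
      {
        va_list ap;
        va_start (ap, fmt);
        vfprintf (sketch_sinks[i].stream, fmt, ap);
        va_end (ap);
        written++;
      }
  return written;
}
```

In this sketch a pass's private dump file would be registered for every kind, while a stream standing in for -fopt-info would be registered only for the optimization-info kind, so such a message reaches both places instead of only one.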

I realize this needs some more dispatchers for dumping expressions
and statements (but it should not be too many).  Dumping to
dump_file would in any case dump to the pass's private dump file
only (unqualified stuff would never be useful for -fopt-info).

The perfect candidate to convert to this kind of scheme is obviously
the vectorizer with its existing -fvectorizer-verbose.

If the patch doesn't work towards this kind of end-result I'd rather
not have it.

Thanks,
Richard.

 thanks,

 David



 On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote:
 Thanks for your suggestions/comments. I have updated the patch and
 documentation. It supports the following usage:

 gcc  -fdump-tree-all=tree.dump -fdump-tree-pre=stdout
 -fdump-rtl-ira=ira.dump

 Here all tree dumps except the PRE are output into tree.dump, PRE dump
 goes to stdout and the IRA dump goes to ira.dump.

 Thanks,
 Sharad

 2012-05-09   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for the new option.
        * tree-dump.c (dump_get_standard_stream): New function.
        (dump_files): Update for new field.
        (dump_switch_p_1): Handle dump filenames.
        (dump_begin): Likewise.
        (get_dump_file_name): Likewise.
        (dump_end): Remove attribute.
        (dump_enable_all): Add new parameter FILENAME.
        All callers updated.
        (enable_rtl_dump_file):
        * tree-pass.h (enum tree_dump_index): Add new constant.
        (struct dump_file_info): Add new field FILENAME.
        * testsuite/g++.dg/other/dump-filename-1.C: New test.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187265)
 +++ doc/invoke.texi     (working copy)
 @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio

  @item -d@var{letters}
  @itemx -fdump-rtl-@var{pass}
 +@itemx -fdump-rtl-@var{pass}=@var{filename}
  @opindex d
  Says to make debugging dumps during compilation at times specified by
  @var{letters}.  This is used for debugging the RTL-based passes of the
  compiler.  The file names for most of the dumps are made by appending
  a pass number and a word to the @var{dumpname}, and the files are
 -created in the directory of the output file.  Note that the pass
 -number is computed statically as passes get registered into the pass
 -manager.  Thus the numbering is not related to the dynamic order of
 -execution of passes.  In particular, a pass installed by a plugin
 -could have a number over 200 even if it executed quite early.
 -@var{dumpname} is generated from the name of the output file, if
 -explicitly specified and it is not an executable, otherwise it is the
 -basename of the source file. These switches may have different effects
 -when @option{-E} is used for preprocessing.
 +created in the directory of the output file. If the
 +@option{=@var{filename}} is appended to the longer form of the dump
 +option then the dump is done on that file instead of numbered
 +files. Note that the pass number is computed statically as passes get
 +registered into the pass manager.  Thus the numbering is not related
 +to the dynamic order of execution of passes.  In particular, a pass
 +installed by a plugin could have a number over 200 even if it executed
 +quite early.  @var{dumpname} is generated from the name 

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-10 Thread Sharad Singhai
Okay, I have restored the original behavior where standard streams
are considered weak. Thus, in case of a conflict, the standard streams
have lower precedence. For example,

   gcc -O2 -fdump-tree-pre=stdout -fdump-tree-pre ...

writes the PRE dump to the auto-numbered file, since stdout has lower
precedence. This also works as expected:

   gcc -O2 -fdump-tree-pre=pre.txt -fdump-tree-all=stderr ...

It writes the PRE dump to pre.txt while the remaining tree dumps go
to stderr. Does it look okay?
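
The precedence rule described above could be modelled roughly as follows. This is a sketch and not the code in the patch: the `sketch_` types are hypothetical, a NULL name stands for the default auto-numbered file, and letting a later file request override an earlier one is an assumption the message does not spell out:

```c
#include <stddef.h>

/* Hypothetical classification of a dump destination.  */
enum sketch_dest_kind
{
  SKETCH_DEST_NONE,        /* Dump not requested yet.  */
  SKETCH_DEST_STD_STREAM,  /* stdout or stderr: "weak".  */
  SKETCH_DEST_FILE         /* Named or auto-numbered file: "strong".  */
};

struct sketch_dump_dest
{
  enum sketch_dest_kind kind;
  const char *name;  /* NULL means the auto-numbered file.  */
};

/* Apply one command-line request to D.  A standard stream never
   displaces a file that was already selected; anything else replaces
   the current destination.  */
void
sketch_set_dest (struct sketch_dump_dest *d, enum sketch_dest_kind kind,
                 const char *name)
{
  if (kind == SKETCH_DEST_STD_STREAM && d->kind == SKETCH_DEST_FILE)
    return;
  d->kind = kind;
  d->name = name;
}
```

Under this model both command lines above behave as described: the PRE dump ends up in a file either way, regardless of the order in which the file and the standard stream were requested.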

Thanks,
Sharad


2012-05-09   Sharad Singhai  sing...@google.com

* doc/invoke.texi: Add documentation for the new option.
* tree-dump.c (dump_get_standard_stream): New function.
(dump_files): Update for new field.
(dump_switch_p_1): Handle dump filenames.
(dump_begin): Likewise.
(get_dump_file_name): Likewise.
(dump_end): Remove attribute.
(dump_enable_all): Add new parameter FILENAME.
All callers updated.
* tree-pass.h (enum tree_dump_index): Add new constant.
(struct dump_file_info): Add new field FILENAME.
* testsuite/g++.dg/other/dump-filename-1.C: New test.

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 187265)
+++ doc/invoke.texi (working copy)
@@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio

 @item -d@var{letters}
 @itemx -fdump-rtl-@var{pass}
+@itemx -fdump-rtl-@var{pass}=@var{filename}
 @opindex d
 Says to make debugging dumps during compilation at times specified by
 @var{letters}.  This is used for debugging the RTL-based passes of the
 compiler.  The file names for most of the dumps are made by appending
 a pass number and a word to the @var{dumpname}, and the files are
-created in the directory of the output file.  Note that the pass
-number is computed statically as passes get registered into the pass
-manager.  Thus the numbering is not related to the dynamic order of
-execution of passes.  In particular, a pass installed by a plugin
-could have a number over 200 even if it executed quite early.
-@var{dumpname} is generated from the name of the output file, if
-explicitly specified and it is not an executable, otherwise it is the
-basename of the source file. These switches may have different effects
-when @option{-E} is used for preprocessing.
+created in the directory of the output file. If the
+@option{=@var{filename}} is appended to the longer form of the dump
+option then the dump is done on that file instead of numbered
+files. Note that the pass number is computed statically as passes get
+registered into the pass manager.  Thus the numbering is not related
+to the dynamic order of execution of passes.  In particular, a pass
+installed by a plugin could have a number over 200 even if it executed
+quite early.  @var{dumpname} is generated from the name of the output
+file, if explicitly specified and it is not an executable, otherwise
+it is the basename of the source file. These switches may have
+different effects when @option{-E} is used for preprocessing.

 Debug dumps can be enabled with a @option{-fdump-rtl} switch or some
 @option{-d} option @var{letters}.  Here are the possible
@@ -5719,15 +5722,18 @@ counters for each function compiled.

 @item -fdump-tree-@var{switch}
 @itemx -fdump-tree-@var{switch}-@var{options}
+@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename}
 @opindex fdump-tree
 Control the dumping at various stages of processing the intermediate
 language tree to a file.  The file name is generated by appending a
 switch specific suffix to the source file name, and the file is
-created in the same directory as the output file.  If the
-@samp{-@var{options}} form is used, @var{options} is a list of
-@samp{-} separated options which control the details of the dump.  Not
-all options are applicable to all dumps; those that are not
-meaningful are ignored.  The following options are available
+created in the same directory as the output file. In case of
+@option{=@var{filename}} option, the dump is output on the given file
+name instead.  If the @samp{-@var{options}} form is used,
+@var{options} is a list of @samp{-} separated options which control
+the details or location of the dump.  Not all options are applicable
+to all dumps; those that are not meaningful are ignored.  The
+following options are available

 @table @samp
 @item address
@@ -5765,9 +5771,49 @@ Enable showing the tree dump for each statement.
 Enable showing the EH region number holding each statement.
 @item scev
 Enable showing scalar evolution analysis details.
+@item slim
+Inhibit dumping of members of a scope or body of a function merely
+because that scope has been reached.  Only dump such items when they
+are directly reachable by some other path.  When dumping pretty-printed
+trees, this option inhibits dumping the bodies of control structures.
+@item 

[MIPS] Fix misspelled macro in t-vxworks

2012-05-10 Thread Mingjie Xing
Hello,

This patch fixes the misspelled macro in t-vxworks.  Is it OK?

2012-05-10  Mingjie Xing  mingjie.x...@gmail.com

* config/mips/t-vxworks: Change MUTLILIB_EXTRA_OPTS to
MULTILIB_EXTRA_OPTS.

Index: config/mips/t-vxworks
===
--- config/mips/t-vxworks   (revision 187364)
+++ config/mips/t-vxworks   (working copy)
@@ -32,4 +32,4 @@ MULTILIB_EXCEPTIONS = mips3* mabi=o64 fP
  $(addprefix mabi=o64/, EL* msoft-float* mrtp* fPIC*) \
  $(addsuffix /fPIC, *mabi=o64 *mips3 *EL *msoft-float)

-MUTLILIB_EXTRA_OPTS = -G 0 -mno-branch-likely
+MULTILIB_EXTRA_OPTS = -G 0 -mno-branch-likely

Thanks,
Mingjie


Re: [libgcc] Use i386-cpuinfo.c on all i386 targets

2012-05-10 Thread Rainer Orth
Paolo Bonzini bonz...@gnu.org writes:

 2012-04-26  Rainer Orth  r...@cebitec.uni-bielefeld.de
 
  libgcc:
  * config.host (i[34567]86-*-linux*, x86_64-*-linux*)
  (i[34567]86-*-kfreebsd*-gnu, x86_64-*-kfreebsd*-gnu)
  (i[34567]86-*-knetbsd*-gnu, i[34567]86-*-gnu*): Move
  i386/t-cpuinfo ...
  (i[34567]86-*-*, x86_64-*-*): ... here.
 
  * config/i386/libgcc-bsd.ver (GCC_4.8.0): New version.
  * config/i386/libgcc-sol2.ver (GCC_4.8.0): New version.
 
  * config/i386/i386-cpuinfo.c: Rename to ...
  * config/i386/cpuinfo.c: ... this.
  * config/i386/t-cpuinfo (LIB2ADD): Reflect this.
 
  * configure.ac (AC_CONFIG_HEADER): Call for auto-target.h.
  (libgcc_cv_init_priority): New test.
  * configure: Regenerate.
  * config.in: New file.
  * Makefile.in (clean): Rename config.h to auto-target.h.
  (config.h): Likewise.
  (stamp-h): Likewise.
 
  * config/i386/cpuinfo.c (auto-target.h): Include.
  (CONSTRUCTOR_PRIORITY): Define.
  (__cpu_indicator_init): Use it.
 
  gcc
  * config/i386/i386.c: Update comments for i386-cpuinfo.c name
 change.
 

 Looks good.

Given that there were no further comments, I've committed the patch
with the following doc snippet added, after bootstrapping on
i386-pc-solaris2.10 with as and x86_64-unknown-linux-gnu.

Thanks.
Rainer


* doc/extend.texi (X86 Built-in Functions, __builtin_cpu_init):
Document requirement to call in constructors.

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9432,8 +9432,9 @@ executed before any constructors are cal
 automatically executed in a very high priority constructor.
 
 For example, this function has to be used in @code{ifunc} resolvers which
-check for CPU type using the builtins, @code{__builtin_cpu_is}
-and @code{__builtin_cpu_supports}.
+check for CPU type using the builtins @code{__builtin_cpu_is}
+and @code{__builtin_cpu_supports}, or in constructors on targets which
+don't support constructor priority.
 @smallexample
 
 static void (*resolve_memcpy (void)) (void)


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[libatomic] Always compile atomic builtin tests with $XCFLAGS (PR other/53284)

2012-05-10 Thread Rainer Orth
As described in the PR, several 32-bit libatomic tests FAIL on
Solaris/x86 with infinite recursion e.g. in
__atomic_compare_exchange_8.  It turns out that this happens because,
unlike on glibc targets, the atomic builtin configure tests are run as
compile tests, but are currently not compiled with $XCFLAGS, unlike the
real code.  The following patch fixes this, tested on
i386-pc-solaris2.10 and x86_64-unknown-linux-gnu, approved by rth in the
PR, installed on mainline.

Rainer


2012-05-09  Rainer Orth  r...@cebitec.uni-bielefeld.de

PR other/53284
* acinclude.m4 (LIBAT_TEST_ATOMIC_BUILTIN): Add -O0 -S to CFLAGS
instead of overriding.
* configure: Regenerate.

# HG changeset patch
# Parent 6c6136d0a9792bfd3fe9600f7867d5edf5c9b114
Always compile atomic builtin tests with $XCFLAGS

diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -67,7 +67,7 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[
 else
   old_CFLAGS=$CFLAGS
   # Compile unoptimized.
-  CFLAGS='-O0 -S'
+  CFLAGS="$CFLAGS -O0 -S"
   if AC_TRY_EVAL(ac_compile); then
     if grep __atomic_ conftest.s >/dev/null 2>&1 ; then
 	  eval $2=no


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-10 Thread Richard Guenther
On Thu, 10 May 2012, Richard Guenther wrote:

 On Wed, 9 May 2012, Eric Botcazou wrote:
 
   This removes the TYPE_IS_SIZETYPE macro and all its uses (by
   assuming it returns zero and applying trivial folding).  Sizes
   and bitsizes can still be treated specially by means of knowing
   what the values represent and by means of using helper functions
   that assume you are dealing with sizes (in particular size_binop
   and friends and bit_from_pos, byte_from_pos or pos_from_bit).
  
  Fine with me, if you add the blurb I talked about in the other reply.
  
   Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
   including Ada with the patch optimizing byte_from_pos and pos_from_bit
  
  Results on our internal testsuite are clean on x86-64 and almost clean on
  x86, an exception being:
  
  package t is
  type x (m : natural) is record
  s : string (1 .. m);
  r : natural;
  b : boolean;
  end record;
  for x'alignment use 4;
  
  pragma Pack (x);
  end t;
  
  Without the patches, compiling the package with -gnatR3 yields:
  
  Representation information for unit t (spec)
  
  
  for x'Object_Size use 17179869248;
  for x'Value_Size use  ((#1 + 8) * 8) ;
  for x'Alignment use 4;
  for x use record
 m at  0 range  0 .. 30;
 s at  4 range  0 ..  ((#1 * 8))  - 1;
 r at bit offset (((#1 + 4) * 8))  size in bits = 31
  b at bit offset ((((#1 + 7) * 8) + 7))  size in bits = 1
  end record;
  
  With the patches, this yields:
  
  Representation information for unit t (spec)
  
  
  for x'Object_Size use 17179869248;
  for x'Value_Size use  (((#1 + 7) + 1) * 8) ;
  for x'Alignment use 4;
  for x use record
 m at  0 range  0 .. 30;
 s at  4 range  0 ..  ((#1 * 8))  - 1;
 r at bit offset (((#1 + 4) * 8))  size in bits = 31
  b at bit offset ((((#1 + 7) * 8) + 7))  size in bits = 1
  end record;
  
  so we have lost a simple folding for x'Value_Size (TYPE_ADA_SIZE field).
 
 That's interesting.  It is always safe to fold (x + 7) + 1 to
 (x + 8), independent of whether overflow is defined or not.  So this
 looks like a genuine missed folding (I think that the combiner
 in tree-ssa-forwprop.c catches this).  Or is the above not showing
 casts in the expression?  Folding would not be valid for
 (unsigned)(signed X + 7) + 1.

As far as I can see this happens when we fold

 (bitsizetype) (#1 + 7) * 8 + 7  PLUS_EXPR  1

which we fold to

 ((bitsizetype) (#1 + 7) + 1) * 8

The #1 + 7 expression is computed in sizetype (which is now unsigned
and thus has defined overflow - thus we cannot optimize the widening
to bitsizetype).

Equivalent C testcase:

unsigned long long foo (unsigned int x)
{
  return ((unsigned long long)(x + 7)) + 1;
}
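
To see why the widening cannot simply be reassociated, compare the expression as written with the "folded" form. The sketch below uses fixed-width types so the wrap-around does not depend on the platform's int size; the function names are illustrative:

```c
#include <stdint.h>

/* The expression as written: the addition of 7 happens in the narrow
   unsigned type and may wrap before the value is widened.  */
uint64_t
sketch_narrow_then_widen (uint32_t x)
{
  return (uint64_t) (uint32_t) (x + 7u) + 1;
}

/* The tempting fold: widen first, then add 8.  This matches the
   function above only when x + 7 does not wrap in 32 bits.  */
uint64_t
sketch_widen_then_add (uint32_t x)
{
  return (uint64_t) x + 8;
}
```

For small inputs the two agree, but for x = 0xFFFFFFF9 the first returns 1 (the 32-bit sum wraps to 0) while the second returns 0x100000001. Only the knowledge that a size computation never overflows would make the fold valid.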

As I previously suggested we can put in special knowledge into
size_binop, or maybe better, provide abstraction for conversion
of sizetype to bitsizetype that would associate the type
conversions.  The original plan was of course to at some point
have PLUSNV_EXPR so we can explicitly mark #1 + 7 as not
overflowing.  It might be that introducing those just for
size expressions right now (and then dropping them down
to regular PLUS_EXPRs during gimplification) might be
something to explore for 4.8.

Richard.

   2012-05-08  Richard Guenther  rguent...@suse.de
  
 ada/
 * gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use.
  
  OK, modulo the formatting:
 
 Adjusted and applied.
 
 Thanks,
 Richard.
 
   Index: trunk/gcc/ada/gcc-interface/cuintp.c
   ===
   *** trunk.orig/gcc/ada/gcc-interface/cuintp.c 2011-04-11 
   17:01:30.0
   +0200 --- trunk/gcc/ada/gcc-interface/cuintp.c2012-05-07
   16:43:43.497218058 +0200 *** UI_From_gnu (tree Input)
   *** 178,186 
   if (host_integerp (Input, 0))
 return UI_From_Int (TREE_INT_CST_LOW (Input));
   else if (TREE_INT_CST_HIGH (Input) < 0
   ! && TYPE_UNSIGNED (gnu_type)
   ! && !(TREE_CODE (gnu_type) == INTEGER_TYPE
   !  && TYPE_IS_SIZETYPE (gnu_type)))
 return No_Uint;
 #endif
  
   --- 178,184 
   if (host_integerp (Input, 0))
 return UI_From_Int (TREE_INT_CST_LOW (Input));
   else if (TREE_INT_CST_HIGH (Input) < 0
   ! && TYPE_UNSIGNED (gnu_type))
 return No_Uint;
 #endif
  
   && TYPE_UNSIGNED (gnu_type)) on the same line.
 

-- 
Richard Guenther rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: [PATCH] Optimize byte_from_pos, pos_from_bit

2012-05-10 Thread Eric Botcazou
 Like this?

Let's be a bit more factual. :-)

 /* Return the combined truncated byte position for the byte offset OFFSET and
the bit position BITPOS.

These functions operate on byte and bit positions present in FIELD_DECLs
and assume that these expressions result in no (intermediate) overflow.
This assumption is necessary to fold the expressions as much as possible,
so as to avoid creating artificially variable-sized types in languages
supporting variable-sized types like Ada.  */
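
As a rough integer model of the two helpers named in the subject (a sketch only: the real functions build tree expressions via size_binop, and BITS_PER_UNIT = 8 plus the `_sketch` names are assumptions made for illustration):

```c
#include <stdint.h>

#define BITS_PER_UNIT 8  /* assumed width of a byte in bits */

/* Combined bit position: BITPOS plus OFFSET scaled to bits.  */
uint64_t bit_from_pos_sketch (uint64_t offset, uint64_t bitpos)
{
  return bitpos + offset * BITS_PER_UNIT;
}

/* Combined truncated byte position: OFFSET plus the whole bytes in BITPOS.  */
uint64_t byte_from_pos_sketch (uint64_t offset, uint64_t bitpos)
{
  return offset + bitpos / BITS_PER_UNIT;
}
```

Neither function checks for overflow, which mirrors the "no (intermediate) overflow" assumption stated in the comment.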

-- 
Eric Botcazou


Re: [patch] Fix LTO regression in Ada

2012-05-10 Thread Eric Botcazou
 Hmm, but we will not possibly refer to the sizes therein, so emitting
 stmts for them looks pointless ... (and in fact the debug information
 would be odd, too ... what does dwarf2out.c do with these CALL_EXPRs
 when generating debug information without LTO?)

All variable-sized types currently get size -1, but this would change with:
  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html

 Anyway, the idea is reasonable, but I'm not sure ending up with calls in
 those sizes makes sense (don't we make sure to inline them all at some
 point?)

These types are like any other types in Ada.  Either DECL_ORIGINAL_TYPE is 
purely for debug info and then we shouldn't stream it for LTO or it isn't and 
it needs to be gimplified.  My personal inclination would be for the former, 
but this is apparently problematic for C++.  I can add a ??? comment though.

-- 
Eric Botcazou


Re: [PATCH] Optimize byte_from_pos, pos_from_bit

2012-05-10 Thread Richard Guenther
On Thu, 10 May 2012, Eric Botcazou wrote:

  Like this?
 
 Let's be a bit more factual. :-)
 
  /* Return the combined truncated byte position for the byte offset OFFSET and
 the bit position BITPOS.
 
 These functions operate on byte and bit positions present in FIELD_DECLs
 and assume that these expressions result in no (intermediate) overflow.
 This assumption is necessary to fold the expressions as much as possible,
 so as to avoid creating artificially variable-sized types in languages
 supporting variable-sized types like Ada.  */

Works for me.

Applied that way.

Thanks,
Richard.


Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-10 Thread Eric Botcazou
 As far as I can see this happens when we fold

  (bitsizetype) (#1 + 7) * 8 + 7  PLUS_EXPR  1

 which we fold to

  ((bitsizetype) (#1 + 7) + 1) * 8

 The #1 + 7 expression is computed in sizetype (which is now unsigned
 and thus has defined overflow - thus we cannot optimize the widening
 to bitsizetype).

I see, thanks for the investigation.

 As I previously suggested we can put in special knowledge into
 size_binop, or maybe better, provide abstraction for conversion
 of sizetype to bitsizetype that would associate the type
 conversions.  The original plan was of course to at some point
 have PLUSNV_EXPR so we can explicitly mark #1 + 7 as not
 overflowing.  It might be that introducing those just for
 size expressions right now (and then dropping them down
 to regular PLUS_EXPRs during gimplification) might be
 something to explore for 4.8.

OK, I'll think about it.  No objections by me to going ahead with the patches.

-- 
Eric Botcazou


Re: [patch] Fix LTO regression in Ada

2012-05-10 Thread Richard Guenther
On Thu, May 10, 2012 at 12:12 PM, Eric Botcazou ebotca...@adacore.com wrote:
 Hmm, but we will not possibly refer to the sizes therein, so emitting
 stmts for them looks pointless ... (and in fact the debug information
 would be odd, too ... what does dwarf2out.c do with these CALL_EXPRs
 when generating debug information without LTO?)

 All variable-sized types currently get size -1, but this would change with:
  http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html

 Anyway, the idea is reasonable, but I'm not sure ending up with calls in
 those sizes makes sense (don't we make sure to inline them all at some
 point?)

 These types are like any other types in Ada.  Either DECL_ORIGINAL_TYPE is
 purely for debug info and then we shouldn't stream it for LTO or it isn't and
 it needs to be gimplified.  My personal inclination would be for the former,
 but this is apparently problematic for C++.  I can add a ??? comment though.

Well, we need to stream it for LTO because at the moment LTO is still
responsible for emitting debug information ... (that would change with
the early debug info plan).

Ok with a ??? comment.

Thanks,
Richard.

 --
 Eric Botcazou


Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin

2012-05-10 Thread Ramana Radhakrishnan
On 9 May 2012 11:18, Christophe Lyon christophe.l...@st.com wrote:
 Hello,

 On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins
 currently fails to load the second vector element.

Thanks for the patch but this is not acceptable as it stands today.
You need to set the length attributes in this case to 8 for the
appropriate alternative at the very least. You also don't mention how
this patch was tested. Alternatively it might be worth splitting the
vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg) 0)
followed by a subreg to subreg move which should end up having the
same effect. That splitting would allow for better instruction
scheduling. In addition it would be nice to have a testcase in
gcc.target/arm.

As a follow up patch I'd like these patterns merged with the vdup_n
patterns in neon.md (allowing them to grow a memory operand variant)
which should then allow merging of (I think)

scalarval = scalar_load ()
vreg = vdup ( scalarval)

into

vreg = vld1_dup_n ( scalar_address).

Thanks,
Ramana


Missing guard in ira-color.c ?

2012-05-10 Thread Tristan Gingold
Hi,

I am getting a segfault in ira-color.c:2945 on the trunk:

Program received signal SIGSEGV, Segmentation fault.
0x00a79f37 in move_spill_restore () at ../../src/gcc/ira-color.c:2945
2945  || ira_reg_equiv_const[regno] != NULL_RTX
(gdb) l
2940  /* don't do the optimization because it can create
2941 copies and the reload pass can spill the allocno set
2942 by copy although the allocno will not get memory
2943 slot.  */
2944  || ira_reg_equiv_invariant_p[regno]
2945  || ira_reg_equiv_const[regno] != NULL_RTX
2946  || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
2947continue;
2948  mode = ALLOCNO_MODE (a);
2949  rclass = ALLOCNO_CLASS (a);

while building gcc (gnatcmd.adb file) for ia64-vms using a cross compiler 
(target=ia64-vms, host=x86_64-linux).

The reason looks to be an out of bounds access:

(gdb) print regno
$10 = 18476
(gdb) print ira_reg_equiv_len 
$11 = 17984

(I suppose this setup is not easy at all to reproduce, but I can provide any 
files, if necessary).

Wild guess, as I don't know IRA at all:  looks like in this file most accesses 
to ira_reg_equiv_* are guarded.  Is it expected that they aren't at this point ?

[I am currently trying with the following chunk:

--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -2941,8 +2941,9 @@ move_spill_restore (void)
 copies and the reload pass can spill the allocno set
 by copy although the allocno will not get memory
 slot.  */
- || ira_reg_equiv_invariant_p[regno]
- || ira_reg_equiv_const[regno] != NULL_RTX
+ || (regno < ira_reg_equiv_len
+     && (ira_reg_equiv_invariant_p[regno]
+ || ira_reg_equiv_const[regno] != NULL_RTX))
  || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
continue;
  mode = ALLOCNO_MODE (a);
]

Thanks for any comment,
Tristan.
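
The guard tried in the chunk above follows a common pattern: test the index against the table length before touching the tables. A self-contained sketch of that pattern, where the struct and function names are hypothetical stand-ins and not IRA's actual data structures:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-register equivalence tables.  */
struct equiv_tables
{
  int len;                      /* number of valid entries */
  const bool *invariant_p;      /* entry has an invariant equivalence */
  const void *const *constant;  /* equivalent constant, or NULL */
};

/* True only if REGNO lies inside the tables and has an equivalence.
   The length test comes first, so the arrays are never read out of
   bounds.  */
bool has_equiv (const struct equiv_tables *t, int regno)
{
  return regno >= 0
         && regno < t->len
         && (t->invariant_p[regno] || t->constant[regno] != NULL);
}
```

With this shape, a register number past the end of the tables (such as 18476 against a length of 17984 in the session above) is simply treated as having no equivalence. Whether that is the right semantics for IRA is exactly the open question in the message.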


Re: [C++ Patch] fix semi-random template specialization ICE

2012-05-10 Thread Jason Merrill

OK.

Jason


Re: [Patch / RFC] Improving more locations for binary expressions

2012-05-10 Thread Jason Merrill

Looks good.

Jason


Re: [C++ Patch] PR 53158 (EXPR_LOC_OR_HERE version)

2012-05-10 Thread Jason Merrill

On 05/09/2012 02:47 PM, Paolo Carlini wrote:

+ error_at(loc, "cannot bind bitfield %qE to %qT",


Missing a space.

OK with that change.

Jason



Re: [C++ Patch] PR 53301

2012-05-10 Thread Paolo Carlini
Hi,

 On 05/09/2012 07:12 PM, Paolo Carlini wrote:
 shame on me. I think the patch almost qualifies as obvious.
 
 I think it does.  OK.

Good, later today I'll commit it (branch too).

Was thinking: would it make sense to have a predicate for 'any' pointer type? I 
see tens of such || around and I bet I would not have typoed it here... If you 
agree, please pick a name and I will do the work ;)

Paolo


Re: [C++ Patch] PR 53301

2012-05-10 Thread Jason Merrill

On 05/10/2012 10:52 AM, Paolo Carlini wrote:

Was thinking: would it make sense to have a predicate for 'any' pointer type?


Something like TYPE_PTR_OR_PTRMEM_P would be fine.

Hmm, I see that TYPE_PTRMEM_P only means pointer to data member, that's 
unfortunate; the name doesn't make that clear.


Jason


Re: [C++ Patch] PR 53301

2012-05-10 Thread Paolo Carlini
Hi,

 On 05/10/2012 10:52 AM, Paolo Carlini wrote:
 Was thinking: would it make sense to have a predicate for 'any' pointer type?
 
 Something like TYPE_PTR_OR_PTRMEM_P would be fine.

Good.

 Hmm, I see that TYPE_PTRMEM_P only means pointer to data member, that's 
 unfortunate; the name doesn't make that clear.

So, let's have a plan about the names of such predicates and I'll implement it 
as soon as possible. 

Paolo


Re: [h8300] increase dwarf address size

2012-05-10 Thread Jeff Law

On 05/09/2012 06:27 PM, DJ Delorie wrote:

H8/300 cpus have a larger-than-64k address space, despite 16-bit
pointers.  OK to apply?  Ok for 4.7 branch?

See also http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48231

* config/h8300/h8300.h (DWARF2_ADDR_SIZE): Define as 4 bytes.

My recollection was that the H8/300 only had a 64k address space and 
that the larger address spaces showed up in later processors (H8/300H).


Regardless, shouldn't DWARF2_ADDR_SIZE be POINTER_SIZE / BITS_PER_UNIT? 
 That'll give the larger DWARF2_ADDR_SIZE on the modern widgets, but 
still do the right thing for the ancient H8/300.


My other relevant recollection was that we don't support C++ on the 
H8/300 series.


jeff



Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin

2012-05-10 Thread Christophe Lyon

On 10.05.2012 13:41, Ramana Radhakrishnan wrote:

On 9 May 2012 11:18, Christophe Lyon christophe.l...@st.com wrote:

Hello,

On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins
currently fails to load the second vector element.

Thanks for the patch but this is not acceptable as it stands today.
You need to set the length attributes in this case to 8 for the
appropriate alternative at the very least.

OK I'll look at this.


You also don't mention how this patch was tested.

I used the testsuite I developed some time ago to test all the Neon builtins, 
which I posted last year on the qemu mailing-list. With the current GCCs, this 
bug is the only remaining one I could detect.


  Alternatively it might be worth splitting the
vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg)  0 )
followed by a subreg to subreg move which should end up having the
same effect . That splitting would allow for better instruction
scheduling.

Are you aware of examples of similar cases I could use as a model?


  In addition it would be nice to have a testcase in
gcc.target/arm .

Well. Prior to sending my patch I did look at that directory, but I supposed 
that such a test ought to belong to the neon/ subdir where the tests are 
described as autogenerated. Any doc on how to do that?

Thanks,

Christophe.



Re: [C++ Patch] PR 53301

2012-05-10 Thread Manuel López-Ibáñez
On 10 May 2012 17:02, Paolo Carlini pcarl...@gmail.com wrote:
 Hi,

 On 05/10/2012 10:52 AM, Paolo Carlini wrote:
 Was thinking: would it make sense to have a predicate for 'any' pointer 
 type?

 Something like TYPE_PTR_OR_PTRMEM_P would be fine.

 Good.

 Hmm, I see that TYPE_PTRMEM_P only means pointer to data member, that's 
 unfortunate; the name doesn't make that clear.

 So, let's have a plan about the names of such predicates and I'll implement 
 it as soon as possible.

Yes, please. It feels as if the names are based more on the underlying
implementation of the macro than on anything else. Also, short names
are nice, but using MEM instead of MEMBER is a bit too short. The same
for OB for object and others.

PTR_OR_PTRMEM sounds to me like pointer or pointer to member, which
sounds redundant since a pointer to member is a pointer already.

And there is also TYPE_PTRMEM_P and TYPE_PTR_TO_MEMBER_P. From the
names it is not clear what is the difference. This could be
TYPE_PTR_TO_DATA_MEMBER and TYPE_PTR_TO_ANY_MEMBER. The few extra
chars help a lot to clarify the meaning.

Also tree.h already has POINTER_TYPE_P, what is the difference? There
are a few other such accessors where the names seem to match with
other accessors from cp-tree.h, but the implementations are a bit
different. And both forms are used in cp/. Quite a mess...

Cheers,

Manuel.


Re: [PATCH] ARM/NEON: vld1q_dup_s64 builtin

2012-05-10 Thread Julian Brown
On Thu, 10 May 2012 17:31:43 +0200
Christophe Lyon christophe.l...@st.com wrote:

 On 10.05.2012 13:41, Ramana Radhakrishnan wrote:
  On 9 May 2012 11:18, Christophe Lyon christophe.l...@st.com wrote:
  Hello,
 
  On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64()
  builtins currently fails to load the second vector element.
  Thanks for the patch but this is not acceptable as it stands today.
  You need to set the length attributes in this case to 8 for the
  appropriate alternative at the very least.
 OK I'll look at this.
 
  You also don't mention how this patch was tested.
 I used the testsuite I developed some time ago to test all the Neon
 builtins, which I posted last year on the qemu mailing-list. With the
 current GCCs, this bug is the only remaining one I could detect.
 
Alternatively it might be worth splitting the
  vld1q_*64 case into a 64 bit load into a (subreg:DI (V2DI reg)  0 )
  followed by a subreg to subreg move which should end up having the
  same effect . That splitting would allow for better instruction
  scheduling.
 Are you aware of examples of similar cases I could use as a model?
 
In addition it would be nice to have a testcase in
  gcc.target/arm .
 Well. Prior to sending my patch I did look at that directory, but I
 supposed that such a test ought to belong to the neon/ subdir where
 the tests are described as autogenerated. Any doc on how to do that?

I'd recommend not to autogenerate such a test, FWIW -- the
autogenerated neon tests aren't very good. I think a manually-written
execute test would be better in this case.

If you do try autogenerating tests, look at Disassembles_as in
neon.ml, and neon-testgen.ml.

Julian


[Dwarf Patch] Improve pubnames and pubtypes generation. (issue6197069)

2012-05-10 Thread Sterling Augustine
The enclosed patch fixes many issues with pubnames and pubtypes. It generates
them for many more variables and with mostly correct and canonical dwarf names.

This patch should not affect any target that does not use pubnames.

The exceptions to the canonical names are addressed in a separate patch
to the front end, under review at
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00512.html.

Tested with bootstrap and running the test_pubnames_and_indices.py script
recently contributed to the GDB project.

OK for mainline?

Sterling

2012-05-10   Sterling Augustine  saugust...@google.com

* dwarf2out.c (DEBUG_PUBNAMES_SECTION_LABEL,
DEBUG_PUBTYPES_SECTION_LABEL): New macros.
(debug_pubnames_section_label, debug_pubtypes_section_label): New
globals.
(is_cu_die, is_namespace_die, is_class_die, add_AT_pubnames,
add_enumerator_pubname): New functions.
(add_pubname): Rework logic.  Call is_class_die, is_cu_die and
is_namespace_die.  Fix minor style violation.
(add_pubtype): Rework logic for calculating type name.  Call
is_namespace_die.
(output_pubnames): Move conditional logic deciding when to produce the
section from dwarf2out_finish.  Output debug_pubnames_section_label
and debug_pubtypes_section_label.
(base_type_die): Call add_pubtype.
(gen_enumeration_type_die): Unconditionally call add_pubtype.
(gen_namespace_die): Call add_pubname_string.
(dwarf2out_init): Generate debug_pubnames_section_label and
debug_pubtypes_section_label from DEBUG_PUBNAMES_SECTION_LABEL and
DEBUG_PUBTYPES_SECTION_LABEL respectively.
(dwarf2out_finish): Call add_AT_pubnames; Move logic on when to
produce pubnames and pubtypes sections to output_pubnames.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (revision 187271)
+++ gcc/dwarf2out.c (working copy)
@@ -3007,6 +3007,7 @@ static void output_comp_unit (dw_die_ref, int);
 static void output_comdat_type_unit (comdat_type_node *);
 static const char *dwarf2_name (tree, int);
 static void add_pubname (tree, dw_die_ref);
+static void add_enumerator_pubname (const char *, dw_die_ref);
 static void add_pubname_string (const char *, dw_die_ref);
 static void add_pubtype (tree, dw_die_ref);
 static void output_pubnames (VEC (pubname_entry,gc) *);
@@ -3210,6 +3211,12 @@ static void gen_scheduled_generic_parms_dies (void
 #ifndef COLD_TEXT_SECTION_LABEL
 #define COLD_TEXT_SECTION_LABEL "Ltext_cold"
 #endif
+#ifndef DEBUG_PUBNAMES_SECTION_LABEL
+#define DEBUG_PUBNAMES_SECTION_LABEL   "Ldebug_pubnames"
+#endif
+#ifndef DEBUG_PUBTYPES_SECTION_LABEL
+#define DEBUG_PUBTYPES_SECTION_LABEL   "Ldebug_pubtypes"
+#endif
 #ifndef DEBUG_LINE_SECTION_LABEL
 #define DEBUG_LINE_SECTION_LABEL   "Ldebug_line"
 #endif
@@ -3246,6 +3253,8 @@ static char cold_end_label[MAX_ARTIFICIAL_LABEL_BY
 static char abbrev_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
 static char debug_info_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
 static char debug_line_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
+static char debug_pubnames_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
+static char debug_pubtypes_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
 static char macinfo_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
 static char loc_section_label[MAX_ARTIFICIAL_LABEL_BYTES];
 static char ranges_section_label[2 * MAX_ARTIFICIAL_LABEL_BYTES];
@@ -5966,6 +5975,22 @@ is_cu_die (dw_die_ref c)
   return c && c->die_tag == DW_TAG_compile_unit;
 }
 
+/* Returns true iff C is a namespace DIE.  */
+
+static inline bool
+is_namespace_die (dw_die_ref c)
+{
+  return c && c->die_tag == DW_TAG_namespace;
+}
+
+/* Returns true iff C is a class DIE.  */
+
+static inline bool
+is_class_die (dw_die_ref c)
+{
+  return c && c->die_tag == DW_TAG_class_type;
+}
+
 static char *
 gen_internal_sym (const char *prefix)
 {
@@ -8033,6 +8058,20 @@ output_comp_unit (dw_die_ref die, int output_if_em
 }
 }
 
+/* Add the DW_AT_GNU_pubnames and DW_AT_GNU_pubtypes attributes.  */
+
+static void
+add_AT_pubnames (dw_die_ref die)
+{
+  if (targetm.want_debug_pub_sections)
+{
+  /* FIXME: Should use add_AT_pubnamesptr.  This works because most targets
+ don't care what the base section is.  */
+  add_AT_lineptr (die, DW_AT_GNU_pubnames, debug_pubnames_section_label);
+  add_AT_lineptr (die, DW_AT_GNU_pubtypes, debug_pubtypes_section_label);
+}
+}
+
 /* Output a comdat type unit DIE and its children.  */
 
 static void
@@ -8116,14 +8155,32 @@ add_pubname_string (const char *str, dw_die_ref di
 static void
 add_pubname (tree decl, dw_die_ref die)
 {
-  if (targetm.want_debug_pub_sections && TREE_PUBLIC (decl))
+  if (!targetm.want_debug_pub_sections)
+    return;
+
+  if ((TREE_PUBLIC (decl) && !is_class_die (die->die_parent))
+  || is_cu_die (die->die_parent) || is_namespace_die (die->die_parent))
 {
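
The reworked condition in add_pubname can be read as a small predicate over the parent DIE. The sketch below models only that condition with hypothetical types (the enum, struct and function names are invented for illustration, and the targetm.want_debug_pub_sections check is left out):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of a DIE: just a tag and a parent link.  */
enum die_tag { TAG_COMPILE_UNIT, TAG_NAMESPACE, TAG_CLASS_TYPE, TAG_OTHER };

struct die
{
  enum die_tag tag;
  const struct die *parent;
};

/* Null-safe tag test, in the style of is_cu_die and friends.  */
static bool has_tag (const struct die *d, enum die_tag tag)
{
  return d && d->tag == tag;
}

/* A name is published if the decl is public and not a class member,
   or if it sits directly in a compile unit or a namespace.  */
bool wants_pubname (bool is_public, const struct die *d)
{
  const struct die *parent = d->parent;
  return (is_public && !has_tag (parent, TAG_CLASS_TYPE))
         || has_tag (parent, TAG_COMPILE_UNIT)
         || has_tag (parent, TAG_NAMESPACE);
}
```

Under this reading, a non-public entity at compile-unit or namespace scope still gets an entry, while class members are skipped regardless of TREE_PUBLIC.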
 

Re: [PATCH] Remove TYPE_IS_SIZETYPE

2012-05-10 Thread Eric Botcazou
 For example

 Index: stor-layout.c
 ===
 --- stor-layout.c (revision 187364)
 +++ stor-layout.c (working copy)
 @@ -791,6 +791,10 @@ start_record_layout (tree t)
  tree
  bit_from_pos (tree offset, tree bitpos)
  {
 +  if (TREE_CODE (offset) == PLUS_EXPR)
 +offset = size_binop (PLUS_EXPR,
 +  fold_convert (bitsizetype, TREE_OPERAND (offset, 0)),
 +  fold_convert (bitsizetype, TREE_OPERAND (offset, 1)));
return size_binop (PLUS_EXPR, bitpos,
size_binop (MULT_EXPR,
fold_convert (bitsizetype, offset),

 fixes the specific testcase you provided.

I get a bootstrap failure on x86 (verify_flow_info failed) with it.  Let's drop 
it for now, we'll revisit this later.

 I suppose if stor-layout.c handled advancing offset/bitpos more
 carefully, avoiding repeated translations between them, those issues
 would not exist.  Of course the mere existence of DECL_OFFSET_ALIGN
 complicates matters for no good reason (well, at least I did not find
 a good use of it until now ...).

Maybe it's also obsolete by now.

-- 
Eric Botcazou


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-05-10 Thread Xinliang David Li
I like your suggestion and support the end goal you have.  I don't
like the -fopt-info behavior to interfere with regular -fdump-xxx
options either.

I think we should stage the changes in multiple steps as originally
planned. Is Sharad's change good to be checked in for the first stage?

After this one is checked in, the new dump interfaces will be worked
on (and to allow multiple streams). Most of the remaining changes will
be massive text replacement.

thanks,

David


On Thu, May 10, 2012 at 1:18 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Thu, May 10, 2012 at 2:31 AM, Xinliang David Li davi...@google.com wrote:
 Bummer.  I was thinking to reserve '=' for selective  dumping:

 -fdump-tree-pre=func_list_regexp

 I guess this can be achieved via @

 -fdump-tree-pre@func_list

 -fdump-tree-pre=file_name@func_list


 Another issue -- I don't think the current precedence rule is correct.
 Consider that -fopt-info=2 will be mapped to

 -fdump-tree-all-transform-verbose2=stderr
 -fdump-rtl-all-transform-verbose2=stderr

 then

 the current precedence rule will cause surprise when the following is used

 -fopt-info -fdump-tree-pre

 The PRE dump will be emitted to stderr which is not what user wants.
 In short, special streams should be treated as 'weak' the same way as
 your previous implementation.

 Hm, this raises a similar concern I have with the -fvectorizer-verbose flag.
 With -fopt-info -fdump-tree-pre I do not want some information to be
 present only on stderr or in the dump file!  I want it in _both_ places!
 (-fvectorizer-verbose makes the -fdump-tree-vect dump contain less
 information :()

 Thus, the information where dumping goes has to be done differently
 (which is why I asked for some re-org originally, so that passes no
 longer explicitly reference dump_file - dump_file may be different
 for different kind of information it dumps!).  Passes should, instead of

  fprintf (dump_file, ..., ...)

 do

  dump_printf (TDF_scev, ..., ...)

 thus, specify the kind of information they dump (would be mostly
 TDF_details vs. 0 today I guess).  The dump_printf routine would
 then properly direct to one or more places to dump at.
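
One way such a dispatcher could look, as a sketch only: sinks subscribe to kinds of information, and each message goes to every sink whose subscription matches. The names dump_register_sink, dump_printf_sketch and the DUMP_* flags are hypothetical; no such interface exists in the tree at this point.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical kinds of dump information.  */
enum dump_kind { DUMP_DETAILS = 1 << 0, DUMP_OPT_INFO = 1 << 1 };

struct dump_sink
{
  FILE *stream;    /* where to write */
  unsigned kinds;  /* mask of dump_kind values this sink wants */
};

#define MAX_SINKS 4
static struct dump_sink sinks[MAX_SINKS];
static int n_sinks;

/* Register STREAM for the given KINDS; returns 0, or -1 if full.  */
int dump_register_sink (FILE *stream, unsigned kinds)
{
  if (n_sinks == MAX_SINKS)
    return -1;
  sinks[n_sinks].stream = stream;
  sinks[n_sinks].kinds = kinds;
  n_sinks++;
  return 0;
}

/* Print to every sink subscribed to KIND; returns the number of
   sinks written to.  */
int dump_printf_sketch (unsigned kind, const char *fmt, ...)
{
  int written = 0;
  for (int i = 0; i < n_sinks; i++)
    if (sinks[i].kinds & kind)
      {
        va_list ap;
        va_start (ap, fmt);
        vfprintf (sinks[i].stream, fmt, ap);
        va_end (ap);
        written++;
      }
  return written;
}
```

With a pass-private dump file registered for all kinds and stderr registered for the opt-info kind only, the same message reaches both places, which is the "_both_ places" behaviour asked for above.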

 I realize this needs some more dispatchers for dumping expressions
 and statements (but it should not be too many).  Dumping to
 dump_file would in any case dump to the passes private dump file
 only (unqualified stuff would never be useful for -fopt-info).

 The perfect candidate to convert to this kind of scheme is obviously
 the vectorizer with its existing -fvectorizer-verbose.

 If the patch doesn't work towards this kind of end-result I'd rather
 not have it.

 Thanks,
 Richard.

 thanks,

 David



 On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai sing...@google.com wrote:
 Thanks for your suggestions/comments. I have updated the patch and
 documentation. It supports the following usage:

 gcc  -fdump-tree-all=tree.dump -fdump-tree-pre=stdout
 -fdump-rtl-ira=ira.dump

 Here all tree dumps except the PRE are output into tree.dump, PRE dump
 goes to stdout and the IRA dump goes to ira.dump.

 Thanks,
 Sharad

 2012-05-09   Sharad Singhai  sing...@google.com

        * doc/invoke.texi: Add documentation for the new option.
        * tree-dump.c (dump_get_standard_stream): New function.
        (dump_files): Update for new field.
        (dump_switch_p_1): Handle dump filenames.
        (dump_begin): Likewise.
        (get_dump_file_name): Likewise.
        (dump_end): Remove attribute.
        (dump_enable_all): Add new parameter FILENAME.
        All callers updated.
        (enable_rtl_dump_file):
        * tree-pass.h (enum tree_dump_index): Add new constant.
        (struct dump_file_info): Add new field FILENAME.
        * testsuite/g++.dg/other/dump-filename-1.C: New test.

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 187265)
 +++ doc/invoke.texi     (working copy)
 @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio

  @item -d@var{letters}
  @itemx -fdump-rtl-@var{pass}
 +@itemx -fdump-rtl-@var{pass}=@var{filename}
  @opindex d
  Says to make debugging dumps during compilation at times specified by
  @var{letters}.  This is used for debugging the RTL-based passes of the
  compiler.  The file names for most of the dumps are made by appending
  a pass number and a word to the @var{dumpname}, and the files are
 -created in the directory of the output file.  Note that the pass
 -number is computed statically as passes get registered into the pass
 -manager.  Thus the numbering is not related to the dynamic order of
 -execution of passes.  In particular, a pass installed by a plugin
 -could have a number over 200 even if it executed quite early.
 -@var{dumpname} is generated from the name of the output file, if
 -explicitly specified and it is not an executable, otherwise it is the
 -basename of the source file. These switches may have different 

[PATCH, 4.7] Backport fix to [un]signed_type_for

2012-05-10 Thread William J. Schmidt
Backporting this patch to 4.7 fixes a problem building Fedora 17.
Bootstrapped and regression tested on powerpc64-unknown-linux-gnu.  Is
the backport OK?

Thanks,
Bill


2012-05-10  Bill Schmidt  wschm...@vnet.linux.ibm.com

Backport from trunk:
2012-03-12  Richard Guenther  rguent...@suse.de

* tree.c (signed_or_unsigned_type_for): Use
build_nonstandard_integer_type.
(signed_type_for): Adjust documentation.
(unsigned_type_for): Likewise.
* tree-pretty-print.c (dump_generic_node): Use standard names
for non-standard integer types if available.


Index: gcc/tree-pretty-print.c
===
--- gcc/tree-pretty-print.c (revision 187368)
+++ gcc/tree-pretty-print.c (working copy)
@@ -723,11 +723,41 @@ dump_generic_node (pretty_printer *buffer, tree no
  }
else if (TREE_CODE (node) == INTEGER_TYPE)
  {
-   pp_string (buffer, (TYPE_UNSIGNED (node)
-   ? "<unnamed-unsigned:"
-   : "<unnamed-signed:"));
-   pp_decimal_int (buffer, TYPE_PRECISION (node));
-   pp_string (buffer, ">");
+   if (TYPE_PRECISION (node) == CHAR_TYPE_SIZE)
+ pp_string (buffer, (TYPE_UNSIGNED (node)
+ ? "unsigned char"
+ : "signed char"));
+   else if (TYPE_PRECISION (node) == SHORT_TYPE_SIZE)
+ pp_string (buffer, (TYPE_UNSIGNED (node)
+ ? "unsigned short"
+ : "signed short"));
+   else if (TYPE_PRECISION (node) == INT_TYPE_SIZE)
+ pp_string (buffer, (TYPE_UNSIGNED (node)
+ ? "unsigned int"
+ : "signed int"));
+   else if (TYPE_PRECISION (node) == LONG_TYPE_SIZE)
+ pp_string (buffer, (TYPE_UNSIGNED (node)
+ ? "unsigned long"
+ : "signed long"));
+   else if (TYPE_PRECISION (node) == LONG_LONG_TYPE_SIZE)
+ pp_string (buffer, (TYPE_UNSIGNED (node)
+ ? "unsigned long long"
+ : "signed long long"));
+   else if (TYPE_PRECISION (node) >= CHAR_TYPE_SIZE
+ && exact_log2 (TYPE_PRECISION (node)))
+ {
+   pp_string (buffer, (TYPE_UNSIGNED (node) ? "uint" : "int"));
+   pp_decimal_int (buffer, TYPE_PRECISION (node));
+   pp_string (buffer, "_t");
+ }
+   else
+ {
+   pp_string (buffer, (TYPE_UNSIGNED (node)
+   ? "<unnamed-unsigned:"
+   : "<unnamed-signed:"));
+   pp_decimal_int (buffer, TYPE_PRECISION (node));
+   pp_string (buffer, ">");
+ }
  }
else if (TREE_CODE (node) == COMPLEX_TYPE)
  {
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 187368)
+++ gcc/tree.c  (working copy)
@@ -10162,32 +10162,26 @@ widest_int_cst_value (const_tree x)
   return val;
 }
 
-/* If TYPE is an integral type, return an equivalent type which is
-unsigned iff UNSIGNEDP is true.  If TYPE is not an integral type,
-return TYPE itself.  */
+/* If TYPE is an integral or pointer type, return an integer type with
+   the same precision which is unsigned iff UNSIGNEDP is true, or itself
+   if TYPE is already an integer type of signedness UNSIGNEDP.  */
 
 tree
 signed_or_unsigned_type_for (int unsignedp, tree type)
 {
-  tree t = type;
-  if (POINTER_TYPE_P (type))
-{
-  /* If the pointer points to the normal address space, use the
-size_type_node.  Otherwise use an appropriate size for the pointer
-based on the named address space it points to.  */
-  if (!TYPE_ADDR_SPACE (TREE_TYPE (t)))
-   t = size_type_node;
-  else
-   return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp);
-}
+  if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type) == unsignedp)
+return type;
 
-  if (!INTEGRAL_TYPE_P (t) || TYPE_UNSIGNED (t) == unsignedp)
-return t;
+  if (!INTEGRAL_TYPE_P (type)
+   && !POINTER_TYPE_P (type))
+return NULL_TREE;
 
-  return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp);
+  return build_nonstandard_integer_type (TYPE_PRECISION (type), unsignedp);
 }
 
-/* Returns unsigned variant of TYPE.  */
+/* If TYPE is an integral or pointer type, return an integer type with
+   the same precision which is unsigned, or itself if TYPE is already an
+   unsigned integer type.  */
 
 tree
 unsigned_type_for 
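
The naming scheme in the tree-pretty-print.c hunk can be sketched outside the compiler as follows. The type sizes are assumed values for a typical LP64 target, the function name is hypothetical, and the sketch uses an explicit power-of-two test, which is an assumption about the intent of the exact_log2 check in the patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed sizes in bits; in the compiler these come from the target.  */
enum { CHAR_BITS = 8, SHORT_BITS = 16, INT_BITS = 32,
       LONG_BITS = 64, LONG_LONG_BITS = 64 };

static bool is_power_of_two (unsigned v)
{
  return v != 0 && (v & (v - 1)) == 0;
}

/* Write a printable name for an unnamed integer type into BUF.  */
void integer_type_name (unsigned precision, bool is_unsigned,
                        char *buf, size_t size)
{
  const char *sign = is_unsigned ? "unsigned" : "signed";

  if (precision == CHAR_BITS)
    snprintf (buf, size, "%s char", sign);
  else if (precision == SHORT_BITS)
    snprintf (buf, size, "%s short", sign);
  else if (precision == INT_BITS)
    snprintf (buf, size, "%s int", sign);
  else if (precision == LONG_BITS)
    snprintf (buf, size, "%s long", sign);
  else if (precision == LONG_LONG_BITS)
    snprintf (buf, size, "%s long long", sign);
  else if (precision >= CHAR_BITS && is_power_of_two (precision))
    snprintf (buf, size, "%s%u_t", is_unsigned ? "uint" : "int", precision);
  else
    snprintf (buf, size, "<unnamed-%s:%u>", sign, precision);
}
```

Because the tests run in order, a target where long and long long have the same width reports such a type as "long", matching the order of the else-if chain in the patch.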

Re: [PATCH, 4.7] Backport fix to [un]signed_type_for

2012-05-10 Thread Jakub Jelinek
On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote:
 Backporting this patch to 4.7 fixes a problem building Fedora 17.
 Bootstrapped and regression tested on powerpc64-unknown-linux-gnu.  Is
 the backport OK?

For 4.7 I'd very much prefer a less intrusive change (i.e. change
the java langhook) instead, but I'll defer to Richard if he prefers
this over that.

 2012-05-10  Bill Schmidt  wschm...@vnet.linux.ibm.com
 
   Backport from trunk:
   2012-03-12  Richard Guenther  rguent...@suse.de
 
   * tree.c (signed_or_unsigned_type_for): Use
   build_nonstandard_integer_type.
   (signed_type_for): Adjust documentation.
   (unsigned_type_for): Likewise.
   * tree-pretty-print.c (dump_generic_node): Use standard names
   for non-standard integer types if available.

Jakub


Re: [PATCH, 4.7] Backport fix to [un]signed_type_for

2012-05-10 Thread William J. Schmidt
On Thu, 2012-05-10 at 18:49 +0200, Jakub Jelinek wrote:
 On Thu, May 10, 2012 at 11:44:27AM -0500, William J. Schmidt wrote:
  Backporting this patch to 4.7 fixes a problem building Fedora 17.
  Bootstrapped and regression tested on powerpc64-unknown-linux-gnu.  Is
  the backport OK?
 
 For 4.7 I'd very much prefer a less intrusive change (i.e. change
 the java langhook) instead, but I'll defer to Richard if he prefers
 this over that.

OK.  If that's desired, this is the possible change to the langhook:

Index: gcc/java/typeck.c
===
--- gcc/java/typeck.c   (revision 187158)
+++ gcc/java/typeck.c   (working copy)
@@ -189,6 +189,12 @@ java_type_for_size (unsigned bits, int unsignedp)
 return unsignedp ? unsigned_int_type_node : int_type_node;
   if (bits <= TYPE_PRECISION (long_type_node))
 return unsignedp ? unsigned_long_type_node : long_type_node;
+  /* A 64-bit target with TImode requires 128-bit type definitions
+ for bitsizetype.  */
+  if (int128_integer_type_node
+   && bits == TYPE_PRECISION (int128_integer_type_node))
+return (unsignedp ? int128_unsigned_type_node
+   : int128_integer_type_node);
   return 0;
 }

which also fixed the problem and bootstraps without regressions.
Whichever you guys prefer is fine with me.

Thanks,
Bill
 
  2012-05-10  Bill Schmidt  wschm...@vnet.linux.ibm.com
  
  Backport from trunk:
  2012-03-12  Richard Guenther  rguent...@suse.de
  
  * tree.c (signed_or_unsigned_type_for): Use
  build_nonstandard_integer_type.
  (signed_type_for): Adjust documentation.
  (unsigned_type_for): Likewise.
  * tree-pretty-print.c (dump_generic_node): Use standard names
  for non-standard integer types if available.
 
   Jakub
 



Re: [h8300] increase dwarf address size

2012-05-10 Thread DJ Delorie

 Regardless, shouldn't DWARF2_ADDR_SIZE be POINTER_SIZE / BITS_PER_UNIT? 

That's the default.  It doesn't work because pointers are still 16 bits.


Re: PR 43772 Errant -Wlogical-op warning when testing limits

2012-05-10 Thread Dodji Seketeli
Sorry for my late reply.

Manuel López-Ibáñez lopeziba...@gmail.com writes:

 This patch fixes almost all false positives in PR43772. The case not fixed is:

   intmax_t i = (whatever);
   if (INT_MAX < i && i <= LONG_MAX)
     print ("i is in 'long' but not 'int' range")

 where we warn if INT_MAX = LONG_MAX < INTMAX_MAX.

FWIW, I'd be inclined to warn in that case, unless someone comes up with a
reasonable scenario that argues for the usefulness of not warning here.
Maybe I am missing something.

 Perhaps with the macro location code, we could now tell that the
 constants INT_MAX and LONG_MAX come from different macro expansions in
 system headers, and avoid warning in this specific case, but that
 would be better done in a follow-up patch. 

Hmmh.

 Dodji, is that possible?
 how could it be done?

It might be possible, though I doubt that the value of doing that really
offsets the cost and the perceived weirdness of the approach.

Assuming the tokens resulting from INT_MAX and LONG_MAX haven't been
folded, you could get the line maps of their locations by using
linemap_lookup (line_table, location_of_token).

Then, if linemap_macro_expansion_map_p is true on the two line
maps, it means the tokens for INT_MAX and LONG_MAX come from macro
expansions.  Then if the maps are different, it means the macro
expansions are different.  To know if they (the macros) come from system
headers, you can use the predicate LINEMAP_SYSP on them.

But if INT_MAX/LONG_MAX are folded into a constant, then the information
about their macro-ness is lost, unfortunately.
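
The check described above can be sketched roughly as follows. This is only an
illustration: toy_line_map and from_distinct_system_macros_p are hypothetical
stand-ins, and the two flags merely model what linemap_macro_expansion_map_p
and LINEMAP_SYSP would report for the maps returned by linemap_lookup. It does
not use the real libcpp types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a line map; only the two properties discussed
   above are modelled.  Hypothetical, not libcpp's actual layout.  */
struct toy_line_map
{
  bool macro_expansion_p;   /* models linemap_macro_expansion_map_p  */
  bool in_system_header_p;  /* models LINEMAP_SYSP  */
};

/* Return true if the two tokens whose maps are A and B come from
   different macro expansions that both live in system headers.
   A and B stand in for the results of linemap_lookup.  */
static bool
from_distinct_system_macros_p (const struct toy_line_map *a,
                               const struct toy_line_map *b)
{
  assert (a != NULL && b != NULL);

  /* Both tokens must come from a macro expansion.  */
  if (!a->macro_expansion_p || !b->macro_expansion_p)
    return false;

  /* Same map means same expansion.  */
  if (a == b)
    return false;

  return a->in_system_header_p && b->in_system_header_p;
}
```

As noted, none of this helps once the constants have been folded, because
there are then no separate token locations left to look up.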

-- 
Dodji


Re: User directed Function Multiversioning via Function Overloading (issue5752064)

2012-05-10 Thread H.J. Lu
On Wed, May 9, 2012 at 12:01 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi,

 Attached new patch with more bug fixes. I will fix the dispatching
 method to use priority of attributes in the next iteration.

 Patch also available for review here:  http://codereview.appspot.com/5752064


The patch looks OK to me.  Since testcase depends on the dispatching
method,  I'd like to see the whole patch with the updated dispatching
method.

Thanks.

-- 
H.J.


Re: [MIPS] Fix misspelled macro in t-vxworks

2012-05-10 Thread Richard Sandiford
Mingjie Xing mingjie.x...@gmail.com writes:
 This patch fixes the misspelled macro in t-vxworks.  Is it OK?

 2012-05-10  Mingjie Xing  mingjie.x...@gmail.com

 * config/mips/t-vxworks: Change MUTLILIB_EXTRA_OPTS to
 MULTILIB_EXTRA_OPTS.

OK, thanks.

Richard


[PATCH, i386]: Further move insn modes cleanup.

2012-05-10 Thread Uros Bizjak
Hello!

Introduce handling of TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL and
TARGET_SSE_TYPELESS_STORES flags to movoi, movti and movtf move
patterns. Also introduce ssePSmode attribute to determine PSmode at
compile time.
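
As a rough illustration of what the new attribute encodes, the sketch below
models the ssePSmode table in plain C and compares it with a size test. The
names ps_mode_entry, ps_mode_by_table and ps_mode_by_size are hypothetical,
the byte sizes are added here for illustration, and the size test assumes the
removed if_then_else compared GET_MODE_SIZE against 16 with "greater than".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per entry of the ssePSmode attribute in the patch; the
   size column (in bytes) is added for illustration only.  */
struct ps_mode_entry
{
  const char *mode;
  int size;
  const char *ps_mode;
};

static const struct ps_mode_entry ps_mode_table[] = {
  { "V32QI", 32, "V8SF" }, { "V16QI", 16, "V4SF" },
  { "V16HI", 32, "V8SF" }, { "V8HI", 16, "V4SF" },
  { "V8SI", 32, "V8SF" },  { "V4SI", 16, "V4SF" },
  { "V4DI", 32, "V8SF" },  { "V2DI", 16, "V4SF" },
  { "V2TI", 32, "V8SF" },  { "V1TI", 16, "V4SF" },
  { "V8SF", 32, "V8SF" },  { "V4SF", 16, "V4SF" },
  { "V4DF", 32, "V8SF" },  { "V2DF", 16, "V4SF" }
};

/* Table lookup, in the spirit of the mode attribute.  Returns NULL
   for a mode that has no entry.  */
static const char *
ps_mode_by_table (const char *mode)
{
  size_t i;

  assert (mode != NULL);
  for (i = 0; i < sizeof ps_mode_table / sizeof ps_mode_table[0]; i++)
    if (strcmp (ps_mode_table[i].mode, mode) == 0)
      return ps_mode_table[i].ps_mode;
  return NULL;
}

/* Size-based choice (assumed form of the test the table replaces).  */
static const char *
ps_mode_by_size (int size)
{
  return size > 16 ? "V8SF" : "V4SF";
}
```

For every row, the table and the size test agree; the table simply moves that
decision out of the insn attribute computation.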

2012-05-10  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.md (*movoi_internal_avx): Handle
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL and TARGET_SSE_TYPELESS_STORES.
(*movti_internal_rex64): Handle TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
(*movti_internal_sse): Ditto.
(*movtf_internal): Ditto.
* config/i386/sse.md (ssePSmode): New mode attribute.
(*movemode_internal): Use ssePSmode.
(*sse_movussemodesuffixavxsizesuffix): Ditto.
(*sse2_movdquavxsizesuffix): Ditto.
* config/i386/i386.c (standard_sse_constant_opcode): Do not handle
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL here.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 187354)
+++ config/i386/sse.md  (working copy)
@@ -337,6 +337,16 @@
(V8SF V4SF) (V4DF V2DF)
(V4SF V2SF)])
 
+;; Mapping of vector modes to packed single mode of the same size
+(define_mode_attr ssePSmode
+  [(V32QI V8SF) (V16QI V4SF)
+   (V16HI V8SF) (V8HI V4SF)
+   (V8SI V8SF) (V4SI V4SF)
+   (V4DI V8SF) (V2DI V4SF)
+   (V2TI V8SF) (V1TI V4SF)
+   (V8SF V8SF) (V4SF V4SF)
+   (V4DF V8SF) (V2DF V4SF)])
+
 ;; Mapping of vector modes back to the scalar modes
 (define_mode_attr ssescalarmode
   [(V32QI QI) (V16HI HI) (V8SI SI) (V4DI DI)
@@ -420,7 +430,7 @@
 })
 
 (define_insn *movmode_internal
-  [(set (match_operand:V16 0 nonimmediate_operand =x,x ,m)
+  [(set (match_operand:V16 0 nonimmediate_operand   =x,x ,m)
(match_operand:V16 1 nonimmediate_or_sse_const_operand  C ,xm,x))]
   TARGET_SSE
 (register_operand (operands[0], MODEmode)
@@ -471,21 +481,18 @@
   [(set_attr type sselog1,ssemov,ssemov)
(set_attr prefix maybe_vex)
(set (attr mode)
-   (cond [(and (eq_attr alternative 1,2)
-   (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL))
-(if_then_else
-   (match_test GET_MODE_SIZE (MODEmode)  16)
-   (const_string V8SF)
-   (const_string V4SF))
+   (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+(const_string ssePSmode)
+  (and (eq_attr alternative 2)
+   (match_test TARGET_SSE_TYPELESS_STORES))
+(const_string ssePSmode)
   (match_test TARGET_AVX)
 (const_string sseinsnmode)
-  (ior (and (eq_attr alternative 1,2)
-(match_test optimize_function_for_size_p (cfun)))
-   (and (eq_attr alternative 2)
-(match_test TARGET_SSE_TYPELESS_STORES)))
+  (ior (not (match_test TARGET_SSE2))
+   (match_test optimize_function_for_size_p (cfun)))
 (const_string V4SF)
  ]
- (const_string sseinsnmode)))])
+ (const_string sseinsnmode)))])
 
 (define_insn sse2_movq128
   [(set (match_operand:V2DI 0 register_operand =x)
@@ -610,18 +617,16 @@
(set_attr prefix maybe_vex)
(set (attr mode)
(cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-(if_then_else
-   (match_test GET_MODE_SIZE (MODEmode)  16)
-   (const_string V8SF)
-   (const_string V4SF))
+(const_string ssePSmode)
+  (and (eq_attr alternative 1)
+   (match_test TARGET_SSE_TYPELESS_STORES))
+(const_string ssePSmode)
   (match_test TARGET_AVX)
 (const_string MODE)
-  (ior (match_test optimize_function_for_size_p (cfun))
-   (and (eq_attr alternative 1)
-(match_test TARGET_SSE_TYPELESS_STORES)))
-(const_string V4SF)
+  (match_test optimize_function_for_size_p (cfun))
+(const_string V4SF)
  ]
-   (const_string MODE)))])
+ (const_string MODE)))])
 
 (define_expand sse2_movdquavxsizesuffix
   [(set (match_operand:VI1 0 nonimmediate_operand)
@@ -658,18 +663,16 @@
(set_attr prefix maybe_vex)
(set (attr mode)
(cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-(if_then_else
-   (match_test GET_MODE_SIZE (MODEmode)  16)
-   (const_string V8SF)
-   (const_string V4SF))
+(const_string ssePSmode)
+  (and (eq_attr alternative 1)
+   (match_test TARGET_SSE_TYPELESS_STORES))
+(const_string ssePSmode)
   (match_test TARGET_AVX)
 (const_string sseinsnmode)
-  (ior (match_test 

patch for PR53125

2012-05-10 Thread Vladimir Makarov

The following patch is for PR53125.  The PR is described on
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53125.

The patch improves the compilation speed by 35% for the case.

The patch was successfully bootstrapped on x86-64.

Committed as rev. 187373.

2012-05-10  Vladimir Makarov vmaka...@redhat.com

PR rtl-optimization/53125
* ira.c (ira): Call find_moveable_pseudos and
move_unallocated_pseudos if only ira_conflicts_p is true.



[i386] New testcase (was: [rtl, patch] combine concat+shuffle)

2012-05-10 Thread Marc Glisse

Hello,

could an i386 maintainer take a look at the following testcase?

gcc/testsuite/ChangeLog
2012-05-08  Marc Glisse  marc.gli...@inria.fr

* gcc.target/i386/shuf-concat.c: New test.


--- gcc.target/i386/shuf-concat.c   (revision 0)
+++ gcc.target/i386/shuf-concat.c   (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -msse2 -mfpmath=sse" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+
+v2df f(double d,double e){
+  v2df x={-d,d};
+  v2df y={-e,e};
+  return __builtin_ia32_shufpd(x,y,1);
+}
+
+/* { dg-final { scan-assembler-not "\tv?shufpd\t" } } */
+/* { dg-final { scan-assembler-times "\tv?unpcklpd\t" 1 } } */


The conversation on this patch started at 
http://gcc.gnu.org/ml/gcc-patches/2012-05/msg00504.html



On Tue, 8 May 2012, Marc Glisse wrote:


On Tue, 8 May 2012, Richard Sandiford wrote:


Marc Glisse marc.gli...@inria.fr writes:

Here is a new version.

gcc/ChangeLog
2012-05-08  Marc Glisse  marc.gli...@inria.fr

* simplify-rtx.c (simplify_binary_operation_1): Optimize shuffle
of concatenations.


OK, thanks.  I'll leave an x86 maintainer to review the testcase,
but it looks like it'll need some markup to ensure an SSE target.


Oops, I'd thought about that, then completely forgot. For 64 bits, it always 
works. For 32 bits, it requires -msse2 -mfpmath=sse (without -mfpmath=sse we 
can still test for shufpd, but apparently not unpcklpd, I could remove that 
second test if people prefer, as it isn't important). Since this is a 
compile-only test, I think this would be enough:


/* { dg-options "-O -msse2 -mfpmath=sse" } */


Note to self: if you want to grep for shuf in the asm, don't put shuf
in the name of the file...


Yeah :-)  For MIPS tests I tend to add \t to the beginning of the regexp.
(And to the end if possible.)


Good idea. I was trying to make the check as wide as possible, but that's not 
so useful. Attached a new version of the testcase.


--
Marc Glisse


Re: Symbol table 20/many: cleanup of cgraph_remove_unreachable_nodes

2012-05-10 Thread Jan Hubicka
Hi,
after some thought, the changes to omp-low are not as obviously harmless as I
originally thought.  So I decided to handle this in a separate patch.  This patch
simply makes cgraph not release bodies of artificial functions, which papers
over the problem in an easier way.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* cgraph.h (cgraph_remove_unreachable_nodes): Rename to ...
(symtab_remove_unreachable_nodes): ... this one.
* ipa-cp.c (ipcp_driver): Do not remove unreachable nodes.
* cgraphunit.c (ipa_passes): Update.
* cgraphclones.c (cgraph_materialize_all_clones): Update.
* cgraph.c (cgraph_release_function_body): Only turn initial
into error mark when initial was previously set.
* ipa-inline.c (ipa_inline): Update.
* ipa.c: Include ipa-inline.h
(enqueue_cgraph_node, enqueue_varpool_node): Remove.
(enqueue_node): New function.
(process_references): Update.
(symtab_remove_unreachable_nodes): Cleanup.
* passes.c (execute_todo, execute_one_pass): Update.
Index: cgraph.c
===
*** cgraph.c(revision 187335)
--- cgraph.c(working copy)
*** cgraph_release_function_body (struct cgr
*** 1162,1168 
/* If the node is abstract and needed, then do not clear DECL_INITIAL
   of its associated function function declaration because it's
   needed to emit debug info later.  */
!   if (!node->abstract_and_needed)
      DECL_INITIAL (node->symbol.decl) = error_mark_node;
  }
  
--- 1162,1168 
/* If the node is abstract and needed, then do not clear DECL_INITIAL
   of its associated function function declaration because it's
   needed to emit debug info later.  */
!   if (!node->abstract_and_needed && DECL_INITIAL (node->symbol.decl))
      DECL_INITIAL (node->symbol.decl) = error_mark_node;
  }
  
Index: cgraph.h
===
*** cgraph.h(revision 187335)
--- cgraph.h(working copy)
*** int compute_call_stmt_bb_frequency (tree
*** 637,643 
  void record_references_in_initializer (tree, bool);
  
  /* In ipa.c  */
! bool cgraph_remove_unreachable_nodes (bool, FILE *);
  cgraph_node_set cgraph_node_set_new (void);
  cgraph_node_set_iterator cgraph_node_set_find (cgraph_node_set,
   struct cgraph_node *);
--- 637,643 
  void record_references_in_initializer (tree, bool);
  
  /* In ipa.c  */
! bool symtab_remove_unreachable_nodes (bool, FILE *);
  cgraph_node_set cgraph_node_set_new (void);
  cgraph_node_set_iterator cgraph_node_set_find (cgraph_node_set,
   struct cgraph_node *);
Index: ipa-cp.c
===
*** ipa-cp.c(revision 187335)
--- ipa-cp.c(working copy)
*** ipcp_driver (void)
*** 2445,2451 
struct cgraph_2edge_hook_list *edge_duplication_hook_holder;
struct topo_info topo;
  
-   cgraph_remove_unreachable_nodes (true,dump_file);
ipa_check_create_node_params ();
ipa_check_create_edge_args ();
grow_next_edge_clone_vector ();
--- 2445,2450 
Index: cgraphunit.c
===
*** cgraphunit.c(revision 187335)
--- cgraphunit.c(working copy)
*** ipa_passes (void)
*** 1836,1842 
   because TODO is run before the subpasses.  It is important to remove
   the unreachable functions to save works at IPA level and to get LTO
   symbol tables right.  */
!   cgraph_remove_unreachable_nodes (true, cgraph_dump_file);
  
/* If pass_all_early_optimizations was not scheduled, the state of
   the cgraph will not be properly updated.  Update it now.  */
--- 1836,1842 
   because TODO is run before the subpasses.  It is important to remove
   the unreachable functions to save works at IPA level and to get LTO
   symbol tables right.  */
!   symtab_remove_unreachable_nodes (true, cgraph_dump_file);
  
/* If pass_all_early_optimizations was not scheduled, the state of
   the cgraph will not be properly updated.  Update it now.  */
*** compile (void)
*** 1962,1968 
  
/* This pass remove bodies of extern inline functions we never inlined.
   Do this later so other IPA passes see what is really going on.  */
!   cgraph_remove_unreachable_nodes (false, dump_file);
cgraph_global_info_ready = true;
if (cgraph_dump_file)
  {
--- 1962,1968 
  
/* This pass remove bodies of extern inline functions we never inlined.
   Do this later so other IPA passes see what is really going on.  */
!   symtab_remove_unreachable_nodes (false, dump_file);
cgraph_global_info_ready = true;
if (cgraph_dump_file)
  {
*** compile (void)
*** 1987,1993 

Re: [PATCH, i386] V4DF __builtin_shuffle

2012-05-10 Thread Marc Glisse

Any comment?

On Mon, 30 Apr 2012, Marc Glisse wrote:


Ping?

http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01034.html

Since then, I've run a c,c++ bootstrap and:
make -k check RUNTESTFLAGS=--target_board=my-sde-sim
where my-sde-sim is the dejagnu board posted by H.J. Lu to run tests inside 
Intel's simulator, no difference between before and after my patch.
(If I understand correctly, the testsuite always compiles the AVX and AVX2 
tests, and uses cpuid (which I expect the simulator must fake) to determine 
if it should run them, so I don't need to pass any extra flag in 
RUNTESTFLAGS. If I am wrong, please tell me.)


Adding in Cc: the 2 people who kindly commented on the other shuffle patch 
(the one that isn't finished).


On Tue, 17 Apr 2012, Marc Glisse wrote:


Hello,

this patch expands __builtin_shuffle for V4DF mode in at most 3 insns. It is 
simple and works really well, often generating only 2 insns. It is not very 
generic, because other modes don't have an instruction equivalent to 
vshufpd. For V8SF (and likely V4DI and V8SI with AVX2, but I still need to 
do that), the default case of my patch in PR 52607 seems more interesting.


I tried calling this new function after expand_vec_perm_vperm2f128_vblend 
(instead of before as in the patch), but it generated more instructions for 
some permutations, and never less. That function is still useful for V8SF 
though.


I bootstrapped gcc on a non-avx platform, compiled a program that tests all 
4096 shuffles with -mavx/-mavx2, and ran the result using Intel's emulator 
(SDE).
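
The test program itself is not shown in this thread. A minimal sketch of the
idea, assuming that "4096" counts four selectors each picking one of the eight
elements of two V4DF inputs (8^4), could look like the code below. ref_shuffle
and check_all_shuffles are hypothetical names; a real test would compare the
compiler's __builtin_shuffle result against such a reference instead of
checking the reference against itself.

```c
#include <assert.h>

/* Reference semantics of a two-operand, four-element shuffle: each
   selector picks one of the eight elements of A followed by B.  */
static void
ref_shuffle (const double a[4], const double b[4], const int sel[4],
             double out[4])
{
  int i;

  for (i = 0; i < 4; i++)
    {
      int idx = sel[i] & 7;
      out[i] = idx < 4 ? a[idx] : b[idx - 4];
    }
}

/* Walk all 8^4 = 4096 selector combinations.  The inputs hold the
   values 0..7, so element I of the result must equal selector I.
   Returns the number of combinations checked.  */
static int
check_all_shuffles (void)
{
  static const double a[4] = { 0, 1, 2, 3 };
  static const double b[4] = { 4, 5, 6, 7 };
  int mask, i, checked = 0;

  for (mask = 0; mask < 4096; mask++)
    {
      int sel[4];
      double out[4];

      /* Decode three bits per selector.  */
      for (i = 0; i < 4; i++)
        sel[i] = (mask >> (3 * i)) & 7;

      ref_shuffle (a, b, sel, out);
      for (i = 0; i < 4; i++)
        assert (out[i] == (double) sel[i]);
      checked++;
    }
  return checked;
}
```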


There are still a few V4DF permutations that don't generate an optimal 
sequence (3 insn instead of 2), but not that many I think. Of course, I am 
assuming a constant cost of 1 per insn, which is completely false, but 
seems like a sensible first approximation.


(note that I can't commit)


2012-04-17  Marc Glisse  marc.gli...@inria.fr

PR target/52607
* config/i386/i386.c (ix86_expand_vec_perm_const): Move code to ...
(canonicalize_perm): ... new function.
(expand_vec_perm_2vperm2f128_vshuf): New function.
(ix86_expand_vec_perm_const_1): Call it.


--
Marc Glisse


Re: Missing guard in ira-color.c ?

2012-05-10 Thread Vladimir Makarov

On 05/10/2012 09:10 AM, Tristan Gingold wrote:

Hi,

I am getting a segfault in ira-color.c:2945 on the trunk:

Program received signal SIGSEGV, Segmentation fault.
0x00a79f37 in move_spill_restore () at ../../src/gcc/ira-color.c:2945
2945  || ira_reg_equiv_const[regno] != NULL_RTX
(gdb) l
2940  /* don't do the optimization because it can create
2941 copies and the reload pass can spill the allocno set
2942 by copy although the allocno will not get memory
2943 slot.  */
2944  || ira_reg_equiv_invariant_p[regno]
2945  || ira_reg_equiv_const[regno] != NULL_RTX
2946  || !bitmap_bit_p (loop_node-border_allocnos, ALLOCNO_NUM 
(a)))
2947continue;
2948  mode = ALLOCNO_MODE (a);
2949  rclass = ALLOCNO_CLASS (a);

while building gcc (gnatcmd.adb file) for ia64-vms using a cross compiler 
(target=ia64-vms, host=x86_64-linux).

The reason looks to be an out of bounds access:

(gdb) print regno
$10 = 18476
(gdb) print ira_reg_equiv_len
$11 = 17984

(I suppose this setup is not easy at all to reproduce, but I can provide any 
files, if necessary).

Tristan, thanks for reporting this.

Wild guess, as I don't know IRA at all:  looks like in this file most accesses 
to ira_reg_equiv_* are guarded.  Is it expected that they aren't at this point ?

Yes, I guess so.  It is possible to have pseudos that are out of range of
ira_reg_equiv_const.  Such an error should be hard to reproduce because
these pseudos are generated when we need to break a circular dependence
(e.g. when hard register 1 should be moved to hard register 2 and hard
register 2 to hard register 1).


Your solution is perfectly fine.  So you can commit the patch into the 
trunk as pre-approved.


Thanks again.


[I am currently trying with the following chunk:

--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -2941,8 +2941,9 @@ move_spill_restore (void)
  copies and the reload pass can spill the allocno set
  by copy although the allocno will not get memory
  slot.  */
- || ira_reg_equiv_invariant_p[regno]
- || ira_reg_equiv_const[regno] != NULL_RTX
+ || (regno < ira_reg_equiv_len
+     && (ira_reg_equiv_invariant_p[regno]
+         || ira_reg_equiv_const[regno] != NULL_RTX))
   || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a)))
 continue;
   mode = ALLOCNO_MODE (a);
]
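
The guard in the chunk above can be illustrated in isolation. The sketch below
is not IRA code: toy_equiv_tables and has_equivalence_p are hypothetical
stand-ins for ira_reg_equiv_len, ira_reg_equiv_invariant_p and
ira_reg_equiv_const, and the point is only that the length check must come
first so the arrays are never indexed out of bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the equivalence tables; all names are hypothetical.  */
struct toy_equiv_tables
{
  int len;                      /* models ira_reg_equiv_len  */
  const bool *invariant_p;      /* models ira_reg_equiv_invariant_p  */
  const void *const *constant;  /* models ira_reg_equiv_const  */
};

/* Return true if REGNO has an invariant or constant equivalence.
   Pseudos created after the tables were sized (REGNO >= len) are
   treated as having none; short-circuit evaluation guarantees the
   arrays are not read for them.  */
static bool
has_equivalence_p (const struct toy_equiv_tables *t, int regno)
{
  assert (t != NULL && regno >= 0);
  return (regno < t->len
          && (t->invariant_p[regno] || t->constant[regno] != NULL));
}
```

With the values from the gdb session (regno 18476, table length 17984) the
guarded form returns false instead of reading past the end of the tables.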






Re: [ping] 3 pending patches

2012-05-10 Thread Richard Henderson

On 05/08/2012 01:29 AM, Eric Botcazou wrote:

Fix debug info of nested inline functions:
   http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00161.html


I'll leave this one for Jason.


Emit variable as size attribute in debug info:
   http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00422.html

Implement static stack checking on IA-64:
   http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00452.html


Both ok.


r~


Re: [C++ Patch] PR 53301

2012-05-10 Thread Paolo Carlini

Hi,

Yes, please. It feels as if the names are based more on the underlying
implementation of the macro than on anything else. Also, short names
are nice, but using MEM instead of MEMBER is a bit too short. The same
for OB for object and others.

PTR_OR_PTRMEM sounds to me like pointer or pointer to member, which
sounds redundant since a pointer to member is a pointer already.

And there is also TYPE_PTRMEM_P and TYPE_PTR_TO_MEMBER_P. From the
names it is not clear what is the difference. This could be
TYPE_PTR_TO_DATA_MEMBER and TYPE_PTR_TO_ANY_MEMBER. The few extra
chars help a lot to clarify the meaning.

Let's see if we can do something *now* ;) My concrete proposal would be:

TYPE_PTRMEM_P rename to TYPE_PTRDATAMEM_P (consistent with 
TYPE_PTRMEMFUNC_P)

TYPE_PTR_TO_MEMBER_P rename to TYPE_PTRMEM_P

and then finally

#define TYPE_PTR_OR_PTRMEM_P(NODE) \
(TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE))

and use it everywhere. Sounds like an improvement?
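
To make the proposed naming concrete, here is a toy model of how the
predicates would nest after the renames. It is only a sketch: the toy_type_kind
enum and toy_pointer_like_p are hypothetical, and the real macros operate on
tree nodes, not on an enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification used only for this illustration.  */
enum toy_type_kind
{
  TOY_INTEGER,
  TOY_POINTER,
  TOY_PTR_TO_DATA_MEMBER,
  TOY_PTR_TO_MEMBER_FUNCTION
};

/* Plain pointers.  */
#define TYPE_PTR_P(KIND) ((KIND) == TOY_POINTER)
/* Proposed new name for the pointer-to-data-member test.  */
#define TYPE_PTRDATAMEM_P(KIND) ((KIND) == TOY_PTR_TO_DATA_MEMBER)
/* Existing name for the pointer-to-member-function test.  */
#define TYPE_PTRMEMFUNC_P(KIND) ((KIND) == TOY_PTR_TO_MEMBER_FUNCTION)
/* Proposed meaning: pointer to any member, data or function.  */
#define TYPE_PTRMEM_P(KIND) \
  (TYPE_PTRDATAMEM_P (KIND) || TYPE_PTRMEMFUNC_P (KIND))
/* The combined predicate from the proposal.  */
#define TYPE_PTR_OR_PTRMEM_P(KIND) \
  (TYPE_PTR_P (KIND) || TYPE_PTRMEM_P (KIND))

/* Small wrapper so the combined predicate can be used as a function.  */
static bool
toy_pointer_like_p (enum toy_type_kind kind)
{
  assert (kind <= TOY_PTR_TO_MEMBER_FUNCTION);
  return TYPE_PTR_OR_PTRMEM_P (kind);
}
```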

Additionally, we could maybe rename PTRMEM_OK_P to PTRMEMFUNC_OK_P

Thanks,
Paolo.


Fix Mozilla LTO build

2012-05-10 Thread Jan Hubicka
Hi,
The Mozilla LTO build broke because symtab_remove_unreachable_nodes incorrectly
removes origins of clones in some special cases.

Bootstrapped/regtested x86_64-linux and committed.
Index: ipa.c
===
--- ipa.c   (revision 187375)
+++ ipa.c   (working copy)
@@ -310,12 +310,12 @@ symtab_remove_unreachable_nodes (bool be
 
   /* For non-inline clones, force their origins to the boundary and ensure
      that body is not removed.  */
-  while (cnode->clone_of && !cnode->clone_of->symbol.aux
+  while (cnode->clone_of
          && !gimple_has_body_p (cnode->symbol.decl))
     {
       bool noninline = cnode->clone_of->symbol.decl != cnode->symbol.decl;
       cnode = cnode->clone_of;
-      if (noninline && !cnode->symbol.aux)
+      if (noninline)
         {
           pointer_set_insert (body_needed_for_clonning, cnode->symbol.decl);
           enqueue_node ((symtab_node)cnode, &first, reachable);


Speed up inliner

2012-05-10 Thread Jan Hubicka
Hi,
this patch cuts 10 minutes of Mozilla compilation time that was spent
updating keys.
After Richi's removal of overall growth from the cost functions, we no longer
need to update that much.

Bootstrapped/regtested x86_64-linux and tested on Mozilla build. Committed.

Honza

* ipa-inline.c (update_all_callee_keys): Remove.
(inline_small_functions): Simplify priority updating.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 187375)
+++ ipa-inline.c(working copy)
@@ -1097,45 +1097,6 @@ update_callee_keys (fibheap_t heap, stru
   }
 }
 
-/* Recompute heap nodes for each of caller edges of each of callees.
-   Walk recursively into all inline clones.  */
-
-static void
-update_all_callee_keys (fibheap_t heap, struct cgraph_node *node,
-   bitmap updated_nodes)
-{
-  struct cgraph_edge *e = node-callees;
-  if (!e)
-return;
-  while (true)
-if (!e-inline_failed  e-callee-callees)
-  e = e-callee-callees;
-else
-  {
-   struct cgraph_node *callee = cgraph_function_or_thunk_node (e-callee,
-   NULL);
-
-   /* We inlined and thus callees might have different number of calls.
-  Reset their caches  */
-reset_node_growth_cache (callee);
-   if (e-inline_failed)
- update_caller_keys (heap, callee, updated_nodes, e);
-   if (e-next_callee)
- e = e-next_callee;
-   else
- {
-   do
- {
-   if (e-caller == node)
- return;
-   e = e-caller-callers;
- }
-   while (!e-next_callee);
-   e = e-next_callee;
- }
-  }
-}
-
 /* Enqueue all recursive calls from NODE into priority queue depending on
how likely we want to recursively inline the call.  */
 
@@ -1488,7 +1449,7 @@ inline_small_functions (void)
 at once. Consequently we need to update all callee keys.  */
  if (flag_indirect_inlining)
add_new_edges_to_heap (heap, new_indirect_edges);
-  update_all_callee_keys (heap, where, updated_nodes);
+  update_callee_keys (heap, where, updated_nodes);
}
   else
{
@@ -1527,18 +1488,7 @@ inline_small_functions (void)
  reset_edge_caches (edge-callee);
   reset_node_growth_cache (callee);
 
- /* We inlined last offline copy to the body.  This might lead
-to callees of function having fewer call sites and thus they
-may need updating. 
-
-FIXME: the callee size could also shrink because more information
-is propagated from caller.  We don't track when this happen and
-thus we need to recompute everything all the time.  Once this is
-solved, || 1 should go away.  */
- if (callee-global.inlined_to || 1)
-   update_all_callee_keys (heap, callee, updated_nodes);
- else
-   update_callee_keys (heap, edge-callee, updated_nodes);
+ update_callee_keys (heap, edge-callee, updated_nodes);
}
   where = edge-caller;
   if (where-global.inlined_to)
@@ -1551,11 +1501,6 @@ inline_small_functions (void)
 called by function we inlined (since number of it inlinable callers
 might change).  */
   update_caller_keys (heap, where, updated_nodes, NULL);
-
-  /* We removed one call of the function we just inlined.  If offline
-copy is still needed, be sure to update the keys.  */
-  if (callee != where  !callee-global.inlined_to)
-update_caller_keys (heap, callee, updated_nodes, NULL);
   bitmap_clear (updated_nodes);
 
   if (dump_file)


[PATCH, i386]: Avoid movaps size optimizations for TARGET_AVX

2012-05-10 Thread Uros Bizjak
Hello!

There is no point in emitting vmovaps instead of vmovapd or vmovdqa, since
these instructions have the same size. The attached patch fixes this oversight
for TARGET_AVX.

2012-05-11  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.md (*movti_internal_rex64): Avoid MOVAPS size
optimization for TARGET_AVX.
(*movti_internal_sse): Ditto.
(*movdi_internal_rex64): Handle TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.
(*movdi_internal): Ditto.
(*movsi_internal): Ditto.
(*movtf_internal): Avoid MOVAPS size optimization for TARGET_AVX.
(*movdf_internal_rex64): Ditto.
(*movdf_internal): Ditto.
(*movsf_internal): Ditto.
* config/i386/sse.md (movmode): Handle TARGET_SSE_LOAD0_BY_PXOR.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 187372)
+++ i386.md (working copy)
@@ -1890,12 +1890,15 @@
(set (attr mode)
(cond [(eq_attr alternative 0,1)
 (const_string DI)
-  (ior (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   (match_test optimize_function_for_size_p (cfun)))
+  (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
 (const_string V4SF)
   (and (eq_attr alternative 4)
(match_test TARGET_SSE_TYPELESS_STORES))
 (const_string V4SF)
+  (match_test TARGET_AVX)
+(const_string TI)
+  (match_test optimize_function_for_size_p (cfun))
+(const_string V4SF)
   ]
   (const_string TI)))])
 
@@ -1943,13 +1946,15 @@
   [(set_attr type sselog1,ssemov,ssemov)
(set_attr prefix maybe_vex)
(set (attr mode)
-   (cond [(ior (match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
-   (match_test optimize_function_for_size_p (cfun)))
+   (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
 (const_string V4SF)
   (and (eq_attr alternative 2)
(match_test TARGET_SSE_TYPELESS_STORES))
 (const_string V4SF)
-  (not (match_test TARGET_SSE2))
+  (match_test TARGET_AVX)
+(const_string TI)
+  (ior (not (match_test TARGET_SSE2))
+   (match_test optimize_function_for_size_p (cfun)))
 (const_string V4SF)
  ]
  (const_string TI)))])
@@ -1970,8 +1975,11 @@
return movdq2q\t{%1, %0|%0, %1};
 
 case TYPE_SSEMOV:
-  if (get_attr_mode (insn) == MODE_TI)
+  if (get_attr_mode (insn) == MODE_V4SF)
+   return %vmovaps\t{%1, %0|%0, %1};
+  else if (get_attr_mode (insn) == MODE_TI)
return %vmovdqa\t{%1, %0|%0, %1};
+
   /* Handle broken assemblers that require movd instead of movq.  */
   if (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1]))
return %vmovd\t{%1, %0|%0, %1};
@@ -2048,7 +2056,20 @@
  (if_then_else (eq_attr alternative 10,11,12,13,14,15)
(const_string maybe_vex)
(const_string orig)))
-   (set_attr mode SI,DI,DI,DI,SI,DI,DI,DI,DI,DI,TI,DI,TI,DI,DI,DI,DI,DI)])
+   (set (attr mode)
+   (cond [(eq_attr alternative 0,4)
+ (const_string SI)
+  (eq_attr alternative 10,12)
+ (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+  (const_string V4SF)
+(match_test TARGET_AVX)
+  (const_string TI)
+(match_test optimize_function_for_size_p (cfun))
+  (const_string V4SF)
+   ]
+   (const_string TI))
+ ]
+ (const_string DI)))])
 
 ;; Reload patterns to support multi-word load/store
 ;; with non-offsetable address.
@@ -2142,7 +2163,7 @@
case MODE_DI:
   return %vmovq\t{%1, %0|%0, %1};
case MODE_V4SF:
- return movaps\t{%1, %0|%0, %1};
+ return %vmovaps\t{%1, %0|%0, %1};
case MODE_V2SF:
  return movlps\t{%1, %0|%0, %1};
default:
@@ -2189,7 +2210,22 @@
  (if_then_else (eq_attr alternative 5,6,7,8)
(const_string maybe_vex)
(const_string orig)))
-   (set_attr mode DI,DI,DI,DI,DI,TI,DI,TI,DI,V4SF,V2SF,V4SF,V2SF,DI,DI)])
+   (set (attr mode)
+   (cond [(eq_attr alternative 9,11)
+ (const_string V4SF)
+  (eq_attr alternative 10,12)
+ (const_string V2SF)
+  (eq_attr alternative 5,7)
+ (cond [(match_test TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
+  (const_string V4SF)
+(match_test TARGET_AVX)
+  (const_string TI)
+(match_test optimize_function_for_size_p (cfun))
+  

[C++ Patch] PR 53305

2012-05-10 Thread Paolo Carlini

Hi,

an ICE on invalid (per Daniel's analysis): when r is NULL_TREE the next 
DECL_CONTEXT (r) can only crash. Plus a garbled error message because 
pp_cxx_simple_type_specifier doesn't handle BOUND_TEMPLATE_TEMPLATE_PARM.


Tested x86_64-linux.

Thanks,
Paolo.

///
/cp
2012-05-11  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53305
* pt.c (tsubst_copy: case PARM_DECL): Return error_mark_node if
tsubst_decl returns NULL_TREE.
* cxx-pretty-print.c (pp_cxx_simple_type_specifier): Handle
BOUND_TEMPLATE_TEMPLATE_PARM.

/testsuite
2012-05-11  Paolo Carlini  paolo.carl...@oracle.com

PR c++/53305
* g++.dg/cpp0x/variadic132.C: New.


Index: testsuite/g++.dg/cpp0x/variadic132.C
===
--- testsuite/g++.dg/cpp0x/variadic132.C(revision 0)
+++ testsuite/g++.dg/cpp0x/variadic132.C(revision 0)
@@ -0,0 +1,27 @@
+// PR c++/53305
+// { dg-do compile { target c++11 } }
+
+template<class... Ts> struct tuple { };
+
+struct funct
+{
+  template<class... argTs>
+  int operator()(argTs...);
+};
+
+template<class...> class test;
+
+template<template <class...> class tp,
+         class... arg1Ts, class... arg2Ts>
+class test<tp<arg1Ts...>, tp<arg2Ts...>>
+{
+  template<class func, class...arg3Ts>
+    auto test2(func fun, arg1Ts... arg1s, arg3Ts... arg3s)
+    -> decltype(fun(arg1s..., arg3s...));
+};
+
+int main()
+{
+  test<tuple<>, tuple<char,int>> t2;
+  t2.test2(funct(), 'a', 2);  // { dg-error "no matching function" }
+}
+}
Index: cp/cxx-pretty-print.c
===
--- cp/cxx-pretty-print.c   (revision 187376)
+++ cp/cxx-pretty-print.c   (working copy)
@@ -1261,6 +1261,7 @@ pp_cxx_simple_type_specifier (cxx_pretty_printer *
 case TEMPLATE_TYPE_PARM:
 case TEMPLATE_TEMPLATE_PARM:
 case TEMPLATE_PARM_INDEX:
+case BOUND_TEMPLATE_TEMPLATE_PARM:
   pp_cxx_unqualified_id (pp, t);
   break;
 
Index: cp/pt.c
===
--- cp/pt.c (revision 187376)
+++ cp/pt.c (working copy)
@@ -12084,6 +12084,8 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 not the following PARM_DECLs that are chained to T.  */
  c = copy_node (t);
  r = tsubst_decl (c, args, complain);
+ if (!r)
+   return error_mark_node;
  /* Give it the template pattern as its context; its true context
 hasn't been instantiated yet and this is good enough for
 mangling.  */


[PATCH, alpha]: Fix ICE in alpha_emit_conditional_move, at config/alpha/alpha.c:2649

2012-05-10 Thread Uros Bizjak
Hello!

Recently testsuite/gcc.c-torture/execute/ieee/pr50310.c started to ICE
when compiled with -O3 -mieee on alphaev68-pc-linux-gnu:

$ ~/gcc-build-alpha/gcc/cc1 -O3 -mieee -quiet pr50310.c
pr50310.c: In function ‘foo’:
pr50310.c:31:20: internal compiler error: in
alpha_emit_conditional_move, at config/alpha/alpha.c:2649
 s3[10 * 4 + i] = __builtin_isunordered (s1[i], s2[i]) ? -1.0 : 0.0;
^
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.

It turned out that UNORDERED and ORDERED RTX codes are not handled in
alpha_emit_conditional_move. Attached patch fixes this oversight.
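
The shape of the fix can be sketched outside of GCC. The code below is a toy
model, not the alpha backend: toy_code, toy_reverse_condition and
toy_split_compare are hypothetical names, and only four comparison codes are
modelled. It shows the rule the patch applies: a comparison the hardware
provides is used directly and its result tested with NE, while NE and ORDERED
are reversed and the result tested with EQ.

```c
#include <assert.h>

/* Toy subset of RTX comparison codes; names are hypothetical.  */
enum toy_code { TOY_EQ, TOY_NE, TOY_ORDERED, TOY_UNORDERED };

/* Model of reverse_condition for the codes above.  */
static enum toy_code
toy_reverse_condition (enum toy_code code)
{
  switch (code)
    {
    case TOY_EQ: return TOY_NE;
    case TOY_NE: return TOY_EQ;
    case TOY_ORDERED: return TOY_UNORDERED;
    case TOY_UNORDERED: return TOY_ORDERED;
    }
  assert (!"unhandled code");
  return code;
}

/* Split CODE into a comparison that is available (*CMP_CODE) and the
   test applied to its result (the return value): NE when the compare
   is used as is, EQ when it had to be reversed.  */
static enum toy_code
toy_split_compare (enum toy_code code, enum toy_code *cmp_code)
{
  switch (code)
    {
    case TOY_EQ:
    case TOY_UNORDERED:
      /* We have these compares.  */
      *cmp_code = code;
      return TOY_NE;

    case TOY_NE:
    case TOY_ORDERED:
      /* These must be reversed.  */
      *cmp_code = toy_reverse_condition (code);
      return TOY_EQ;
    }
  assert (!"unhandled code");
  return code;
}
```

For example, ORDERED becomes an UNORDERED compare whose result is tested with
EQ, which is the case the ICE was missing.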

2012-05-11  Uros Bizjak  ubiz...@gmail.com

* config/alpha/alpha.c (alpha_emit_conditional_branch): Handle
ORDERED and UNORDERED conditions.

Patch was bootstrapped and regression tested on
alphaev68-pc-linux-gnu.  OK for mainline SVN and release branches?

Uros.
Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c(revision 187371)
+++ config/alpha/alpha.c(working copy)
@@ -2335,7 +2335,7 @@ alpha_emit_conditional_branch (rtx operands[], enu
 {
 case EQ:  case LE:  case LT:  case LEU:  case LTU:
 case UNORDERED:
-  /* We have these compares: */
+  /* We have these compares.  */
   cmp_code = code, branch_code = NE;
   break;
 
@@ -2572,13 +2572,15 @@ alpha_emit_conditional_move (rtx cmp, enum machine
   switch (code)
{
case EQ: case LE: case LT: case LEU: case LTU:
+   case UNORDERED:
  /* We have these compares.  */
  cmp_code = code, code = NE;
  break;
 
case NE:
- /* This must be reversed.  */
- cmp_code = EQ, code = EQ;
+   case ORDERED:
+ /* These must be reversed.  */
+ cmp_code = reverse_condition (code), code = EQ;
  break;
 
case GE: case GT: case GEU: case GTU:
@@ -2627,11 +2629,13 @@ alpha_emit_conditional_move (rtx cmp, enum machine
   switch (code)
 {
 case EQ:  case LE:  case LT:  case LEU:  case LTU:
+case UNORDERED:
   /* We have these compares: */
   break;
 
 case NE:
-  /* This must be reversed.  */
+case ORDERED:
+  /* These must be reversed.  */
   code = reverse_condition (code);
   cmov_code = EQ;
   break;
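
The handling added by the patch above can be sketched outside the compiler. This is only an illustration under stated assumptions: the `enum cmp` codes, `reverse_cmp` and `split_condition` are hypothetical stand-ins and not GCC's `rtx_code`, `reverse_condition` or the Alpha backend itself. It shows the idea from the diff: codes the hardware compares directly are emitted as-is and tested for "not equal to zero", while NE and ORDERED are emitted as their reversed compare and tested for "equal to zero".

```c
/* Hypothetical stand-ins for the RTX comparison codes; not GCC's rtx_code.  */
enum cmp { CMP_EQ, CMP_NE, CMP_LE, CMP_LT, CMP_LEU, CMP_LTU,
           CMP_ORDERED, CMP_UNORDERED };

/* Reverse a condition, for the codes this sketch cares about.  */
static enum cmp
reverse_cmp (enum cmp code)
{
  switch (code)
    {
    case CMP_NE: return CMP_EQ;
    case CMP_EQ: return CMP_NE;
    case CMP_ORDERED: return CMP_UNORDERED;
    case CMP_UNORDERED: return CMP_ORDERED;
    default: return code;  /* Other codes are left alone in this sketch.  */
    }
}

/* Pick the compare to emit (*cmp_code) and the test applied to its
   result (*test_code).  Returns 0 for codes the sketch does not cover.  */
static int
split_condition (enum cmp code, enum cmp *cmp_code, enum cmp *test_code)
{
  switch (code)
    {
    case CMP_EQ: case CMP_LE: case CMP_LT: case CMP_LEU: case CMP_LTU:
    case CMP_UNORDERED:
      /* Directly available: compare, then test the result for nonzero.  */
      *cmp_code = code;
      *test_code = CMP_NE;
      return 1;

    case CMP_NE:
    case CMP_ORDERED:
      /* Not available: emit the reversed compare and test for zero.  */
      *cmp_code = reverse_cmp (code);
      *test_code = CMP_EQ;
      return 1;

    default:
      return 0;
    }
}
```

Under these assumptions, an ORDERED condition comes out as an UNORDERED compare whose result is tested with EQ, which mirrors the `reverse_condition (code)` lines in the diff.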


Re: [MIPS] Fix misspelled macro in t-vxworks

2012-05-10 Thread Mingjie Xing
2012/5/11 Richard Sandiford rdsandif...@googlemail.com:
 Mingjie Xing mingjie.x...@gmail.com writes:
 This patch fixes the misspelled macro in t-vxworks.  Is it OK?

 2012-05-10  Mingjie Xing  mingjie.x...@gmail.com

         * config/mips/t-vxworks: Change MUTLILIB_EXTRA_OPTS to
         MULTILIB_EXTRA_OPTS.

 OK, thanks.

 Richard

Committed revision 187392.

Regards,
Mingjie


Re: [C++ Patch] PR 53305

2012-05-10 Thread Gabriel Dos Reis
On Thu, May 10, 2012 at 6:40 PM, Paolo Carlini paolo.carl...@oracle.com wrote:
 Hi,

 an ICE on invalid (per Daniel's analysis): when r is NULL_TREE the next
 DECL_CONTEXT (r) can only crash. Plus a garbled error message because
 pp_cxx_simple_type_specifier doesn't handle BOUND_TEMPLATE_TEMPLATE_PARM.

 Tested x86_64-linux.

 Thanks,
 Paolo.

 ///

Stylistically, I would write

   if (r == NULL)

or

  if (r == NULL_TREE)

Patch OK with that change.

-- Gaby


PATCH: Add RTM support to -march=native

2012-05-10 Thread H.J. Lu
Hi,

This patch adds RTM support to -march=native.  Tested on Linux/x86-64.
OK for trunk?

Thanks.

H.J.
---
2012-05-10  H.J. Lu  hongjiu...@intel.com

* config/i386/driver-i386.c (host_detect_local_cpu): Support
RTM.

diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index 8fe7ab8..e93e8d9 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -397,7 +397,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_pclmul = 0, has_abm = 0, has_lwp = 0;
   unsigned int has_fma = 0, has_fma4 = 0, has_xop = 0;
   unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0;
-  unsigned int has_hle = 0;
+  unsigned int has_hle = 0, has_rtm = 0;
 
   bool arch;
 
@@ -458,6 +458,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
 
   has_bmi = ebx & bit_BMI;
   has_hle = ebx & bit_HLE;
+  has_rtm = ebx & bit_RTM;
   has_avx2 = ebx & bit_AVX2;
   has_bmi2 = ebx & bit_BMI2;
 }
@@ -731,10 +732,11 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   const char *sse4_1 = has_sse4_1 ? " -msse4.1" : " -mno-sse4.1";
   const char *lzcnt = has_lzcnt ? " -mlzcnt" : " -mno-lzcnt";
   const char *hle = has_hle ? " -mhle" : " -mno-hle";
+  const char *rtm = has_rtm ? " -mrtm" : " -mno-rtm";
 
   options = concat (options, cx16, sahf, movbe, ase, pclmul,
popcnt, abm, lwp, fma, fma4, xop, bmi, bmi2,
-   tbm, avx, avx2, sse4_2, sse4_1, lzcnt,
+   tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm,
hle, NULL);
 }
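
As a rough illustration of what the patch above does, the following sketch derives the `-mrtm`/`-mhle` options from an EBX value of CPUID leaf 7 (subleaf 0). It is a sketch under assumptions, not the driver code: `tsx_options` and the `SKETCH_BIT_*` macros are hypothetical names, the bit positions (HLE bit 4, RTM bit 11) are defined locally rather than taken from `<cpuid.h>`, and the EBX value is passed in so that the example does not need to execute CPUID.

```c
#include <stdio.h>

/* Feature bits in EBX of CPUID leaf 7, subleaf 0.  Defined locally so the
   sketch does not depend on <cpuid.h>.  */
#define SKETCH_BIT_HLE (1u << 4)
#define SKETCH_BIT_RTM (1u << 11)

/* Write " -mrtm"/" -mno-rtm" followed by " -mhle"/" -mno-hle" into BUF,
   depending on which feature bits are set in EBX.  */
static void
tsx_options (unsigned int ebx, char *buf, size_t size)
{
  unsigned int has_hle = ebx & SKETCH_BIT_HLE;
  unsigned int has_rtm = ebx & SKETCH_BIT_RTM;

  snprintf (buf, size, "%s%s",
            has_rtm ? " -mrtm" : " -mno-rtm",
            has_hle ? " -mhle" : " -mno-hle");
}
```

Each option string carries its own leading space, as in the patch, so the pieces can be concatenated without a separator.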
 


Re: [h8300] increase dwarf address size

2012-05-10 Thread Jeff Law

On 05/10/2012 11:21 AM, DJ Delorie wrote:

Regardless, shouldn't DWARF2_ADDR_SIZE be POINTER_SIZE / BITS_PER_UNIT?


That's the default.  It doesn't work because pointers are still 16 bits.

Something's still not right then.  The H8/300 has 16 bit pointers and a 
64k address space and all the processors in the family still support 
that mode.



Jeff



Re: [h8300] increase dwarf address size

2012-05-10 Thread DJ Delorie

  That's the default.  It doesn't work because pointers are still 16 bits.

 Something's still not right then.  The H8/300 has 16 bit pointers and a 
 64k address space and all the processors in the family still support 
 that mode.

The problem is when a single object is more than 64k for the models
that have 16 bit pointers.  No, they won't work on hardware, but we
can't even *compile* them because of the dwarf limitation, which means
all the other models can't support such large objects.

I.e. the 2-byte dwarf addresses for H8/300 prevent us from supporting
C++ on the larger H8/300SX processors.
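
The size limit described above comes down to simple arithmetic, sketched here with a hypothetical helper (`fits_in_addr_size` is not a GCC or DWARF API): a 2-byte address field holds values up to 0xFFFF, so any address or offset of 64k or more needs a wider field.

```c
#include <stdint.h>

/* Return 1 if VALUE can be stored in an address field of ADDR_SIZE bytes,
   0 otherwise.  Hypothetical helper for illustration only.  */
static int
fits_in_addr_size (uint64_t value, unsigned int addr_size)
{
  if (addr_size >= sizeof (uint64_t))
    return 1;
  return value < ((uint64_t) 1 << (addr_size * 8));
}
```

With this helper, 0xFFFF fits in 2 bytes, while 0x10000 does not and needs a 4-byte field.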


Re: [h8300] increase dwarf address size

2012-05-10 Thread Jeff Law

On 05/10/2012 09:55 PM, DJ Delorie wrote:

That's the default.  It doesn't work because pointers are still 16 bits.


Something's still not right then.  The H8/300 has 16 bit pointers and a
64k address space and all the processors in the family still support
that mode.


The problem is when a single object is more than 64k for the models
that have 16 bit pointers.  No, they won't work on hardware, but we
can't even *compile* them because of the dwarf limitation, which means
all the other models can't support such large objects.

I.e. the 2-byte dwarf addresses for H8/300 prevent us from supporting
C++ on the larger H8/300SX processors.

Right, so ISTM the way to fix this is to not build the C++ runtime for 
normal mode, rather than hack up the dwarf address size.


Now I know there's currently no way to do that, but that seems to me to 
be the correct fix.  Unfortunately it's going to probably look a whole 
lot like multilib exceptions...  E.


Jeff



Re: [h8300] increase dwarf address size

2012-05-10 Thread DJ Delorie

Whereas making dwarf addresses always 32 bits only affects debugging
info size (not rom image size) on the oldest and smallest H8/300
variant, where real-world code would have a limited amount of debug
information anyway.


Re: [C++ Patch] PR 53301

2012-05-10 Thread Jason Merrill

On 05/10/2012 05:31 PM, Paolo Carlini wrote:

Let's see if we can do something *now* ;) My concrete proposal would be:

TYPE_PTRMEM_P rename to TYPE_PTRDATAMEM_P (consistent with
TYPE_PTRMEMFUNC_P)
TYPE_PTR_TO_MEMBER_P rename to TYPE_PTRMEM_P

and then finally

#define TYPE_PTR_OR_PTRMEM_P(NODE) \
(TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE))

and use it everywhere. Sounds like an improvement?


Sounds pretty good.  But I suspect a lot of places want to check 
TYPE_PTR_P || TYPE_PTRDATAMEM_P (because you can't just use TREE_TYPE to 
get the function type of a PMF), so this new macro doesn't help with 
your desire to avoid writing ||.
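
To make the naming proposal and the objection concrete, here is a toy model. It assumes a hypothetical `struct type` with a `kind` field; GCC's real macros inspect tree nodes, and `SKETCH_PTR_OR_PTRDATAMEM_P` is an invented name for the combination mentioned above, not something proposed in the thread.

```c
/* Hypothetical type kinds; GCC's real macros inspect tree nodes instead.  */
enum type_kind { KIND_INT, KIND_POINTER, KIND_PTRDATAMEM, KIND_PTRMEMFUNC };

struct type { enum type_kind kind; };

#define TYPE_PTR_P(NODE)        ((NODE)->kind == KIND_POINTER)
#define TYPE_PTRDATAMEM_P(NODE) ((NODE)->kind == KIND_PTRDATAMEM)
#define TYPE_PTRMEMFUNC_P(NODE) ((NODE)->kind == KIND_PTRMEMFUNC)

/* Either flavor of pointer to member (the proposed new meaning).  */
#define TYPE_PTRMEM_P(NODE) \
  (TYPE_PTRDATAMEM_P (NODE) || TYPE_PTRMEMFUNC_P (NODE))

/* Plain pointer or any pointer to member, as proposed.  */
#define TYPE_PTR_OR_PTRMEM_P(NODE) \
  (TYPE_PTR_P (NODE) || TYPE_PTRMEM_P (NODE))

/* Invented name: plain pointer or pointer to data member only, which
   deliberately excludes pointers to member functions.  */
#define SKETCH_PTR_OR_PTRDATAMEM_P(NODE) \
  (TYPE_PTR_P (NODE) || TYPE_PTRDATAMEM_P (NODE))
```

In this model a pointer to member function satisfies `TYPE_PTR_OR_PTRMEM_P` but not `SKETCH_PTR_OR_PTRDATAMEM_P`, which is the distinction the reply points out.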



Additionally, we could maybe rename PTRMEM_OK_P to PTRMEMFUNC_OK_P


No, that flag applies to both varieties of members.

Jason


Re: [patch] support for multiarch systems

2012-05-10 Thread Matthias Klose
On 10.05.2012 08:42, Paolo Bonzini wrote:
 Il 09/05/2012 19:19, Matthias Klose ha scritto:
 these are referenced from the http://wiki.debian.org/Multiarch/Tuples
 https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout
 http://err.no/debian/amd64-multiarch-3

 http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases for
 multiarch, and why Debian thinks that the existing approaches are not 
 sufficient
 (having name collisions for different architectures or ad hoc names for new
 architectures like libx32).  That may be contentious within the Linux 
 community,
 but I would like to avoid this kind of discussion here.
 
 I don't care about contentiousness, I just would like this to be
 documented somewhere (for example in the internals manual where
 MULTILIB_* is documented too).

ok, I did clarify it in the existing documentation of MULTIARCH_DIRNAME in
fragments.texi, detailing the search order for the files. Should the search
order be mentioned in some user documentation as well? if yes, where?

  Matthias

Index: doc/fragments.texi
===
--- doc/fragments.texi  (revision 187337)
+++ doc/fragments.texi  (working copy)
@@ -152,6 +152,52 @@
 of options to be used for all builds.  If you set this, you should
 probably set @code{CRTSTUFF_T_CFLAGS} to a dash followed by it.
 
+@findex MULTILIB_OSDIRNAMES
+@item MULTILIB_OSDIRNAMES
+If @code{MULTILIB_OPTIONS} is used, this variable specifies the list
+of OS subdirectory names.  The format is either the same as of
+@code{MULTILIB_DIRNAMES}, or a set of mappings.  When it is the same
+as @code{MULTILIB_DIRNAMES}, it describes the multilib directories
+using OS conventions, rather than GCC conventions.  When it is a set
+of mappings of the form @var{gccdir}=@var{osdir}, the left side gives
+the GCC convention and the right gives the equivalent OS defined
+location.  If the @var{osdir} part begins with a @samp{!}, the os
+directory names are used exclusively.  Use the mapping when there is
+no one-to-one equivalence between GCC levels and the OS.
+
+For multiarch enabled configurations (see @code{MULTIARCH_DIRNAME}
+below), the multiarch name is appended to each directory name, separated
+by a colon (e.g. @samp{../lib:x86_64-linux-gnu}).
+
+@findex MULTIARCH_DIRNAME
+@item MULTIARCH_DIRNAME
+If @code{MULTIARCH_DIRNAME} is used, this variable specifies the
+multiarch name for this configuration.  For multiarch enabled
+configurations it is used to search libraries, crt files and system
+header files in additional locations.
+
+Libraries and crt files are searched first in
+@var{prefix}/@var{multiarch} before @var{prefix} for each @var{prefix}
+added by @code{add_prefix} or @code{add_sysrooted_prefix}.
+System header files are searched first in
+@code{LOCAL_INCLUDE_DIR}/@var{multiarch} before
+@code{LOCAL_INCLUDE_DIR}, and in
+@code{NATIVE_SYSTEM_HEADER_DIR}/@var{multiarch} before
+@code{NATIVE_SYSTEM_HEADER_DIR}.
+
+E.g. for a multiarch enabled system compiler
+@file{/lib/@var{multiarch}} is searched before @file{/lib} and
+@file{/usr/lib/@var{multiarch}} before @file{/usr/lib}, and system
+header files are searched in @file{/usr/local/include/@var{multiarch}}
+before @file{/usr/local/include} and in
+@file{/usr/include/@var{multiarch}} before @file{/usr/include}.
+
+@code{MULTIARCH_DIRNAME} is not used for multilib enabled
+configurations, but encoded in @code{MULTILIB_OSDIRNAMES} instead.
+
+The multiarch tuples are defined
+in @uref{http://wiki.debian.org/Multiarch/Tuples}.
+
 @findex SPECS
 @item SPECS
 Unfortunately, setting @code{MULTILIB_EXTRA_OPTS} is not enough, since
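
The search order described in the patch above can be sketched as follows. This is a minimal illustration, assuming a hypothetical helper `multiarch_search_order` and a fixed 128-byte buffer per entry; it is not how the GCC driver stores its prefix lists. For each prefix, the `prefix/multiarch` directory is listed before the plain prefix.

```c
#include <stdio.h>

/* Fill OUT with the search order for the N entries of PREFIXES: for each
   prefix, PREFIX/MULTIARCH comes before PREFIX.  OUT must have room for
   2 * N entries.  Returns the number of entries written.  */
static size_t
multiarch_search_order (const char *const *prefixes, size_t n,
                        const char *multiarch, char out[][128])
{
  size_t k = 0;

  for (size_t i = 0; i < n; i++)
    {
      snprintf (out[k++], 128, "%s/%s", prefixes[i], multiarch);
      snprintf (out[k++], 128, "%s", prefixes[i]);
    }
  return k;
}
```

Called with the prefixes `/lib` and `/usr/lib` and the multiarch name `x86_64-linux-gnu`, it yields `/lib/x86_64-linux-gnu`, `/lib`, `/usr/lib/x86_64-linux-gnu`, `/usr/lib`, matching the example in the documentation text.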


Go patch committed: Remove incorrect ChangeLog entry

2012-05-10 Thread Ian Lance Taylor
As described in gcc/go/README.gcc, the files in gcc/go/gofrontend are
copied from http://code.google.com/p/gofrontend .  Changes should be
committed there first, and mirrored to the GCC repository.  Also,
changes committed to that repository are not listed in
gcc/go/ChangeLog.  A recent patch updated gogo-tree.cc for the change in
the name of cgraph_finalize_compilation_unit.  I have applied that patch
to the gofrontend repository, and I have applied this patch to the GCC
repository to remove the incorrect ChangeLog entry.  Note that changes
to ChangeLog files do not themselves get ChangeLog entries.

Ian

Index: gcc/go/ChangeLog
===
--- gcc/go/ChangeLog	(revision 187393)
+++ gcc/go/ChangeLog	(working copy)
@@ -14,10 +14,6 @@
 	* gccgo.texi (Invoking gccgo): Document -fgo-pkgpath.  Update the
 	docs for -fgo-prefix.
 
-2012-04-30  Jan Hubicka  j...@suse.cz
-
-	* gogo-tree.cc (Gogo::write_globals): Use finalize_compilation_unit.
-
 2012-04-23  Ian Lance Taylor  i...@google.com
 
 	* go-lang.c (go_langhook_init): Set MPFR precision to 256.