Re: fold_builtin changes tree

2012-03-20 Thread Paulo J. Matos
On Mon, 19 Mar 2012 22:49:39 -0700, Ian Lance Taylor wrote:

> 
> I'm not sure what you are folding the builtin to, but perhaps you could
> retain a reference to the function.
> 

I am folding the function call __function_size(foobar) to a new symbol 
foobar@size. The reference to function foobar disappears. Can I keep a 
reference foobar attached to the symbol somehow?

> Or, you could write a tiny pass which set DECL_PRESERVED_P for each
> function passed to __function_size.
> 

Might do as a last resort.

> Or, perhaps you could handle it in expand_builtin rather than
> fold_builtin.
> 

I thought that the only way to replace a builtin expression was to use 
fold_builtin and expand_builtin had other purposes. How can I use the 
expand_builtin to do replacement?

Cheers,

Paulo Matos




-- 
PMatos



Re: Cloning functions

2012-03-20 Thread Martin Jambor
Hi,

On Tue, Mar 20, 2012 at 02:07:17PM +1100, Matt Davis wrote:
> Hello,
> In my transformation of an input program, I need to clone functions
> and the callee functions in each clone.  To clone a function, or
> create a duplicate, I use "cgraph_function_versioning()"  This works
> perfectly well for the parent function.  I then go through the
> statements in the parent and look for any function calls (callees).
> If I find a function call, I clone that function and update the call
> site using "gimple_call_set_fn()"  Now, when I dump the gimple via
> "debug_function()" I see everything as I expect (parent-clone calls
> all the callee-clones).  The parent and all of its callees are the
> clones I want.  However, when GCC finishes compiling things, the
> callee clones are nowhere to be found.

And do you change the calls in the callers of the "parent function?"
This is exactly what you would see if you don't.  See convert_callers
and convert_callers_for_node in tree-sra.c (ipa_modify_call_arguments
is probably equivalent to gimple_call_set_fndecl for your purposes).

cgraph_function_versioning only updates the call graph edges, not the
associated statements which are key in non-IPA stages of compilation
(yes, that makes the redirect_callers parameter quite misleading and
semi-irrelevant but well...).
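
To make the edges-versus-statements distinction concrete, here is a minimal toy model. It is only a sketch: CallStmt, CallEdge and the three helper functions are hypothetical names, not GCC's data structures or API. It assumes, as described above, that versioning redirects only the call graph edge, and that rebuilding the edges recomputes them from the statements.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical, not GCC's data structures.
struct CallStmt {
  std::string fndecl;  // callee named by the statement itself
};

struct CallEdge {
  std::string callee;  // callee recorded in the call graph
  CallStmt *stmt;      // statement this edge was built from
};

// Redirect only the edges, leaving the statements untouched.
void redirect_edges(std::vector<CallEdge> &edges, const std::string &from,
                    const std::string &to) {
  for (CallEdge &e : edges)
    if (e.callee == from) e.callee = to;
}

// Recompute every edge from its statement; edge-only redirections are lost.
void rebuild_edges_from_stmts(std::vector<CallEdge> &edges) {
  for (CallEdge &e : edges) e.callee = e.stmt->fndecl;
}

// Update the statements too, so that a later rebuild keeps the clone.
void redirect_stmts(std::vector<CallEdge> &edges, const std::string &from,
                    const std::string &to) {
  for (CallEdge &e : edges)
    if (e.stmt->fndecl == from) e.stmt->fndecl = to;
}
```

In this model, a rebuild after redirect_edges alone brings the original callee back, while a rebuild after redirect_stmts keeps the clone.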

> And the original (non-clone)
> callees are being used.  The parent-clone is there but all of the
> callsites are using the original callees and not the clones.  I know
> there must be some update routine, (rebuild_cgraph_edges() did not
> help) to glue the callee clones in place so that they do not revert
> back to the original callee.
> 
> I hope I haven't been too confusing, I do appreciate any help if possible.

I am confused by the term parent function.  Do you mean a parent of
nested functions or something else?  If so, at what stage of
compilation do you clone functions then?  I'm afraid that
cgraph_function_versioning was not written with un-lowered nested
functions in mind and is only usable once they are lowered (because I
just had a quick look and the relevant cgraph_node fields are not
dealt with).  But at the same time I think it is unlikely your pass
runs that early.

Martin


Re: fold_builtin changes tree

2012-03-20 Thread Jakub Jelinek
On Tue, Mar 20, 2012 at 10:21:45AM +, Paulo J. Matos wrote:
> > I'm not sure what you are folding the builtin to, but perhaps you could
> > retain a reference to the function.
> > 
> 
> I am folding the function call __function_size(foobar) to a new symbol 
> foobar@size. The reference to function foobar disappears. Can I keep a 
> reference foobar attached to the symbol somehow?

Folding it, at least too early, is definitely a bad idea, especially if you
set DECL_PRESERVED_P.  Because that will prevent the referenced function
from being removed, even if the function containing __function_size is
removed.  You really want to do it in the expander instead.

> I thought that the only way to replace a builtin expression was to use 
> fold_builtin and expand_builtin had other purposes. How can I use the 
> expand_builtin to do replacement?

Like any other builtin expander?  There are many dozens of examples in
builtins.c.  It is called with the tree argument, so you verify it, complain
if the argument is not the one you are expecting, and just expand it as the
symbol instead of expanding the call.  Basically you could do what you
currently do in the folder, and feed what you'd return from that to
expand_normal or expand_expr.

Jakub


Re: Reloading going wrong. Bug in GCC?

2012-03-20 Thread Mohamed Shafi
Ping! Can anyone help with http://gcc.gnu.org/ml/gcc/2011-09/msg00150.html (quoted below)?

shafi

On 14 September 2011 15:07, Mohamed Shafi  wrote:
> Hi,
>
> I am working on a 32-bit private target which has the following restrictions:
>
> 1. store/load can happen only through a general purpose register (GP_REGS)
> 2. base register should be an address register (AD_REGS)
> 3. moves between GP_REGS and AD_REGS can happen only through PT_REGS
>
> In a PRE_MODIFY instruction, when both the base register and the output
> register get spilled, the reloading goes wrong.
>
> before IRA pass
> ~~~
> (insn 259 336 317 2 ../rld_bug.c:94 (set (reg:QI 234 [+1 ])
>        (mem/s/j/c:QI (pre_modify:PQI (reg/f:PQI 233)
>                (plus:PQI (reg/f:PQI 233)
>                    (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
> (expr_list:REG_INC (reg/f:PQI 233)
>        (nil)))
>
> after IRA pass
> ~~~
> Reloads for insn # 259
> Reload 0: GP_REGS, RELOAD_FOR_OPADDR_ADDR (opnum = 1), can't combine,
> secondary_reload_p
>        reload_reg_rtx: (reg:PQI 11 g11)
> Reload 1: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
> combine, secondary_reload_p
>        reload_reg_rtx: (reg:PQI 12 as0)
>        secondary_in_reload = 0
> Reload 2: GP_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
> combine, secondary_reload_p
>        reload_reg_rtx: (reg:PQI 11 g11)
> Reload 3: PT_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), can't
> combine, secondary_reload_p
>        reload_reg_rtx: (reg:PQI 13 as1)
>        secondary_out_reload = 2
>
> Reload 4: reload_in (PQI) = (reg/f:PQI 233)
>        reload_out (PQI) = (reg/f:PQI 233)
>        AD_REGS, RELOAD_OTHER (opnum = 1)
>        reload_in_reg: (reg/f:PQI 233)
>        reload_out_reg: (reg/f:PQI 233)
>        reload_reg_rtx: (reg:PQI 31 a3)
>        secondary_in_reload = 1, secondary_out_reload = 3
>
> Reload 5: reload_out (QI) = (reg:QI 234 [+1 ])
>        GP_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
>        reload_out_reg: (reg:QI 234 [+1 ])
>        reload_reg_rtx: (reg:QI 11 g11)
>
>
> (insn 744 336 745 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
>        (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
>                (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])) 9 {movpqi_op} (nil))
>
> (insn 745 744 746 2 ../rld_bug.c:94 (set (reg:PQI 12 as0)
>        (reg:PQI 11 g11)) 9 {movpqi_op} (nil))
>
> (insn 746 745 259 2 ../rld_bug.c:94 (set (reg:PQI 31 a3)
>        (reg:PQI 12 as0)) 9 {movpqi_op} (nil))
>
> (insn 259 746 747 2 ../rld_bug.c:94 (set (reg:QI 11 g11)
>        (mem/s/j/c:QI (pre_modify:PQI (reg:PQI 31 a3)
>                (plus:PQI (reg:PQI 31 a3)
>                    (const_int 1 [0x1]))) [0+1 S1 A32])) 7 {movqi_op}
> (expr_list:REG_INC (reg:PQI 31 a3)
>        (nil)))
>
> (insn 747 259 748 2 ../rld_bug.c:94 (set (reg:PQI 13 as1)
>        (reg:PQI 31 a3)) 9 {movpqi_op} (nil))
>
> (insn 748 747 749 2 ../rld_bug.c:94 (set (reg:PQI 11 g11)
>        (reg:PQI 13 as1)) 9 {movpqi_op} (nil))
>
> (insn 749 748 750 2 ../rld_bug.c:94 (set (mem/c:PQI (plus:PQI (reg/f:PQI 32 sp)
>                (const_int -24 [0xffe8])) [99 %sfp+8 S1 A32])
>        (reg:PQI 11 g11)) 9 {movpqi_op} (nil))
>
> (insn 750 749 751 2 ../rld_bug.c:94 (set (mem/c:QI (plus:PQI (reg/f:PQI 32 sp)
>                (const_int -29 [0xffe3])) [99 %sfp+3 S1 A32])
>        (reg:QI 11 g11)) 7 {movqi_op} (nil))
>
>
> After the IRA pass, for insn 259 the modified address is first stored
> into its spill location, and then the modified value is stored. As you
> can see from the instructions, the same register (g11) is used for both
> Reload 5 and Reload 2, so the modified value is corrupted and the
> modified address gets stored instead of the modified value (insn 749
> and insn 750). I am not able to figure out where this goes wrong in
> the reload phase. I suspect that this is a GCC issue.
>
> Can someone give me some pointers to resolve this issue?
>
> Regards,
> Shafi
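
The clobbering described above can be replayed with a small simulation. This is only a sketch under stated assumptions: the Machine type, the slot labels, the numeric values and the alternative register g12 are all made up for illustration, and only the order of the moves follows insns 744 to 750 in the dump.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy machine state; register names follow the dump, values are made up.
struct Machine {
  std::map<std::string, int> reg;   // hard registers
  std::map<int, int> mem;           // memory, keyed by address
  std::map<std::string, int> slot;  // spill slots, keyed by a label
};

// Replays insns 744..750. `shared` selects whether the output reload of the
// address reuses g11 (as in the dump) or a hypothetical distinct register g12.
Machine replay(bool shared) {
  Machine m;
  const int old_addr = 100, loaded = 42;
  m.slot["addr(233)"] = old_addr;
  m.mem[old_addr + 1] = loaded;
  const std::string tmp = shared ? "g11" : "g12";

  m.reg["g11"] = m.slot["addr(233)"];  // insn 744: reload spilled address
  m.reg["as0"] = m.reg["g11"];         // insn 745
  m.reg["a3"] = m.reg["as0"];          // insn 746
  m.reg["a3"] += 1;                    // insn 259: pre_modify of the base
  m.reg["g11"] = m.mem[m.reg["a3"]];   // insn 259: the load itself
  m.reg["as1"] = m.reg["a3"];          // insn 747
  m.reg[tmp] = m.reg["as1"];           // insn 748: clobbers g11 when shared
  m.slot["addr(233)"] = m.reg[tmp];    // insn 749: store modified address
  m.slot["val(234)"] = m.reg["g11"];   // insn 750: store loaded value
  return m;
}
```

With the shared register, both spill slots end up holding the modified address. With a separate temporary, the value slot keeps the loaded value.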


Re: GCC 5? (was Re: GCC 4.7.0RC: Mangled names in cc1)

2012-03-20 Thread Ludovic Courtès
Hi Richard,

Richard Guenther  skribis:

> 2012/3/19 Ludovic Courtès :

[...]

>> In the example of name mangling, I’d just have wrapped in ‘extern "C"’
>> all the headers listed in ‘PLUGIN_HEADERS’ in gcc/Makefile.in.  The
>> rationale is that it simplifies plug-in maintenance, while not impeding
>> development work in 4.7.
>
> Well, that's _all_ headers.  Basically.

Well, these headers get installed, and they get installed to be actually
used, don’t they?  :-)

> And exactly the problem.  There will never be even API compatibility
> between major releases of GCC with the current plugin "API".

My experience is more encouraging: between 4.5 and 4.6, I was only hit
by a couple of tree.h declarations found in one and not the other.

When switching to 4.7, the main problem was mangled names, and all the
problems that making my code compilable with g++ entails.  Other issues
were the removal of the ‘built_in_decls’ array, and the new
‘affects_type_identity’ field of ‘attribute_spec’.

All this is summarized in the Autoconf macro I use [0]:

  dnl   build_call_expr_loc_array -- not in GCC 4.5.x; appears in 4.6
  dnl   build_call_expr_loc_vec   -- likewise
  dnl   build_array_ref           -- present but undeclared in 4.6.1
  dnl   build_zero_cst            -- not in GCC 4.5.x; appears in 4.6
  dnl   builtin_decl_explicit     -- new in 4.7, replaces `built_in_decls'
  dnl   .affects_type_identity    -- new field in 4.7

Then again, my plug-in is relatively small, and uses a small part of GCC.
Plug-ins with a larger API footprint may have more problems, of course.

Thanks,
Ludo’.

[0] 
https://gforge.inria.fr/scm/viewvc.php/trunk/m4/gcc.m4?view=markup&root=starpu


Re: Cloning functions

2012-03-20 Thread Matt Davis
Hi Martin, thanks very much for the information!

On Tue, Mar 20, 2012 at 9:29 PM, Martin Jambor  wrote:
> Hi,
>
> On Tue, Mar 20, 2012 at 02:07:17PM +1100, Matt Davis wrote:
>> Hello,
>> In my transformation of an input program, I need to clone functions
>> and the callee functions in each clone.  To clone a function, or
>> create a duplicate, I use "cgraph_function_versioning()"  This works
>> perfectly well for the parent function.  I then go through the
>> statements in the parent and look for any function calls (callees).
>> If I find a function call, I clone that function and update the call
>> site using "gimple_call_set_fn()"  Now, when I dump the gimple via
>> "debug_function()" I see everything as I expect (parent-clone calls
>> all the callee-clones).  The parent and all of its callees are the
>> clones I want.  However, when GCC finishes compiling things, the
>> callee clones are nowhere to be found.
>
> And do you change the calls in the callers of the "parent function?"
> This is exactly what you would see if you don't.  See convert_callers
> and convert_callers_for_node in tree-sra.c (ipa_modify_call_arguments
> is probably equivalent to gimple_call_set_fndecl for your purposes).

Actually, I do change the calls in the parent function.  What I had to
do was set the 'cfun' to the parent function, and then run
'rebuild_cgraph_edges()' and 'cleanup_tree_cfg()'.

> cgraph_function_versioning only updates the call graph edges, not the
> associated statements which are key in non-IPA stages of compilation
> (yes, that makes the redirect_callers parameter quite misleading and
> semi-irrelevant but well...).
>
>> And the original (non-clone)
>> callees are being used.  The parent-clone is there but all of the
>> callsites are using the original callees and not the clones.  I know
>> there must be some update routine, (rebuild_cgraph_edges() did not
>> help) to glue the callee clones in place so that they do not revert
>> back to the original callee.
>>
>> I hope I haven't been too confusing, I do appreciate any help if possible.
>
> I am confused by the term parent function.  Do you mean a parent of
> nested functions or something else?

Yep, you got it.

> If so, at what stage of
> compilation do you clone functions then?  I'm afraid that
> cgraph_function_versioning was not written with un-lowered nested
> functions in mind and is only usable once they are lowered (because I
> just had a quick look and the relevant cgraph_node fields are not
> dealt with).  But at the same time I think it is unlikely your pass
> runs that early.

Yes, my pass is really late, after all IPA passes have completed.  Once
again, thank you for your insight!

-Matt


Re: GCC 5? (was Re: GCC 4.7.0RC: Mangled names in cc1)

2012-03-20 Thread Richard Guenther
On Tue, Mar 20, 2012 at 12:47 PM, Ludovic Courtès
 wrote:
> Hi Richard,
>
> Richard Guenther  skribis:
>
>> 2012/3/19 Ludovic Courtès :
>
> [...]
>
>>> In the example of name mangling, I’d just have wrapped in ‘extern "C"’
>>> all the headers listed in ‘PLUGIN_HEADERS’ in gcc/Makefile.in.  The
>>> rationale is that it simplifies plug-in maintenance, while not impeding
>>> development work in 4.7.
>>
>> Well, that's _all_ headers.  Basically.
>
> Well, these headers get installed, and they get installed to be actually
> used, don’t they?  :-)
>
>> And exactly the problem.  There will never be even API compatibility
>> between major releases of GCC with the current plugin "API".
>
> My experience is more encouraging: between 4.5 and 4.6, I was only hit
> by a couple of tree.h declarations found in one and not the other.
>
> When switching to 4.7, the main problem was mangled names, and all the
> problems that making my code compilable with g++ entails.  Other issues
> were the removal of the ‘built_in_decls’ array, and the new
> ‘affects_type_identity’ field of ‘attribute_spec’.
>
> All this is summarized in the Autoconf macro I use [0]:
>
>  dnl   build_call_expr_loc_array -- not in GCC 4.5.x; appears in 4.6
>  dnl   build_call_expr_loc_vec   -- likewise
>  dnl   build_array_ref           -- present but undeclared in 4.6.1
>  dnl   build_zero_cst            -- not in GCC 4.5.x; appears in 4.6
>  dnl   builtin_decl_explicit     -- new in 4.7, replaces `built_in_decls'
>  dnl   .affects_type_identity    -- new field in 4.7
>
> Then again, my plug-in is relatively small, and uses a small part of GCC.
> Plug-ins with a larger API footprint may have more problems, of course.

I think it would be nice if you guys (plug-in makers) documented which parts
of the (non-)API you are currently using.  Document it on the GCC wiki,
for example.  That way, providing an initial guess for a real C plugin API
would be easier (and we'd get testing coverage).  I would even allow
such an API to be backported to the release branch(es) (given volunteers
to backport it).

We need to get started at some point - otherwise we will just keep repeating
these discussions.

If you have a copyright assignment on file (yeah, I guess even a set
of functions that just wrap existing gimple needs that) you might even
start implementing such an interface.  It might turn out to be a convenient
library for plugin developers first.

Thanks,
Richard.

> Thanks,
> Ludo’.
>
> [0] 
> https://gforge.inria.fr/scm/viewvc.php/trunk/m4/gcc.m4?view=markup&root=starpu


Re: pr52543

2012-03-20 Thread Kenneth Zadeck

Ian is certainly correct.

I think that the question is really bigger than finding the correct line
to fix.  The problem is that this code assumes that machines do not
have multiword moves or multiword shifts.  My machine has both, and I
assume that the avr and the neon have at least multiword moves (but I do
not know about the shifts).  And as life moves forward, more machines
will have these.


It seems like the right way to fix this is to somehow enhance the code
at the beginning of decompose_multiword_subregs to ask which modes are
not cheap to move or shift, and then modify the second loop to never
lower those operations for those modes.


The question is: do I add 2 more target hooks (one for shifting and one
for moves), or do I use the rtx_cost mechanism and split for anything
over COSTS_N_INSNS (1) or some such?


Kenny

On 03/20/2012 01:13 AM, Ian Lance Taylor wrote:
> Kenneth Zadeck writes:
>
>> I have figured out the root cause of pr52543, but I need some
>> advice as to how to fix it.
>> The bug only happens if the source or destination of the move is a
>> hard register.   lower-subreg never breaks up pseudo to pseudo moves
>> that are larger than word mode.   According to Richard Sandiford, this
>> bug also appears on the neon, but I do not know if there is a bugzilla
>> for it.   It also appears on my private port, which is why I am
>> interested in it.
>>
>> In the particular case of pr52543 and my port, this happens because
>> the input arguments are hard regs.
>>
>> The offending code is in can_decompose_p.   The problem is that if the
>> reg is a hard reg, it completely blows off all of the information that
>> it accumulated during the first pass and unconditionally splits the
>> register (assuming it is legal to do so).
>>
>> My question for the list: what is the predicate that we want to
>> replace the code that always decomposes hardregs (assuming it is
>> legal)?   In the case of the neon and my port, decomposing costs 4x
>> more than using a wide move.   I assume the avr is similar.
>
> I don't think can_decompose_p would be the right thing to change.  If
> that function returns false, resolve_simple_move is still going to split
> up the move.  You need to change resolve_simple_move.  In fact, looking
> at resolve_simple_move, I don't think it will break up the register
> unless it has already decided to do so:
>
>   /* If we didn't have any big SUBREGS of decomposed registers, and
>      neither side of the move is a register we are decomposing, then
>      we don't have to do anything here.  */
>
>   if (src == SET_SRC (set)
>       && dest == SET_DEST (set)
>       && !resolve_reg_p (src)
>       && !resolve_subreg_p (src)
>       && !resolve_reg_p (dest)
>       && !resolve_subreg_p (dest))
>     {
>       end_sequence ();
>       return insn;
>     }
>
> So I think you need to analyze this a bit more.  I don't think that is
> the offending code.
>
> Ian


Re: PRE_GCC3_DWARF_FRAME_REGISTERS

2012-03-20 Thread Aldy Hernandez

On 03/19/12 12:28, David Edelsohn wrote:
> On Wed, Mar 14, 2012 at 10:08 AM, Steven Bosscher wrote:
>> Hello,
>>
>> The rs6000 and cr16 backends and unwinding code have a define for the
>> DWARF frame register for pre-GCC3 compatibility
>> (PRE_GCC3_DWARF_FRAME_REGISTERS):
>>
>> gcc/doc/tm.texi.in:@defmac PRE_GCC3_DWARF_FRAME_REGISTERS
>> gcc/doc/tm.texi:@defmac PRE_GCC3_DWARF_FRAME_REGISTERS
>> gcc/config/rs6000/rs6000.h:#define PRE_GCC3_DWARF_FRAME_REGISTERS 77
>>
>> Is this compatibility still needed for rs6000, or can all the
>> PRE_GCC3_DWARF_FRAME_REGISTERS stuff be cleaned up?
>
> I do not see any reason that compatibility with pre-GCC3 still should
> be necessary.  The code was added by RTH and Aldy in 2001, so I want
> to make sure they are not aware of any remaining dependency.
>
> Thanks, David


I'm not aware of any.


Re: pr52543

2012-03-20 Thread Ian Lance Taylor
Kenneth Zadeck  writes:

> I think that the question is really bigger than finding the correct
> line to fix.   The problem is, that this code assumes that machines do
> not have multiword moves or multiword shifts.   My machine has both,
> and i assume that the avr and the neon have at least multiword moves
> (but i do not know about the shifts).   And as life moves forward,
> more machines will have these.
>
> It seems like the right way to fix this is to somehow enhance the code
> at the beginning of decompose_multiword_subregs to ask which modes are
> not cheap to move or shift and then modify the second loop to never
> lower for those operations for those modes.
>
> The question is do i add 2 more target hooks (one for shifting and one
> for moves) or do i use the rtx_cost mechanism and split for anything
> over COSTS_N_INSNS (1) or some such?

Why not use REGISTER_MOVE_COST?  You only care about hard registers, and
all that matters are moves between hard registers and pseudo-regs.  So
if you find a hard register REG, and if

  register_move_cost (GET_MODE (REG), REGNO_REG_CLASS (REGNO (REG)),
                      REGNO_REG_CLASS (REGNO (REG)))
    == 2

then put the pseudo reg into non_decomposable_context.  This would be in
find_decomposable_subregs.

Ian


Re: pr52543

2012-03-20 Thread Kenneth Zadeck
I actually care about all registers, not just the hard ones.  As it
turns out, I had been wrong: lower-subreg splits pseudo to pseudo
moves, as well as hard reg to and from pseudo moves.


register_move_cost requires the regclasses.

Anyway, that is not the right thing to do for the shifts.

kenny

On 03/20/2012 09:40 AM, Ian Lance Taylor wrote:
> Kenneth Zadeck writes:
>
>> I think that the question is really bigger than finding the correct
>> line to fix.   The problem is, that this code assumes that machines do
>> not have multiword moves or multiword shifts.   My machine has both,
>> and i assume that the avr and the neon have at least multiword moves
>> (but i do not know about the shifts).   And as life moves forward,
>> more machines will have these.
>>
>> It seems like the right way to fix this is to somehow enhance the code
>> at the beginning of decompose_multiword_subregs to ask which modes are
>> not cheap to move or shift and then modify the second loop to never
>> lower for those operations for those modes.
>>
>> The question is do i add 2 more target hooks (one for shifting and one
>> for moves) or do i use the rtx_cost mechanism and split for anything
>> over COSTS_N_INSNS (1) or some such?
>
> Why not use REGISTER_MOVE_COST?  You only care about hard registers, and
> all that matters are moves between hard registers and pseudo-regs.  So
> if you find a hard register REG, and if
>
>   register_move_cost (GET_MODE (REG), REGNO_REG_CLASS (REGNO (REG)),
>                       REGNO_REG_CLASS (REGNO (REG)))
>     == 2
>
> then put the pseudo reg into non_decomposable_context.  This would be in
> find_decomposable_subregs.
>
> Ian


Re: pr52543

2012-03-20 Thread Ian Lance Taylor
Kenneth Zadeck  writes:

> I actually care about all registers, not just the hard ones.  As it
> turns out, I had been wrong: lower-subreg splits pseudo to pseudo
> moves, as well as hard reg to and from pseudo moves.
>
> register_move_cost requires the regclasses.
>
> anyway that is not the right thing to do for the shifts.

I suppose you could compare costs.

Two new target hooks is not impossible, but the information ought to be
available in other ways.

Ian
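
As a rough illustration of what "compare costs" could mean, here is a minimal sketch. The ModeCosts struct and should_lower_move are hypothetical names, not GCC code. The only thing taken from GCC is the COSTS_N_INSNS scaling, where N instructions cost N * 4.

```cpp
#include <cassert>

// Mirrors GCC's COSTS_N_INSNS scaling (N instructions cost N * 4);
// everything else in this block is hypothetical.
constexpr int costs_n_insns(int n) { return n * 4; }

// Hypothetical per-mode cost summary that a target could supply.
struct ModeCosts {
  int words;           // number of word_mode pieces in the wide mode
  int wide_move_cost;  // cost of one move in the wide mode
  int word_move_cost;  // cost of one word_mode move
};

// Lower a wide move only when the word-sized pieces are strictly cheaper.
bool should_lower_move(const ModeCosts &c) {
  return c.words * c.word_move_cost < c.wide_move_cost;
}
```

For a target with a native four-word move costing one instruction, the four word moves cost 4x as much, so the move is left alone. The same comparison could be made for shifts with a separate pair of costs.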


Re: subreg:HI of PSI HW register issue

2012-03-20 Thread Aurelien Buhrig
09/03/2012 17:10, Bernd Schmidt:
> On 03/09/2012 04:20 PM, Aurelien Buhrig wrote:
>> I'm not used to working at tree level yet and it is unclear to me what
>> part of the code should be tweaked. Can you tell me which part of the
>> code you are fixing/looking at, so that I can have a better
>> understanding of ptr_mode vs Pmode before your fix?
> 
> I'm thinking the bitfield code in expmed.c (extract_bit_field_1 etc.)
> needs a subreg_offset_representable_p check, followed by looking for the
> next larger integer mode if that fails. Then, if necessary, cast the
> result back to the original mode in case of extraction.
> 
> 
> Bernd


I'm working on a fix for this issue in store_bit_field_1, but it is not
possible to know in advance whether a pseudo register in PSI mode will be
reloaded into a register which has all its word_mode subregs
representable.

So if the problem is not addressed during reload, it seems the only
solution is to force both extract and store operations on pointers into
an integer mode if Pmode is a MODE_PARTIAL_INT mode.

I did a quick fix for my target for the store operations in
store_bit_field_1, but I wonder whether this is the right place to fix it
(as opposed to, e.g., when defining the assignment type in the tree).
Any advice?


There is another point which confuses me about ptr_mode. I thought
ptr_mode was the integer mode which would contain Pmode (so SImode if
Pmode=PSI). But it is defined in the same class as Pmode
(MODE_PARTIAL_INT here). So for my target, ptr_mode is the same
as Pmode. So why make a distinction between Pmode and ptr_mode?



Aurélien



Re: A problem related to const rvalue

2012-03-20 Thread Jonathan Wakely
2012/3/20  :
> Is it a bug or by design? Who can answer the question for me?

This list is for discussing the development of GCC, not for help using
it, so this is the wrong mailing list for your question. It would be
more appropriate on the gcc-help mailing list; please take any
follow-up there, thanks.

I believe G++ is correct, the relevant text is in [basic.lval] where
the standard says that "Class prvalues can have cv-qualified types;
non-class prvalues always have cv-unqualified types."  The result of
the function f2() is a prvalue, so has type int, with no
const-qualification.

If you repeat the experiment with functions returning a class type
instead of int then you should see the behaviour you expect, where f(
f2() ) will call f(const X&&).
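
A minimal sketch of that experiment follows. The original test case is not quoted here, so the overload set and the name g2 are assumptions made for illustration; only f and f2 are names taken from the discussion.

```cpp
#include <cassert>
#include <string>

struct X {};

const int f2() { return 0; }  // non-class prvalue: the const is dropped
const X g2() { return X(); }  // class prvalue: the const is kept

// Each overload reports which parameter type was selected.
std::string f(int &&) { return "int&&"; }
std::string f(const int &&) { return "const int&&"; }
std::string f(X &&) { return "X&&"; }
std::string f(const X &&) { return "const X&&"; }
```

Calling f( f2() ) selects the int&& overload, while f( g2() ) selects the const X&& overload, since f(X&&) cannot bind to a const prvalue.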


Re: peephole2+dead reg info after reload?

2012-03-20 Thread Aurelien Buhrig

>> Hi,
>>
>> I'm trying to make some peephole2 optimizations work (gcc 4.6.1), but
>> it seems the REG_DEAD information is lost during or after reload.
>>
>> In the following peephole2 definition, peep2_reg_dead_p returns false,
>> whereas REG_DEAD information is correctly set before reload for
>> operands[0] on the second insn:
>>
>> (define_peephole2
>>   [(set (match_operand:SI 0 "nonimmediate_operand" "")
>>         (sign_extend:SI (match_operand:HI 1 "general_operand" "")))
>>    (set (match_operand:PSI 2 "nonimmediate_operand" "")
>>         (truncate:PSI (match_dup 0)))]
>>   "peep2_reg_dead_p (2, operands[0])"
>>   [(set (match_dup 2) (sign_extend:PSI (match_dup 1)))]
>>   "")
> 
> This issue doesn't seem to be related to LOCAL_REGNO like this one:
> http://gcc.gnu.org/ml/gcc/2010-10/msg00305.html
> I have no register window.
> 
> In the following example, after the peephole2 pass, operands[0] of the
> above peephole is R0 (insns 17&19), one of the CALL_USED_REGISTERS, which
> is used as function arg and function value, and is in a
> CLASS_LIKELY_SPILLED_P class.
> 
> (insn 17 15 19 2 (set (reg:SI 0 r0)
> (sign_extend:SI (reg:HI 0 r0))) {*extendhisi2_call}
>  (expr_list:REG_EQUAL (sign_extend:SI (reg:HI 0 r0 [orig:76
> MEM[(unsigned char[4] *)k_3(D) + 4B]+2 ] [76]))
> (nil)))
> 
> (insn 19 17 21 2 (set (reg/f:PSI 8 a1 [81])
> (truncate:PSI (reg:SI 0 r0 [78]))) {truncsipsi2}
>  (nil))
> 
> (insn 21 19 111 2 (set (reg/f:PSI 8 a1 [81])
> (plus:PSI (reg/f:PSI 8 a1 [81])
> (symbol_ref:PSI ("S") [flags 0x2]   S>))) {*addpsi3_1}
>  (expr_list:REG_EQUAL (plus:PSI (reg:PSI 79)
> (symbol_ref:PSI ("S") [flags 0x2]   S>))
> (nil)))
> 
> (insn 111 21 22 2 (set (reg:QI 3 r3 [82])
> (mem/s/j:QI (reg/f:PSI 8 a1 [81]) [0 S S1 A8])) {movqi}
>  (nil))
> 
> (insn 22 111 23 2 (set (reg:QI 3 r3 [82])
> (xor:QI (reg:QI 3 r3 [82])
> (mem/s/j:QI (reg/v/f:PSI 10 a3 [orig:74 k ] [74]) [0
> *k_3(D)+0 S1 A8]))) {xorqi3}
>  (nil))
> 
> (insn 23 22 24 2 (set (reg:HI 0 r0)
> (reg/v:HI 2 r2 [orig:75 rd ] [75])) {movhi_1}
>  (nil))
> 
> (insn 24 23 26 2 (set (reg:SI 0 r0)
> (sign_extend:SI (reg:HI 0 r0))) {*extendhisi2_call}
>  (expr_list:REG_EQUAL (sign_extend:SI (reg/v:HI 2 r2 [orig:75 rd ]
> [75]))
> (nil)))
> 
> (insn 26 24 28 2 (set (reg/f:PSI 8 a1 [86])
> (truncate:PSI (reg:SI 0 r0 [83]))) {truncsipsi2}
>  (nil))
> 
> 
> And before reload, insn 17&19 look like:
> 
> (insn 17 16 18 2 (set (reg:SI 0 r0)
> (sign_extend:SI (reg:HI 0 r0))) {*extendhisi2_call}
>  (expr_list:REG_EQUAL (sign_extend:SI (reg:HI 25 [ D.2124 ]))
> (nil)))
> 
> (insn 18 17 19 2 (set (reg:SI 53)
> (reg:SI 0 r0)) {*movsi_split}
>  (expr_list:REG_EQUAL (sign_extend:SI (reg:HI 25 [ D.2124 ]))
> (nil)))
> 
> (insn 19 18 20 2 (set (reg:PSI 54)
> (truncate:PSI (reg:SI 53))) {truncsipsi2}
>  (expr_list:REG_DEAD (reg:SI 53)
> (nil))
> 
>>
>> What am I missing?
>>
>> Thanks,
>> Aurélien
> 

Ping?

I noticed that the mcore port implements mcore_is_dead, which scans the
following insns to try to figure out whether a reg is actually dead.
Does that mean gcc is not always able to find dead regs? Is there a way to
make it work?
Aurélien
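
For reference, the kind of forward scan meant here can be sketched as a toy model. This is not the mcore implementation: the Insn struct and dead_after are hypothetical, and the scan is deliberately conservative (anything it cannot prove dead is treated as live).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: registers read, registers written,
// and whether control leaves the straight-line sequence at this point.
struct Insn {
  std::vector<int> uses;
  std::vector<int> sets;
  bool ends_block;
};

static bool contains(const std::vector<int> &v, int r) {
  for (int x : v)
    if (x == r) return true;
  return false;
}

// Conservative forward scan: reg is dead after insns[start] only if a later
// insn overwrites it before any insn reads it. Anything unclear means "live".
bool dead_after(const std::vector<Insn> &insns, std::size_t start, int reg) {
  for (std::size_t i = start + 1; i < insns.size(); ++i) {
    if (contains(insns[i].uses, reg)) return false;  // read first: live
    if (contains(insns[i].sets, reg)) return true;   // overwritten first: dead
    if (insns[i].ends_block) return false;           // cannot see further
  }
  return false;  // ran off the end without proof
}
```

Applied to a simplified encoding of the post-reload sequence above (insns 17, 19, 21, 111, 22, 23), such a scan reports r0 as dead after insn 19, because insn 23 overwrites it before anything reads it.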


GCC 4.6.3 successful build on i386-apple-darwin10.8.0

2012-03-20 Thread Espen Trydal
i386-apple-darwin10.8.0

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/Cellar/gcc/4.6.3/bin/../libexec/gcc/i386-apple-darwin10.8.0/4.6.3/lto-wrapper
Target: i386-apple-darwin10.8.0
Configured with: ../gcc-4.6.3/configure
--prefix=/usr/local/Cellar/gcc46 --enable-languages=all,ada :
(reconfigured) ../gcc-4.6.3/configure --prefix=/usr/local/Cellar/gcc46
--enable-languages=all,ada --disable-multilib
Thread model: posix
gcc version 4.6.3 (GCC)

Mac OS X 10.6.8 Snow Leopard 32bit

Darwin macbook.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7
16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386

The host/target specific installation notes did not include my
configuration, and I needed to configure with --disable-multilib to
compile; otherwise the build fails with an error when entering the
libquadmath directory.

Cheers,
Espen


dep question in sched-deps.c

2012-03-20 Thread p z

Hello, 
 
I am confused by the following piece of code in sched-deps.c. My understanding
is that last_pending_memory_flush only holds jumps, calls and memory writes. So I
think the two invocations of add_dependence should build a true dependence, not
an anti dependence.
 
 
 
for (u = deps->last_pending_memory_flush; u; u = XEXP (u, 1))
  {
    if (! NON_FLUSH_JUMP_P (u))
      add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);       /* <--- */
    else if (deps_may_trap_p (x))
      {
        if ((sched_deps_info->generate_spec_deps)
            && sel_sched_p () && (spec_info->mask & BEGIN_CONTROL))
          {
            ds_t ds = set_dep_weak (DEP_ANTI, BEGIN_CONTROL,
                                    MAX_DEP_WEAK);
            note_dep (XEXP (u, 0), ds);
          }
        else
          add_dependence (insn, XEXP (u, 0), REG_DEP_ANTI);   /* <--- */
      }
  }
 
 
Also, can you explain the following comment in sched-deps.c? I don't quite
understand what it means and what NON_FLUSH_JUMP_KIND is for.
 
/* In deps->last_pending_memory_flush marks JUMP_INSNs that weren't
   added to the list because of flush_pending_lists, stands just
   for itself and not for any other pending memory reads/writes.  */
 
 
 
I would also like more discussion of DEPS_LIST and INSN_LIST. Maxim once kindly
explained to me,
"DEPS_LIST is a super-set of INSN_LIST. I kept INSN_LIST-style dependencies to
avoid overhead on targets that don't need additional features of DEPS_LIST. Now
that I look back at it, I should have removed INSN_LIST-style dependencies; I
still hope to find time and clean that up (remove support for INSN_LIST-style
dependencies)." But it is still over my head. My questions are: what are
INSN_LIST-style dependencies, and what are the extra features of DEPS_LIST as a
super-set of INSN_LIST?
 
thanks