gcc-4.1-20070903 is now available

2007-09-03 Thread gccadmin
Snapshot gcc-4.1-20070903 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20070903/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch 
revision 128061

You'll find:

gcc-4.1-20070903.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.1-20070903.tar.bz2 C front end and core compiler

gcc-ada-4.1-20070903.tar.bz2  Ada front end and runtime

gcc-fortran-4.1-20070903.tar.bz2  Fortran front end and runtime

gcc-g++-4.1-20070903.tar.bz2  C++ front end and runtime

gcc-java-4.1-20070903.tar.bz2 Java front end and runtime

gcc-objc-4.1-20070903.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.1-20070903.tar.bz2  The GCC testsuite

Diffs from 4.1-20070827 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Richard Guenther
On 9/3/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> On 9/3/07, Richard Guenther <[EMAIL PROTECTED]> wrote:
> > On 9/3/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> > > On 9/2/07, Paul Brook <[EMAIL PROTECTED]> wrote:
> > > > > > That said, second, my understanding of restrict, from reading the
> > > > > > c99 standard, is that it is perfectly valid for restrict pointers
> > > > > > to alias each other during *loads*..  IE you can guarantee any
> > > > > > restricted pointer that is stored into can't alias the other
> > > > > > restricted pointers.  Those only used for loads can alias each
> > > > > > other.
> > > > >
> > > > > How does it make a difference?  If for two pointers that are only
> > > > > loaded from we say they don't alias I cannot imagine a transformation
> > > > > that would get anything wrong.
> > >
> > > Easy answer: Dependence testing and then loop transforms.
> > >
> > > Given p[i] = a[i] + b[i], if you claim a and b can't alias, you will
> > > now claim that a[i] and b[i] don't access the same memory on the same
> > > iteration.
> > >
> > > We could easily use this and some profit estimation to decide to, say,
> > > change the iteration space of a but not b, which will break since they
> > > really do alias, and breaking this is bad because they are allowed to.
> >
> > Eh?  Maybe I'm blind, but how can a change in iteration space that is
> > valid for the non-aliasing case be invalid for the aliasing case _if we
> > do not modify any data_?
>
> You may be right, but it just means we have to be very careful where
> we use the data if there are no modifications.
> I'm not sure of the best way to go about this.  Right now, I attached
> restrict info to SSA_NAME's, and we use it in
> access_can_touch_variable, may_alias_p, and the dataref version of
> this.

I think this should be ok.

> Sadly though, it also means we can't use restricted pointers to say
> anything about non-restricted pointers unless there is modification
> either.
>
> IE int foo(int *a, int *restrict b) doesn't guarantee a and b don't alias
> unless there is a modification of one of them.

Well, that is probably to make handling of contrived "derivations" of
a restrict pointer conservatively correct.  That is, for example

 int *restrict p;
 int pi = (int)p;
 *(int *)pi = 1;

just assuming that there are derivations that PTA cannot deal with.
The way the standard is written it's just for a safe default.
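
For anyone who wants to compile the fragment above, here is a minimal
sketch.  It assumes the round trip goes through uintptr_t rather than int
(so it also works where pointers are wider than int), and the function name
store_via_integer is made up for illustration.  As I read C99 6.7.3.1, the
store is still "based on" p even though an analysis that only follows
pointer copies would lose track of it.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: store through a pointer that was laundered through
   an integer, then read the object back through the original pointer.  */
static int store_via_integer(int *restrict p)
{
    assert(p != NULL);
    uintptr_t pi = (uintptr_t)(void *)p;  /* derivation hidden in an integer */
    *(int *)(void *)pi = 1;               /* modifies the object p points to */
    return *p;
}
```

Calling it on an int that holds 0 returns 1 and leaves 1 in the object.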

Richard.


Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Daniel Berlin
On 9/3/07, Richard Guenther <[EMAIL PROTECTED]> wrote:
> On 9/3/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> > On 9/2/07, Paul Brook <[EMAIL PROTECTED]> wrote:
> > > > > That said, second, my understanding of restrict, from reading the c99
> > > > > standard, is that it is perfectly valid for restrict pointers to alias
> > > > > each other during *loads*..  IE you can guarantee any restricted
> > > > > pointer that is stored into can't alias the other restricted pointers.
> > > > >  Those only used for loads can alias each other.
> > > >
> > > > How does it make a difference?  If for two pointers that are only
> > > > loaded from we say they don't alias I cannot imagine a transformation
> > > > that would get anything wrong.
> >
> > Easy answer: Dependence testing and then loop transforms.
> >
> > Given p[i] = a[i] + b[i], if you claim a and b can't alias, you will
> > now claim that a[i] and b[i] don't access the same memory on the same
> > iteration.
> >
> > We could easily use this and some profit estimation to decide to, say,
> > change the iteration space of a but not b, which will break since they
> > really do alias, and breaking this is bad because they are allowed to.
>
> Eh?  Maybe I'm blind, but how can a change in iteration space that is
> valid for the non-aliasing case be invalid for the aliasing case _if we
> do not modify any data_?

You may be right, but it just means we have to be very careful where
we use the data if there are no modifications.
I'm not sure of the best way to go about this.  Right now, I attached
restrict info to SSA_NAME's, and we use it in
access_can_touch_variable, may_alias_p, and the dataref version of
this.

Sadly though, it also means we can't use restricted pointers to say
anything about non-restricted pointers unless there is modification
either.

IE int foo(int *a, int *restrict b) doesn't guarantee a and b don't alias
unless there is a modification of one of them.
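
To make the distinction concrete, here is a small sketch.  The function
names sum_both and add_into are invented for illustration, and the comments
reflect the reading of C99 discussed in this thread, not a ruling.

```c
#include <assert.h>
#include <stddef.h>

/* Reads only: nothing is modified through a or b, so under the reading
   above callers may pass overlapping or even identical pointers.  */
static int sum_both(const int *a, const int *restrict b, size_t n)
{
    assert(a != NULL && b != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i] + b[i];
    return total;
}

/* Stores through b: the object b points to is now modified, so it must not
   also be accessed through a.  Callers have to pass non-overlapping
   arrays.  */
static void add_into(const int *a, int *restrict b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        b[i] += a[i];
}
```

Calling sum_both(v, v, n) is the case the compiler still has to get right;
calling add_into(v, v, n) would be undefined.
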
--Dan


Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Tim Prince
Mark Mitchell wrote:
> Joseph S. Myers wrote:
> 
>> The rules that unmodified memory may alias were a deliberate change in the 
>> FDIS relative to the previous public draft; see 
>> :
> 
> That explains why I had no memory of this, despite having researched
> "restrict" pretty carefully in earlier drafts.  That also explains why
> other compilers in the field implement "restrict" as meaning "does not
> alias", independent of what's modified.
> 
> Danny, does your more comprehensive treatment of "restrict" still
> optimize test cases like the one in the PR I filed?
> 
Test cases of mine that fail to optimize involve a scalar argument,
performing arithmetic between a scalar and an array.  In the C case,
-fargument-noalias enables the optimization, which the restrict keyword
does not.  Using an explicit local copy of the argument should
work as well.  Could that be a reason for avoiding the optimization?
I think the likely use of restrict by various compilers comes into play
only when a possibly aliased variable is modified, as pointed out
previously.  Such optimization could produce "broken" results anyway
when array bounds violations occur, so I guess there's no way to check,
other than by extending bounds checking.
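
A sketch of the kind of case I mean, under the assumption that the scalar
arrives by reference (as it would from Fortran); the function names are
made up for illustration and this is not one of the actual test cases.

```c
#include <assert.h>
#include <stddef.h>

/* Without aliasing information the compiler must assume a store to x[i]
   may overwrite *scale, so *scale is reloaded on every iteration.  */
static void scale_by_ref(float *x, const float *scale, size_t n)
{
    assert(x != NULL && scale != NULL);
    for (size_t i = 0; i < n; i++)
        x[i] *= *scale;
}

/* Explicit local copy: the scalar is read once, and no store to x[] can
   change the local.  */
static void scale_local_copy(float *x, const float *scale, size_t n)
{
    assert(x != NULL && scale != NULL);
    const float s = *scale;
    for (size_t i = 0; i < n; i++)
        x[i] *= s;
}
```

The two functions agree whenever scale does not point into x[]; they differ
only in what the compiler is allowed to assume.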


Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Paul Brook
> In any case, I guess we should consider my patch withdrawn.  Although,
> if the new meaning of "restrict" matches standard Fortran semantics,
> then our Fortran handling must be wrong, since all my patch did was make
> us match our current Fortran semantics.

In Fortran the pointers are not exposed at the language level, so it's
probably much harder to construct cases where this matters.

Paul


Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Mark Mitchell
Joseph S. Myers wrote:

> The rules that unmodified memory may alias were a deliberate change in the 
> FDIS relative to the previous public draft; see 
> :

That explains why I had no memory of this, despite having researched
"restrict" pretty carefully in earlier drafts.  That also explains why
other compilers in the field implement "restrict" as meaning "does not
alias", independent of what's modified.

Danny, does your more comprehensive treatment of "restrict" still
optimize test cases like the one in the PR I filed?

In any case, I guess we should consider my patch withdrawn.  Although,
if the new meaning of "restrict" matches standard Fortran semantics,
then our Fortran handling must be wrong, since all my patch did was make
us match our current Fortran semantics.

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


About allocating registers for instrumentation

2007-09-03 Thread 吴曦
Hi, I am working on gcc-4.1.1 and the Itanium architecture. So far I
have finished instrumenting ld and st instructions before the second
scheduling pass by reserving two global registers in the backend. However,
to improve performance (e.g. to allow better scheduling),
I want to allocate two registers for each instrumentation instead of
using the reserved ones. To identify which registers I can use for
each ld and st instruction, I use the following idea:

For each insn, I compute its live-in and live-out sets, starting from the
basic block. Since we can get the live-in set of the basic block, for
INSN(N) in the basic block:
  (1) live-in[ INSN(N) ] = live-out[ INSN(N-1) ]
  (2) live-out[ INSN(N) ] = (live-in[ INSN(N) ] U set)
                            - (REG_DEAD U REG_UNUSED)

where set is the set of registers set by the insn, and REG_DEAD and
REG_UNUSED can be obtained from the insn notes.

Then, R-( live-in[INSN(N)] U live-out[INSN(N)] ) is the set of
registers I can use to instrument INSN(N). (here R is a set of
registers I specified, for example, all the caller-save global general
registers)

Am I right? If there is anything I have misunderstood, please
point it out. Thanks!

Further, how do I identify set in (2)? I have found that many of the insns
just before the second scheduling pass have only one set in them. If this
holds for all insns, I think I can use single_set to get it. Are
there any exceptions to that? Thanks again.
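
To make rules (1) and (2) concrete, here is a toy sketch that is
independent of GCC's own data structures. It assumes register sets fit in a
64-bit mask (too small for Itanium's full register file), and struct
insn_info and free_regs_per_insn are names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

typedef uint64_t regset;      /* bit r set => register r is in the set */

struct insn_info {            /* hypothetical stand-in for one insn */
    regset set;               /* registers written by the insn */
    regset dead_or_unused;    /* union of its REG_DEAD and REG_UNUSED notes */
};

/* Walk one basic block forward from its live-in set. For each insn, store
   in free_out[i] the registers from pool that are in neither the insn's
   live-in nor its live-out set, i.e. R - (live-in U live-out).  */
static void free_regs_per_insn(const struct insn_info *insns, size_t n,
                               regset bb_live_in, regset pool,
                               regset *free_out)
{
    assert(n == 0 || (insns != NULL && free_out != NULL));
    regset live_in = bb_live_in;
    for (size_t i = 0; i < n; i++) {
        /* rule (2) */
        regset live_out = (live_in | insns[i].set) & ~insns[i].dead_or_unused;
        free_out[i] = pool & ~(live_in | live_out);
        live_in = live_out;   /* rule (1) for the next insn */
    }
}
```

For a block with live-in {r0, r1}, an insn that sets r2 and kills r0,
followed by an insn that sets r3 and kills r2, and a pool of r0..r5, this
gives {r3, r4, r5} for the first insn and {r0, r4, r5} for the second.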

Wu


has_volatile_ops and early optimization w/o alias information

2007-09-03 Thread Richard Guenther

We set has_volatile_ops on all(?) memory references during early
optimization because we don't have alias information.  But we
do it inconsistently for loads.  For example I see

  D.2574_23 = *D.2573_22;

(no volatile) and

  D.2565_28 ={v} tab[D.2560_27].__delta;

(volatile).  Because for indirect references we also check

  /* Aliasing information is missing; mark statement as
     volatile so we won't optimize it out too actively.  */
  else if (!gimple_aliases_computed_p (cfun)
           && (flags & opf_def))
    s_ann->has_volatile_ops = true;

so only add has_volatile_ops if we would create a DEF.  Now the
other place is in the generic add_virtual_operand like

  if (aliases == NULL)
    {
      if (!gimple_aliases_computed_p (cfun))
        s_ann->has_volatile_ops = true;

and so also marks loads.  Which one is safe?  I suppose it is safe
to DCE loads even without alias information?  So I'd add the check
for a DEF also in the generic add_virtual_operand code.
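
As a standalone model of the two behaviours (not GCC code): OPF_USE and
OPF_DEF below are hypothetical stand-ins for the opf_* flags, the boolean
parameter stands in for gimple_aliases_computed_p (cfun), and the
aliases == NULL condition of the generic path is left out.

```c
#include <assert.h>
#include <stdbool.h>

enum { OPF_USE = 0, OPF_DEF = 1 };  /* hypothetical stand-ins for opf_* */

/* Generic add_virtual_operand path as it is now: every reference is
   marked while aliases are not computed, loads included.  */
static bool marks_volatile_generic(bool aliases_computed, int flags)
{
    assert(flags == OPF_USE || flags == OPF_DEF);
    return !aliases_computed;
}

/* Indirect-reference path, and the generic path with the extra check:
   mark only when the operand would create a DEF.  */
static bool marks_volatile_def_only(bool aliases_computed, int flags)
{
    assert(flags == OPF_USE || flags == OPF_DEF);
    return !aliases_computed && (flags & OPF_DEF) != 0;
}
```

The two predicates differ only for a load seen before aliases are computed,
which is exactly the inconsistency above.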

Thanks for clarification.

Richard.


Re: question about rtl loop-iv analysis

2007-09-03 Thread Dorit Nuzman
Zdenek's patch here -
http://gcc.gnu.org/ml/gcc-patches/2007-08/msg02291.html - solved the
problem.
Kenny, Zdenek - many thanks for solving this issue!

dorit

> "Seongbae Park (박성배, 朴成培)" <[EMAIL PROTECTED]> wrote on
> 29/08/2007 01:01:42:
>
> > On 8/28/07, Zdenek Dvorak <[EMAIL PROTECTED]> wrote:
> > ...
> > > that obviously is not the case here, though.  Do you (or someone else
> > > responsible for df) have time to have a look at this problem?
> > > Otherwise, we may discuss it forever, but we will not get anywhere.
> > >
> > > Zdenek
> >
> > Open a PR and assign it to me, if you're not in a hurry -
> > I should be able to look at it next week.
>
> Thanks! This is PR33224
>
> dorit
>
> > --
> > #pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";
>



Re: DFA Scheduler - unable to pipeline loads

2007-09-03 Thread Maxim Kuvyrkov

Matt Lee wrote:

> Hi,
>
> I am working with GCC-4.1.1 on a simple scalar RISC processor with a
> 5-stage pipeline, using the following description for loads and stores:
>
> (define_insn_reservation "integer" 1
>   (eq_attr "type" "branch,jump,call,arith,darith,icmp,nop")
>   "issue,iu,wb")
>
> (define_insn_reservation "load" 3
>   (eq_attr "type" "load")
>   "issue,iu,wb")
>
> (define_insn_reservation "store" 1
>   (eq_attr "type" "store")
>   "issue,iu,wb")
>
> I am seeing poor scheduling in Dhrystone where a memcpy call is
> expanded inline.
>
> memcpy (&dst, &src, 16) ==>
>
> load  1, rA + 4
> store 1, rB + 4
> load  2, rA + 8
> store 2, rB + 8
> ...


I agree with Adam that this is most probably an aliasing issue.  Take a
look at the dependency map (you can get it with the -fsched-verbose=6 flag
on your command line).



--
Maxim


Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Joseph S. Myers
On Sun, 2 Sep 2007, Mark Mitchell wrote:

> Daniel Berlin wrote:
> 
> > Again, I'd love to just ignore this and say "we don't care".
> 
> Ugh.  I think you're right that the standard says that we only get to
> assume non-aliasing when the pointed-to memory is modified, so
> all-parameters-restrict is actually weaker than -fargument-noalias.  How
> unfortunate.
> 
> I've CC'd Joseph in the hopes that his C standards knowledge will
> suggest a different answer.

The rules that unmodified memory may alias were a deliberate change in the 
FDIS relative to the previous public draft; see 
:

 24  1.  The FCD specification of restrict forbids aliasing of
 25  unmodified objects.  Doing so does not promote optimization,
 26  and has other disadvantages, which are discussed in examples
 27  A-E below.  It is also contrary to the prior art in Fortran.

-- 
Joseph S. Myers
[EMAIL PROTECTED]


RE: DFA Scheduler - unable to pipeline loads

2007-09-03 Thread Ye, Joey
Matt,

I just started working on pipeline descriptions and I'm confused about one
thing in your description.

For "integer", your CPU has a 1-cycle latency, but with 3 unit stages
"issue,iu,wb". What does that mean? My understanding is that the number of
units separated by "," should be equal to the latency. Am I right?

Thanks - Joey

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt Lee
Sent: September 1, 2007 5:58
To: gcc@gcc.gnu.org
Subject: DFA Scheduler - unable to pipeline loads

Hi,

I am working with GCC-4.1.1 on a simple scalar RISC processor with a
5-stage pipeline, using the following description for loads and stores:

(define_insn_reservation "integer" 1
  (eq_attr "type" "branch,jump,call,arith,darith,icmp,nop")
  "issue,iu,wb")

(define_insn_reservation "load" 3
  (eq_attr "type" "load")
  "issue,iu,wb")

(define_insn_reservation "store" 1
  (eq_attr "type" "store")
  "issue,iu,wb")

I am seeing poor scheduling in Dhrystone where a memcpy call is
expanded inline.

memcpy (&dst, &src, 16) ==>

load  1, rA + 4
store 1, rB + 4
load  2, rA + 8
store 2, rB + 8
...

Basically, instead of pipelining the loads, the current schedule
stalls the processor for two cycles on every dependent store. Here is
a dump from the .35.sched1 file.

;;   ==
;;   -- basic block 0 from 6 to 36 -- before reload
;;   ==

;;    0-->  6   r84=r5             :issue,iu,wb
;;    1--> 13   r86=[`Ptr_Glob']   :issue,iu,wb
;;    2--> 25   r92=0x5            :issue,iu,wb
;;    3--> 12   r85=[r84]          :issue,iu,wb
;;    4--> 14   r87=[r86]          :issue,iu,wb
;;    7--> 15   [r85]=r87          :issue,iu,wb
;;    8--> 16   r88=[r86+0x4]      :issue,iu,wb
;;   11--> 17   [r85+0x4]=r88      :issue,iu,wb
;;   12--> 18   r89=[r86+0x8]      :issue,iu,wb
;;   15--> 19   [r85+0x8]=r89      :issue,iu,wb
;;   16--> 20   r90=[r86+0xc]      :issue,iu,wb
;;   19--> 21   [r85+0xc]=r90      :issue,iu,wb
;;   20--> 22   r91=[r86+0x10]     :issue,iu,wb
;;   23--> 23   [r85+0x10]=r91     :issue,iu,wb
;;   24--> 26   [r84+0xc]=r92      :issue,iu,wb
;;   25--> 31   clobber r3         :nothing
;;   25--> 36   use r3             :nothing
;;  Ready list (final):
;;   total time = 25
;;   new head = 7
;;   new tail = 36

There is an obviously better schedule to be obtained. Here is one such
(hand-modified) schedule, which just pipelines two of the loads to
obtain a shorter critical path length for the whole function (the function
has only bb 0).

;;    0-->  6   r84=r5             :issue,iu,wb
;;    1--> 13   r86=[`Ptr_Glob']   :issue,iu,wb
;;    2--> 25   r92=0x5            :issue,iu,wb
;;    3--> 12   r85=[r84]          :issue,iu,wb
;;    4--> 14   r87=[r86]          :issue,iu,wb
;;    7--> 15   [r85]=r87          :issue,iu,wb
;;    8--> 16   r88=[r86+0x4]      :issue,iu,wb
;;    9--> 18   r89=[r86+0x8]      :issue,iu,wb
;;   10--> 20   r90=[r86+0xc]      :issue,iu,wb
;;   11--> 17   [r85+0x4]=r88      :issue,iu,wb
;;   12--> 19   [r85+0x8]=r89      :issue,iu,wb
;;   13--> 21   [r85+0xc]=r90      :issue,iu,wb
;;   14--> 22   r91=[r86+0x10]     :issue,iu,wb
;;   17--> 23   [r85+0x10]=r91     :issue,iu,wb
;;   18--> 26   [r84+0xc]=r92      :issue,iu,mb_wb
;;   19--> 31   clobber r3         :nothing
;;   20--> 36   use r3             :nothing
;;  Ready list (final):
;;   total time = 20
;;   new head = 7
;;   new tail = 36

This schedule is 5 cycles faster.
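
The stall pattern in the copy part of both dumps can be reproduced with a
toy in-order model. This is only a sketch under stated assumptions: one
insn issues per cycle, a store waits for the load that produced its
register, loads have the latency 3 given in the "load" reservation, and
memory dependences between stores and later loads are ignored. The names
struct insn and last_issue_cycle are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum kind { LOAD, STORE };

struct insn {
    enum kind kind;
    int reg;             /* register written by a load or read by a store */
};

#define LOAD_LATENCY 3   /* latency from the "load" reservation */
#define MAX_REGS 128

/* Issue the insns in the given order, at most one per cycle.  Returns the
   cycle (counted from 0) on which the last insn issues.  */
static int last_issue_cycle(const struct insn *seq, size_t n)
{
    int ready[MAX_REGS] = {0};  /* cycle at which each register is available */
    int cycle = -1;
    for (size_t i = 0; i < n; i++) {
        assert(seq[i].reg >= 0 && seq[i].reg < MAX_REGS);
        cycle += 1;                             /* next issue slot */
        if (seq[i].kind == STORE && ready[seq[i].reg] > cycle)
            cycle = ready[seq[i].reg];          /* stall on the load result */
        if (seq[i].kind == LOAD)
            ready[seq[i].reg] = cycle + LOAD_LATENCY;
    }
    return cycle;
}
```

Fed insns 14 through 23 in the order of the first dump, the model issues
the last store on relative cycle 19 (cycle 23 in the dump, since insn 14
issues on cycle 4); in the hand-modified order it is relative cycle 13
(cycle 17 in the dump).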

I have read and re-read the material surrounding the DFA scheduler. I
understand that the heuristics optimize critical path length and not
stalls or other metrics. But in this case it is precisely the critical
path length that is shortened by the better schedule. I have been
examining various hooks available and for a while it seemed like
TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD must be set to a
larger window to look for better candidates to schedule into the ready
queue. For instance, this discussion seems to say so.
http://gcc.gnu.org/ml/gcc/2002-05/msg01132.html

But a post that follows soon after seems to imply otherwise.
http://gcc.gnu.org/ml/gcc/2002-05/msg01388.html

Both posts are from Vladimir. In any case the final conclusion seems
to be that the lookahead is useful only for multi-

Re: RFC: Hack to make restrict more useful

2007-09-03 Thread Richard Guenther
On 9/3/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> On 9/2/07, Paul Brook <[EMAIL PROTECTED]> wrote:
> > > > That said, second, my understanding of restrict, from reading the c99
> > > > standard, is that it is perfectly valid for restrict pointers to alias
> > > > each other during *loads*..  IE you can guarantee any restricted
> > > > pointer that is stored into can't alias the other restricted pointers.
> > > >  Those only used for loads can alias each other.
> > >
> > > How does it make a difference?  If for two pointers that are only
> > > loaded from we say they don't alias I cannot imagine a transformation
> > > that would get anything wrong.
>
> Easy answer: Dependence testing and then loop transforms.
>
> Given p[i] = a[i] + b[i], if you claim a and b can't alias, you will
> now claim that a[i] and b[i] don't access the same memory on the same
> iteration.
>
> We could easily use this and some profit estimation to decide to, say,
> change the iteration space of a but not b, which will break since they
> really do alias, and breaking this is bad because they are allowed to.

Eh?  Maybe I'm blind, but how can a change in iteration space that is
valid for the non-aliasing case be invalid for the aliasing case _if we
do not modify any data_?
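
A small sketch of what I mean, with invented function names. It assumes the
"change of iteration space" is a split of the loop so that a is read in
reverse order and b in forward order; p is the only object stored to.

```c
#include <assert.h>
#include <stddef.h>

/* p[i] = a[i] + b[i], reading a and b in the same order.  */
static void add_forward(int *restrict p, const int *a, const int *b, size_t n)
{
    assert(p != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        p[i] = a[i] + b[i];
}

/* Same computation with the reads of a moved into their own loop and
   reversed.  Since neither a[] nor b[] is written, the result cannot
   depend on whether a and b overlap.  */
static void add_split(int *restrict p, const int *a, const int *b, size_t n)
{
    assert(p != NULL && a != NULL && b != NULL);
    for (size_t i = n; i-- > 0; )    /* reads of a, in reverse order */
        p[i] = a[i];
    for (size_t i = 0; i < n; i++)   /* reads of b, in forward order */
        p[i] += b[i];
}
```

Both functions produce the same p even when called with a == b.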

Ok, Mark's example is the only one we could get wrong.  But we simply
cannot derive value ranges from restrict qualification.

Richard.