date:20070620

Hi,

As I learned from experience, gcc always assumes independence between memory 
references in the following program:

typedef struct {
    int m1;
    int m2;
} S1;

void foo(S1 *p1, S1 *p2) {
    ... = p1->m1;
    ... = p2->m2;
}

...even if -fno-strict-aliasing (an option disabling ANSI aliasing rules) is 
supplied.

I wonder, is there a way to force gcc not to assume independence in the example 
shown above?

Yours,
Andrey

RE: How to supress a specific kind of ansi-aliasing rules?

2007-06-20 Thread Dave Korn

On 20 June 2007 11:36, Bokhanko, Andrey S wrote:

> Hi,
> 
> As I learned from experience, gcc always assumes independence between memory
> references in the following program: 
> 
> typedef struct {
>     int m1;
>     int m2;
> } S1;
> 
> void foo(S1 *p1, S1 *p2) {
>     ... = p1->m1;
>     ... = p2->m2;
> }
> 
> ...even if -fno-strict-aliasing (an option disabling ANSI aliasing rules) is
> supplied. 
> 
> I wonder, is there a way to force gcc not to assume independence in the
> example shown above? 

  Just maybe if you give the struct definition __attribute__ ((packed)) gcc
will no longer assume that all struct S1s are naturally aligned and therefore
know that they might overlap?  (But it's also quite possible that it won't...
I haven't been through the sources to determine how much aliasing information
it infers from alignment.)

cheers,
  DaveK
-- 
Can't think of a witty .sigline today
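
A minimal sketch of what the suggestion above changes, assuming GCC's
__attribute__ ((packed)) and __alignof__ extensions. The names S1_packed,
S1_plain and the two helper functions are made up for illustration. It only
shows the effect on alignment; whether the alias analysis actually uses that
information is left open in the message above.

```c
#include <stddef.h>

/* The struct from the question, unchanged. */
typedef struct {
    int m1;
    int m2;
} S1_plain;

/* Hypothetical packed variant: the attribute drops the alignment
   requirement of the whole struct to one byte. */
typedef struct {
    int m1;
    int m2;
} __attribute__ ((packed)) S1_packed;

/* Alignment the compiler may assume for a pointer to each type. */
size_t plain_alignment(void)  { return __alignof__(S1_plain); }
size_t packed_alignment(void) { return __alignof__(S1_packed); }
```

With the packed variant two objects may start at any byte offset from each
other, so in principle a compiler can no longer rule out partial overlap from
alignment alone.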

Re: How to supress a specific kind of ansi-aliasing rules?

On 6/20/07 6:35 AM, Bokhanko, Andrey S wrote:

> As I learned from experience, gcc always assumes independence between memory 
> references in the following program:
> 
> typedef struct {
>     int m1;
>     int m2;
> } S1;
> 
> void foo(S1 *p1, S1 *p2) {
>     ... = p1->m1;
>     ... = p2->m2;
> }
> 
> ...even if -fno-strict-aliasing (an option disabling ANSI aliasing rules) is 
> supplied.

No, it doesn't.  Both p1->m1 and p2->m2 will use the same memory tag in
GIMPLE and the same alias set during RTL.  Notice how a store between
the two loads affects the second load:

  # VUSE SMT.4_7(D)
  x_2 = p1_1(D)->m1;

  # SMT.4_8 = VDEF SMT.4_7(D)
  p1_1(D)->m1 = 32;

  # VUSE SMT.4_8
  y_4 = p2_3(D)->m2;


Or did you mean that p1 and p2 should *not* interfere with each other?

Re: Severe increase in compilation time with 4.3.0 20070615 on powerpc-apple-darwin7

2007-06-20 Thread Dominique Dhumieres

Extracted from 
http://www.suse.de/~gcctest/c++bench/polyhedron/polyhedron-summary.txt:

date   compile execute

070608  106.29  628.74
070615  117.43  629.73
070620  105.95  616.99

So these tests show a ~10% increase of the compilation time
from 070612 to 070618.  I have forgotten to mention that my timings
were done on a 1.8GHz G5 with 512k of cache. My observation is that
when something goes wrong on the x86 family it is several times worse
on the PPC. This is particularly important when cache misses are involved.

So I'll wait for the next snapshot and report what I see.

Cheers

Dominique

RE: How to supress a specific kind of ansi-aliasing rules?

Diego Novillo wrote:

> No, it doesn't.  Both p1->m1 and p2->m2 will use the same memory tag in
> GIMPLE and the same alias set during RTL.  Notice how a store between
> the two loads affects the second load:
> 
>   # VUSE SMT.4_7(D)
>   x_2 = p1_1(D)->m1;
> 
>   # SMT.4_8 = VDEF SMT.4_7(D)
>   p1_1(D)->m1 = 32;
> 
>   # VUSE SMT.4_8
>   y_4 = p2_3(D)->m2;
> 
> 
> Or did you mean that p1 and p2 should *not* interfere with each other?

Hmmm... Actually, I compiled the following program:

typedef struct {
    int s1_m1;
    int s1_m2;
} S1;

void foo(S1 *p1, S1 *p2) {
    p2->s1_m1 = p1->s1_m2 * 11;
    /* If ansi-aliasing happens, this MUL should be removed. */
    p2->s1_m1 = p1->s1_m2 * 11;

    return;
}

with:

gcc4 -c -O2 -fno-strict-aliasing test.c

and the resulting assembly file had only one LOAD, one STORE and one
MUL. This optimization is only possible if the compiler is
able to prove independence between p2->s1_m1 and p1->s1_m2. I never
looked at gcc's internal dumps.

For the record: gcc 4.2.0 on IA64.

Yours,
Andrey

RE: How to supress a specific kind of ansi-aliasing rules?

Actually, I'm interested in how to force conservative analysis *without*
source code modifications (only with compiler's options).

Yours,
Andrey

Re: m68k bootstrap problem

Hi,

On Tue, 19 Jun 2007, Kenneth Zadeck wrote:

> The reason that there is no reg_dead note in the last use (insn 45)
> before the sib_call (insn 46) is that there is no def for r0 in the
> sibcall (insn 46) and r0 is live at the end of the block.
> 
> This of course changes the question from why there is no note to why
> there is no def.

Below is a possible solution I found, if there weren't that comment... :)

One problem here might be that exit_block_uses includes the return 
register, but that's not exactly true for abnormal exits, where the return 
value is not provided by the function itself. The patch below clears these 
registers if it gets to the exit via a sibcall edge, but there may be 
other cases as well.

Looking at the exit_block_uses usage I'm a little confused about this 
expression, which is used quite a bit:

(SIBLING_CALL_P (insn) && bitmap_bit_p (df->exit_block_uses, dregno)
 && !refers_to_regno_p (dregno, dregno+1, current_function_return_rtx, 
(rtx *)0)))

I don't quite understand the point of this, at least on m68k this is 
pretty much a no-op. exit_block_uses contains: 0 [%d0] 8 [%a0] 14 [%a6] 15 
[%sp], return_rtx contains %d0/%a0, so for %d0/%a0 it's always false and 
%a6/%sp don't appear in call defs, so I don't understand this special 
casing of sibcalls.

Another question I have is about DF_REF_MAY_CLOBBER: any function call
would also clobber the return value, and I see defs generated for calls, 
but they are only marked with DF_REF_MAY_CLOBBER and thus the use chain 
isn't broken by calls. Why is that? The header file doesn't go into any 
detail about what better information it needs.

bye, Roman

Index: gcc/df-problems.c
===
--- gcc/df-problems.c   (revision 125811)
+++ gcc/df-problems.c   (working copy)
@@ -1574,7 +1574,7 @@
   /* Call-clobbered registers die across exception and call edges.  */
   /* ??? Abnormal call edges ignored for the moment, as this gets
  confused by sibling call edges, which crashes reg-stack.  */
-  if (e->flags & EDGE_EH)
+  if ((e->flags & EDGE_EH) || (e->flags & EDGE_SIBCALL))
 bitmap_ior_and_compl_into (op1, op2, df_invalidated_by_call);
   else
 bitmap_ior_into (op1, op2);
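
To make the effect of the changed condition concrete, here is a toy model of
the two bitmap operations, assuming a machine with at most 32 registers so a
register set fits in one word. The function names, the flag values and the
32-bit representation are assumptions made for this sketch; they are not
GCC's bitmap implementation.

```c
#include <stdint.h>

/* Hypothetical edge flag values, for illustration only. */
enum { TOY_EDGE_EH = 1, TOY_EDGE_SIBCALL = 2 };

/* dst |= src */
static uint32_t toy_ior_into(uint32_t dst, uint32_t src)
{
    return dst | src;
}

/* dst |= (src & ~mask) */
static uint32_t toy_ior_and_compl_into(uint32_t dst, uint32_t src,
                                       uint32_t mask)
{
    return dst | (src & ~mask);
}

/* Merge the register set op2 into op1 across one edge.  With the patched
   condition, call-clobbered registers are filtered out on sibcall edges
   as well as on exception edges. */
uint32_t toy_merge_over_edge(uint32_t op1, uint32_t op2, int edge_flags,
                             uint32_t invalidated_by_call)
{
    if ((edge_flags & TOY_EDGE_EH) || (edge_flags & TOY_EDGE_SIBCALL))
        return toy_ior_and_compl_into(op1, op2, invalidated_by_call);
    return toy_ior_into(op1, op2);
}
```

In this model a register that is both in op2 and in the call-clobbered set
reaches op1 over an ordinary edge but not over a sibcall edge.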

[PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system


(cross-posting because discussion may be interesting to others)

The following is a patch that re-structures 
tree_ssa_useless_type_conversion_1 (without changing its semantics) to
make it easier to read and to move towards not requiring the 
types_compatible_p langhook.  Several places that have problems right
now are marked (and I have patches for some of them in the queue).

The idea is (suggested by DannyB) to implicitly define our middle-end
type-system by means of this function.  A separately posted patch
works towards this by carefully replacing all remaining calls to the
types_compatible_p langhook by proper calls to 
tree_ssa_useless_type_conversion_1.

There is one key invariant of tree_ssa_useless_type_conversion_1 that
we need to make sure holds.  tree_ssa_useless_type_conversion_1 shall 
be transitive, so that if tree_ssa_useless_type_conversion_1 (a, b) and
tree_ssa_useless_type_conversion_1 (b, c) then
tree_ssa_useless_type_conversion_1 (a, c) will also hold.

I think that forcing it to be commutative would not be useful and would
only cause more explicit conversions to pop up.


There are a few ??? in the patch below which I'll try to go through
one by one:


  /* Preserve changes in the types minimum or maximum value.
     ???  Due to the way we handle sizetype as signed we need 
     to jump through hoops here to make sizetype and size_type_node
     compatible.  */
  if (!tree_int_cst_equal (fold_convert (outer_type,
                                         TYPE_MIN_VALUE (inner_type)),
                           TYPE_MIN_VALUE (outer_type))
      || !tree_int_cst_equal (fold_convert (outer_type,
                                            TYPE_MAX_VALUE (inner_type)),
                              TYPE_MAX_VALUE (outer_type)))
    return false;

With pointer_plus we assert that the offset is compatible with sizetype.
But while in principle sizetype and size_type_node (where the problem
arises) should be compatible, they are not: because of TYPE_IS_SIZETYPE 
(sizetype) and sign-extending sizetypes, they differ in their 
TYPE_MAX_VALUE (sizetype is sign-extended, size_type_node is 
zero-extended).  I tried to get rid of TYPE_IS_SIZETYPE completely, but
there are some frontend issues that need to be worked out.


  /* ???  We might want to preserve base type changes because of
 TBAA.  Or we need to be extra careful below.  */

To get rid of some of the hacks in tree-ssa-copy.c:may_propagate_copy
we would need to make sure to not trivially convert (long *) to (int *)
if long and int have the same mode.  I'm not sure about this.


  /* If the outer type is (void *), then the conversion is not
     necessary.
     ???  This makes tree_ssa_useless_type_conversion_1 not
     transitive.  */
  if (TREE_CODE (TREE_TYPE (outer_type)) == VOID_TYPE)
    return true;

while this special case makes tons of sense, it conflicts with the
current implementation of the langhook.  This way A* = (void *)B*
is reduced to A* = B* which is not a trivial conversion (and thus
this violates transitivity).  My bet is that the issue goes away
once we stop calling the langhook from tree_ssa_useless_type_conversion_1.


  /* Otherwise pointers/references are equivalent if their pointed
     to types are effectively the same.  This allows to strip conversions
     between pointer types with different type qualifiers.
     ???  We should recurse here with
     tree_ssa_useless_type_conversion_1.  */
  return lang_hooks.types_compatible_p (TREE_TYPE (inner_type),
                                        TREE_TYPE (outer_type));

This is the place where the TBAA problem from above will pop up if
we recurse with tree_ssa_useless_type_conversion_1 here.  Possibly
the solution is to split tree_ssa_useless_type_conversion_1 and
handle pointers specially.


  /* Fall back to what the frontend thinks of type compatibility.
 ???  This should eventually just return false.  */
  return lang_hooks.types_compatible_p (inner_type, outer_type);

Before we can return false here we need to handle some more cases
in this function.  I have a patch that adds all the missing trivial stuff
but does not do structural equivalence checks - I'm not sure we really
need these.


Comments?

Thanks,
Richard.


2007-06-20  Richard Guenther  [EMAIL PROTECTED]

* tree-ssa.c (tree_ssa_useless_type_conversion_1): Document
future intent.  Re-structure to call langhook last, mark
questionable parts.

Index: tree-ssa.c
===
*** tree-ssa.c.orig 2007-06-20 13:50:27.0 +0200
--- tree-ssa.c  2007-06-20 13:51:03.0 +0200
*** delete_tree_ssa (void)
*** 888,894 
  
  
  /* Return true if the conversion from INNER_TYPE to OUTER_TYPE is a
!useless type conversion, otherwise return false.  */
  
  bool
  tree_ssa_useless_type_conversion_1

Re: Severe increase in compilation time with 4.3.0 20070615 on powerpc-apple-darwin7


On 6/20/07, Dominique Dhumieres [EMAIL PROTECTED] wrote:

> Extracted from 
> http://www.suse.de/~gcctest/c++bench/polyhedron/polyhedron-summary.txt:
> 
> date   compile execute
> 
> 070608  106.29  628.74
> 070615  117.43  629.73
> 070620  105.95  616.99
> 
> So these tests show a ~10% increase of the compilation time
> from 070612 to 070618.  I have forgotten to mention that my timings
> were done on a 1.8GHz G5 with 512k of cache. My observation is that
> when something goes wrong on the x86 family it is several times worse
> on the PPC. This is particularly important when cache misses are involved.
> 
> So I'll wait for the next snapshot and report what I see.


So this looks like the df-branch merge, where an increase in compile time was
expected.  Maybe you can identify the single largest increase for ppc?

Richard.

Re: m68k bootstrap problem

Roman Zippel wrote:
> Hi,
>
> On Tue, 19 Jun 2007, Kenneth Zadeck wrote:
>
> Another question I have is about DF_REF_MAY_CLOBBER: any function call
> would also clobber the return value, and I see defs generated for calls, 
> but they are only marked with DF_REF_MAY_CLOBBER and thus the use chain 
> isn't broken by calls. Why is that? The header file doesn't go into any 
> detail about what better information it needs.
>
   
For certain regs, the subroutine may or may not modify the value.  The
better information alluded to is information that one might get by doing
interprocedural analysis.  Without such information you have to assume
that the value may or may not survive.  The treatment of these vars is
thus conservative, but correct.

The only def that this is not true for in a call is the return value. 

I am going to let bonzini respond to the rest of this because i am not
familiar with enough of this to approve or disapprove it.

kenny


> bye, Roman
>
> Index: gcc/df-problems.c
> ===
> --- gcc/df-problems.c (revision 125811)
> +++ gcc/df-problems.c (working copy)
> @@ -1574,7 +1574,7 @@
>    /* Call-clobbered registers die across exception and call edges.  */
>    /* ??? Abnormal call edges ignored for the moment, as this gets
>       confused by sibling call edges, which crashes reg-stack.  */
> -  if (e->flags & EDGE_EH)
> +  if ((e->flags & EDGE_EH) || (e->flags & EDGE_SIBCALL))
>      bitmap_ior_and_compl_into (op1, op2, df_invalidated_by_call);
>    else
>      bitmap_ior_into (op1, op2);

Re: m68k bootstrap problem

Hi,

On Wed, 20 Jun 2007, Kenneth Zadeck wrote:

> For certain regs, the subroutine may or may not modify the value.  The
> better information alluded to is information that one might get by doing
> interprocedural analysis.  Without such information you have to assume
> that the value may or may not survive.  The treatment of these vars is
> thus conservative, but correct.

I don't understand, wouldn't the conservative approach be that the value 
simply doesn't survive?

bye, Roman

Re: Severe increase in compilation time with 4.3.0 20070615 on powerpc-apple-darwin7

2007-06-20 Thread Dominique Dhumieres

> Maybe you can identify the single largest increase for ppc?

The ranking is:

             compile

            06/15    06/08    %

channel     4.289    2.519   70
induct     36.878   23.671   56
protein    20.097   13.162   53
nf          5.412    3.629   49
fatigue    12.843    8.733   47
ac          6.098    4.169   46
gas_dyn    11.856    8.168   45
test_fpu   17.098   12.034   42
rnflow     19.812   14.370   38
doduc      37.700   28.040   34
capacita    7.332    5.542   32
aermod    266.403  205.639   30
tfft        2.530    1.938   31
mdbx        9.052    7.036   28
air        14.773   11.925   24
linpk       2.419    1.979   22

% = 100*(t0615/t0608-1)

So the worst increase is for channel (ironically this has always been
the strongest gfortran result!), but all tests show an increase above
20%.

Dominique
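
The percentage column follows directly from the formula quoted in the message,
% = 100*(t0615/t0608-1). A small sketch of that arithmetic, with a
hypothetical helper name (pct_increase) that does not come from the thread:

```c
/* Relative increase of t_new over t_old, in percent:
   100 * (t_new / t_old - 1). */
double pct_increase(double t_new, double t_old)
{
    return 100.0 * (t_new / t_old - 1.0);
}
```

For the channel row, pct_increase(4.289, 2.519) is a little above 70, which
matches the rounded value in the table.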

Re: How to supress a specific kind of ansi-aliasing rules?

On 6/20/07 7:52 AM, Bokhanko, Andrey S wrote:

> typedef struct {
>     int s1_m1;
>     int s1_m2;
> } S1;
> 
> void foo(S1 *p1, S1 *p2) {
>     p2->s1_m1 = p1->s1_m2 * 11;
>     /* If ansi-aliasing happens, this MUL should be removed. */
>     p2->s1_m1 = p1->s1_m2 * 11;
> 
>     return;
> }
> 
> with:
> 
> gcc4 -c -O2 -fno-strict-aliasing test.c
> 
> and the resulting assembly file had only one LOAD, one STORE and one
> MUL. This optimization is only possible if the compiler is
> able to prove independence between p2->s1_m1 and p1->s1_m2. I never
> looked at gcc's internal dumps.

Structural analysis lets you prove that stores to fields m1 and m2 may
never overlap.  They're always at different offsets, even if p1 and p2
point to the same area.
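
The offset argument can be sketched with plain address arithmetic. This is an
illustration only: the helper names and the "delta" parameter (the distance in
bytes between the two base pointers) are assumptions of the sketch, nothing is
dereferenced, and it is not how GCC implements the check.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int s1_m1;
    int s1_m2;
} S1;

/* True if the byte ranges [a, a+len_a) and [b, b+len_b) intersect. */
static bool ranges_overlap(size_t a, size_t len_a, size_t b, size_t len_b)
{
    return a < b + len_b && b < a + len_a;
}

/* Does the load from p1->s1_m2 overlap the store to p2->s1_m1 when p2's
   base address lies `delta` bytes above p1's base address?  Offsets are
   measured from p1's base. */
bool m2_of_p1_overlaps_m1_of_p2(size_t delta)
{
    size_t load  = offsetof(S1, s1_m2);         /* p1->s1_m2 */
    size_t store = delta + offsetof(S1, s1_m1); /* p2->s1_m1 */
    return ranges_overlap(load, sizeof(int), store, sizeof(int));
}
```

With delta equal to 0 (p1 and p2 point to the same object) or to a whole
multiple of sizeof(S1), the two accesses never overlap. They only coincide
when p2 is displaced by the offset of a field, which cannot happen for two
pointers to valid S1 objects.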

Re: m68k bootstrap problem

Roman Zippel wrote:
> Hi,
>
> On Wed, 20 Jun 2007, Kenneth Zadeck wrote:
>
>> For certain regs, the subroutine may or may not modify the value.  The
>> better information alluded to is information that one might get by doing
>> interprocedural analysis.  Without such information you have to assume
>> that the value may or may not survive.  The treatment of these vars is
>> thus conservative, but correct.
>
> I don't understand, wouldn't the conservative approach be that the value 
> simply doesn't survive?
>
> bye, Roman
   
No, the conservative assumption is that we do not know anything.  It could be
destroyed and it could not be destroyed. 

Better information would tell us definitively which of the
two possible outcomes holds: the value is destroyed by the call, or it is not
destroyed and passes through unscathed.

When we do forward analysis around a call, the may def does kill the
value, i.e. we cannot reliably use the value after the call.
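
The asymmetry described above can be sketched as two transfer functions over
register sets, one bit per register. This is a toy model under the assumption
of at most 32 registers; the function names are made up and this is not the
df code itself.

```c
#include <stdint.h>

/* Backward liveness across one instruction.  Only a definite def ends the
   liveness of the incoming value; a may-def does not, because the old
   value might survive and still be used afterwards. */
uint32_t toy_live_before(uint32_t live_after, uint32_t uses,
                         uint32_t must_def)
{
    return uses | (live_after & ~must_def);
}

/* Forward availability across one instruction.  A value known before the
   instruction can be relied on afterwards only if no def, definite or
   possible, touches its register. */
uint32_t toy_available_after(uint32_t avail_before, uint32_t must_def,
                             uint32_t may_def)
{
    return avail_before & ~(must_def | may_def);
}
```

For a call that may clobber register 0, toy_live_before keeps register 0 live
across the call, while toy_available_after drops it from the set of values
that can be reused after the call.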

Re: preventing -m options being passed to the compiler

2007-06-20 Thread Andrew Pinski


On 6/20/07, Ben Elliston [EMAIL PROTECTED] wrote:

> To now answer my own question (for the benefit of others): the CC1_SPEC
> string can include the sequence %<moption* to strip those options from
> the command line.


This is not a good way; the best way is to create a dummy -moption in
the target.opt file so you get the documentation with --help.

-- Pinski

Re: Severe increase in compilation time with 4.3.0 20070615 on powerpc-apple-darwin7


On 6/20/07, Dominique Dhumieres [EMAIL PROTECTED] wrote:

> > Maybe you can identify the single largest increase for ppc?
>
> The ranking is:
>
>              compile
>
>             06/15    06/08    %
>
> channel     4.289    2.519   70
> induct     36.878   23.671   56
> protein    20.097   13.162   53
> nf          5.412    3.629   49
> fatigue    12.843    8.733   47
> ac          6.098    4.169   46
> gas_dyn    11.856    8.168   45
> test_fpu   17.098   12.034   42
> rnflow     19.812   14.370   38
> doduc      37.700   28.040   34
> capacita    7.332    5.542   32
> aermod    266.403  205.639   30
> tfft        2.530    1.938   31
> mdbx        9.052    7.036   28
> air        14.773   11.925   24
> linpk       2.419    1.979   22
>
> % = 100*(t0615/t0608-1)
>
> So the worst increase is for channel (ironically this has always been
> the strongest gfortran result!), but all tests show an increase above
> 20%.


Btw, is this a compiler with checking enabled?  If so try comparing
numbers with --enable-checking=release.

Richard.

RE: How to supress a specific kind of ansi-aliasing rules?


> Structural analysis lets you prove that stores to fields m1 and m2 may
> never overlap.  They're always at different offsets, even if p1 and p2
> point to the same area.

Yes, but one can write something like this:

p2 = (S1 *)&p1->s1_m2;

Of course, this is a blatant violation of the ANSI C standard, etc. Still, it is
perfectly acceptable C code.

With violations of other ANSI aliasing rules, one has an exit: the
-fno-strict-aliasing option. Not so in this case.

Or is it?

Yours,
Andrey

Re: How to supress a specific kind of ansi-aliasing rules?

On 6/20/07 9:09 AM, Bokhanko, Andrey S wrote:

> Yes, but one can write something like this:
> 
> p2 = (S1 *)&p1->s1_m2;
> 
> Of course, this is a blatant violation of the ANSI C standard, etc. Still, it is
> perfectly acceptable C code.

No, it isn't.  GCC only tries to DTRT on standard compliant code.

> With violations of other ANSI aliasing rules, one has an exit: the
> -fno-strict-aliasing option. Not so in this case.

We don't have a switch to disable CSE at the RTL level (which is the
pass doing this structural analysis).  Volatile is just about the only
thing that will help you here.

Having said that, maybe we could consider having CSE not do this with
-fno-strict-aliasing, but I'm not sure if it's a good idea.  What do
others think?
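
As an illustration of the volatile remark above, here is a variant of the test
function with volatile-qualified pointers. The name foo_volatile is made up
for this sketch, and it does require a source change, which the original
poster wanted to avoid.

```c
typedef struct {
    int s1_m1;
    int s1_m2;
} S1;

/* Every access through a volatile-qualified lvalue has to be performed as
   written, so the compiler must emit both loads and both stores instead
   of merging the second statement with the first. */
void foo_volatile(volatile S1 *p1, volatile S1 *p2)
{
    p2->s1_m1 = p1->s1_m2 * 11;
    p2->s1_m1 = p1->s1_m2 * 11;
}
```

Called with p1 and p2 pointing to the same object whose s1_m2 is 3, the
function leaves 33 in s1_m1; the point is the number of memory accesses in the
generated code, not the result.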

Re: How to supress a specific kind of ansi-aliasing rules?

2007-06-20 Thread Daniel Jacobowitz

On Wed, Jun 20, 2007 at 09:26:39AM -0400, Diego Novillo wrote:
> Having said that, maybe we could consider having CSE not do this with
> -fno-strict-aliasing, but I'm not sure if it's a good idea.  What do
> others think?

I haven't seen a useful reason in this thread why you would want to do
so; and I don't think it has anything to do with aliasing, so it
shouldn't be grouped there.

-- 
Daniel Jacobowitz
CodeSourcery

Re: m68k bootstrap problem

Hi,

On Wed, 20 Jun 2007, Kenneth Zadeck wrote:

> > I don't understand, wouldn't the conservative approach be that the value 
> > simply doesn't survive?
> 
> No, the conservative assumption is that we do not know anything.  It could be
> destroyed and it could not be destroyed. 

What is the value of this? If we don't know anything, we can't use the 
value anymore, since it may be destroyed, so effectively we have to 
assume the value is destroyed.

> Better information would tell us definitively which of the
> two possible outcomes holds: the value is destroyed by the call, or it is not
> destroyed and passes through unscathed.

No argument here, but currently this information is not available, so 
parts of the compiler assume the registers don't survive.

If the register were correctly marked as clobbered I wouldn't have the 
current problem, e.g. reload needs a definitive answer, whether the 
register survives a call, so it uses call_used_reg_set for that, which 
conflicts with the current vague life information.

bye, Roman

Re: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

2007-06-20 Thread Michael Matz

Hi,

On Wed, 20 Jun 2007, Richard Guenther wrote:

>   /* If the outer type is (void *), then the conversion is not
>      necessary.
>      ???  This makes tree_ssa_useless_type_conversion_1 not
>      transitive.  */

It is not this line itself that makes it non-transitive, but the fact that it 
still relies on the frontend's langhooks.  Document that fact so it's 
clear that when the final goal is implemented (langhook removed) this 
doesn't violate transitivity.

>   if (TREE_CODE (TREE_TYPE (outer_type)) == VOID_TYPE)
>     return true;
> 
>   /* Return true if the conversion from INNER_TYPE to OUTER_TYPE is a
> !    useless type conversion, otherwise return false.
> !    This function implicitly defines the middle-end type system.  The
> !    following invariants shall be fulfilled:
> ! 
> !      1) tree_ssa_useless_type_conversion_1 is transitive.  If
> !         a < b and b < c then a < c.
> ! 
> !      2) tree_ssa_useless_type_conversion_1 is not commutative.
> !         From a < b does not follow a > b.
> ! 
> !      3) Conversions are useless only if with the resulting type

Something is missing in this sentence.  Perhaps if _values_ with the 
resulting type ...?  The whole sentence reads strange, though.  
Operations are not applied to types but to values.  Types actually _are_ 
exactly the set of operations applicable to values.

Perhaps reformulate the whole thing to something like:

3) Types define the available set of operations applicable to values.  A 
   type conversion is useless if the operations for the target type are a 
   subset of the operations for the source type.  For example casts to 
   void* are useless, casts from void* are not (void* can't be 
   dereferenced or offset, but copied, hence its set of operations is 
   a strict subset of that of all other data pointer types).  Casts to
   const T* are useless (can't be written to), casts from const T* to T* 
   are not.


Ciao,
Michael.
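
The subset formulation proposed above can be modelled in a few lines. This is
a toy sketch, not GCC code: the three types, the operation bits and all names
starting with toy_ or OP_ are assumptions chosen to mirror the void*, const T*
and T* examples from the message.

```c
#include <stdbool.h>

/* Hypothetical operation sets, one bit per operation. */
enum { OP_COPY = 1, OP_READ = 2, OP_WRITE = 4, OP_OFFSET = 8 };

enum toy_type { TOY_VOID_PTR, TOY_CONST_T_PTR, TOY_T_PTR, TOY_NTYPES };

static const unsigned toy_ops[TOY_NTYPES] = {
    [TOY_VOID_PTR]    = OP_COPY,
    [TOY_CONST_T_PTR] = OP_COPY | OP_READ | OP_OFFSET,
    [TOY_T_PTR]       = OP_COPY | OP_READ | OP_WRITE | OP_OFFSET,
};

/* A conversion from `inner` to `outer` is useless when every operation
   available on `outer` is also available on `inner`. */
bool toy_useless_conversion(enum toy_type inner, enum toy_type outer)
{
    return (toy_ops[outer] & ~toy_ops[inner]) == 0;
}

/* Exhaustively check: useless (a, b) and useless (b, c) imply
   useless (a, c). */
bool toy_is_transitive(void)
{
    for (int a = 0; a < TOY_NTYPES; a++)
        for (int b = 0; b < TOY_NTYPES; b++)
            for (int c = 0; c < TOY_NTYPES; c++)
                if (toy_useless_conversion(a, b)
                    && toy_useless_conversion(b, c)
                    && !toy_useless_conversion(a, c))
                    return false;
    return true;
}
```

Because the relation is defined as set inclusion, it is transitive and
reflexive by construction but not symmetric: the conversion from T* to void*
is useless in this model, the conversion from void* to T* is not.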

Re: How to supress a specific kind of ansi-aliasing rules?

2007-06-20 Thread Ian Lance Taylor

Diego Novillo [EMAIL PROTECTED] writes:

> On 6/20/07 9:09 AM, Bokhanko, Andrey S wrote:
> 
> > Yes, but one can write something like this:
> > 
> > p2 = (S1 *)&p1->s1_m2;
> > 
> > Of course, this is a blatant violation of the ANSI C standard, etc. Still,
> > it is perfectly acceptable C code.
> 
> No, it isn't.  GCC only tries to DTRT on standard compliant code.
> 
> > With violations of other ANSI aliasing rules, one has an exit: the
> > -fno-strict-aliasing option. Not so in this case.
> 
> We don't have a switch to disable CSE at the RTL level (which is the
> pass doing this structural analysis).  Volatile is just about the only
> thing that will help you here.
> 
> Having said that, maybe we could consider having CSE not do this with
> -fno-strict-aliasing, but I'm not sure if it's a good idea.  What do
> others think?

I think that would be a bad idea.

As far as I can see, if we want to support code like the above, we
would have to add an option to not reorder any memory loads or stores
through different pointers.  It shouldn't be part of
-fno-strict-aliasing.

Ian

Re: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

On Wed, 20 Jun 2007, Michael Matz wrote:

> Hi,
> 
> On Wed, 20 Jun 2007, Richard Guenther wrote:
> 
> >   /* If the outer type is (void *), then the conversion is not
> >      necessary.
> >      ???  This makes tree_ssa_useless_type_conversion_1 not
> >      transitive.  */
> 
> It is not this line itself that makes it non-transitive, but the fact that it 
> still relies on the frontend's langhooks.  Document that fact so it's 
> clear that when the final goal is implemented (langhook removed) this 
> doesn't violate transitivity.

Done.
 
> >   if (TREE_CODE (TREE_TYPE (outer_type)) == VOID_TYPE)
> >     return true;
> > 
> >   /* Return true if the conversion from INNER_TYPE to OUTER_TYPE is a
> > !    useless type conversion, otherwise return false.
> > !    This function implicitly defines the middle-end type system.  The
> > !    following invariants shall be fulfilled:
> > ! 
> > !      1) tree_ssa_useless_type_conversion_1 is transitive.  If
> > !         a < b and b < c then a < c.
> > ! 
> > !      2) tree_ssa_useless_type_conversion_1 is not commutative.
> > !         From a < b does not follow a > b.
> > ! 
> > !      3) Conversions are useless only if with the resulting type
> 
> Something is missing in this sentence.  Perhaps if _values_ with the 
> resulting type ...?  The whole sentence reads strange, though.  
> Operations are not applied to types but to values.  Types actually _are_ 
> exactly the set of operations applicable to values.
> 
> Perhaps reformulate the whole thing to something like:
> 
> 3) Types define the available set of operations applicable to values.  A 
>    type conversion is useless if the operations for the target type are a 
>    subset of the operations for the source type.  For example casts to 
>    void* are useless, casts from void* are not (void* can't be 
>    dereferenced or offset, but copied, hence its set of operations is 
>    a strict subset of that of all other data pointer types).  Casts to
>    const T* are useless (can't be written to), casts from const T* to T* 
>    are not.

I've copied your rewrite.

Thanks,
Richard.

Re: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

2007-06-20 Thread Andrew Pinski


On 6/20/07, Michael Matz [EMAIL PROTECTED] wrote:

> Hi,
>
> On Wed, 20 Jun 2007, Richard Guenther wrote:
>
> >   /* If the outer type is (void *), then the conversion is not
> >      necessary.
> >      ???  This makes tree_ssa_useless_type_conversion_1 not
> >      transitive.  */
>
> It is not this line itself that makes it non-transitive, but the fact that it
> still relies on the frontend's langhooks.  Document that fact so it's
> clear that when the final goal is implemented (langhook removed) this
> doesn't violate transitivity.


Huh?  Yes it does violate transitivity.  int *a; void *b;
b = a; vs a = (int*)b;  (this is IR form I am talking about).

-- Pinski

Re: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

2007-06-20 Thread Michael Matz

Hi,

On Wed, 20 Jun 2007, Andrew Pinski wrote:

> On 6/20/07, Michael Matz [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > On Wed, 20 Jun 2007, Richard Guenther wrote:
> >
> > >   /* If the outer type is (void *), then the conversion is not
> > >      necessary.
> > >      ???  This makes tree_ssa_useless_type_conversion_1 not
> > >      transitive.  */
> >
> > It is not this line itself that makes it non-transitive, but the fact that it
> > still relies on the frontend's langhooks.  Document that fact so it's
> > clear that when the final goal is implemented (langhook removed) this
> > doesn't violate transitivity.
> 
> Huh?  Yes it does violate transitivity.  int *a; void *b;
> b = a; vs a = (int*)b;  (this is IR form I am talking about).

That example shows that uselessness is not symmetric, it doesn't seem to 
talk about transitivity, but we already knew that this relation isn't 
symmetric.  And no, regarding conversions _to_ void* as useless in itself 
doesn't destroy transitivity anywhere.  Except that currently the whole 
thing still uses the langhook, which sometimes regards conversion _from_ 
void* as useless, which obviously is broken.


Ciao,
Michael.

Re: How to supress a specific kind of ansi-aliasing rules?

2007-06-20 Thread Joseph S. Myers

On Wed, 20 Jun 2007, Daniel Jacobowitz wrote:

> On Wed, Jun 20, 2007 at 09:26:39AM -0400, Diego Novillo wrote:
> > Having said that, maybe we could consider having CSE not do this with
> > -fno-strict-aliasing, but I'm not sure if it's a good idea.  What do
> > others think?
> 
> I haven't seen a useful reason in this thread why you would want to do
> so; and I don't think it has anything to do with aliasing, so it
> shouldn't be grouped there.

We don't document what the -fstrict-aliasing dialect is.  I think the 
current version is

- like standard C or C++ but without the type-based alias rules

and the version desired by the original poster is

- each toplevel object (declared or allocated) is a block of bytes, and 
arbitrary pointer arithmetic can be used within that block and any part of 
it accessed with any type, as long as pointers remain properly aligned and 
you don't go outside the original object.

If you eliminate the type-based alias rules, the only thing stopping the 
rules from being the second version is probably that most conversions 
between pointer types don't have specified semantics.  Both versions may 
well be useful.

-- 
Joseph S. Myers
[EMAIL PROTECTED]

Type system functions to their own file?

2007-06-20 Thread Giovanni Bajo


Hi Richard,

what about moving all the type-system related functions to a new file, 
eg: tree-ssa-type.c? I think that makes the intent even clearer.

--
Giovanni Bajo

Re: Type system functions to their own file?

On Wed, 20 Jun 2007, Giovanni Bajo wrote:

> Hi Richard,
> 
> what about moving all the type-system related functions to a new file, eg:
> tree-ssa-type.c? I think that makes the intent even clearer.

While this is surely a good idea I'd like to do this separately to not
make diffs unnecessarily larger.  Also it is not yet clear where to
draw the line ;)

Richard.

RE: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

2007-06-20 Thread Dave Korn

On 20 June 2007 15:25, Andrew Pinski wrote:

> On 6/20/07, Michael Matz [EMAIL PROTECTED] wrote:
>> Hi,
>> 
>> On Wed, 20 Jun 2007, Richard Guenther wrote:
>> 
>>>   /* If the outer type is (void *), then the conversion is not
>>>      necessary. ???  This makes tree_ssa_useless_type_conversion_1 not
>>>      transitive.  */
>> 
>> It is not this line itself that makes it non-transitive, but the fact that it
>> still relies on the frontend's langhooks.  Document that fact so it's
>> clear that when the final goal is implemented (langhook removed) this
>> doesn't violate transitivity.
> 
> Huh?  Yes it does violate transitivity.  int *a; void *b;
> b = a; vs a = (int*)b;  (this is IR form I am talking about).
> 
> -- Pinski


  That's commutativity.  Transitivity would mean that if you can assign a to b 
without a conversion, and you can assign b to c without a conversion, then you 
can assign a to c without a conversion.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today

Re: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

2007-06-20 Thread Eric Botcazou

   /* Return true if the conversion from INNER_TYPE to OUTER_TYPE is a
 !useless type conversion, otherwise return false.
 !This function implicitly defines the middle-end type system.  The
 !following invariants shall be fulfilled:
 !
 !  1) tree_ssa_useless_type_conversion_1 is transitive.  If
 ! a  b and b  c then a  c.
 !
 !  2) tree_ssa_useless_type_conversion_1 is not communtative.
 ! From a  b does not follow a  b.
 !
 !  3) Conversions are useless only if with the resulting type
 ! can be used in a subset of the operations the original type
 ! can be applied to.  For example casts to void* are useless,
 ! casts from void* not.  Casts to const T* are useless, casts
 ! from const T* to T* not.

My understanding is that it's a relation, not an operation, so the proper term 
for 2 would be symmetric.  For the sake of completeness, you could also add 
that it's reflexive: a < a holds.

Which would suggest finding a better symbol than '<' for it. :-)
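
For readers who want to check the three properties mechanically, here is a toy
model in C.  The three-type universe and the toy_useless_conversion predicate
are invented for illustration; they are not GCC's
tree_ssa_useless_type_conversion_1 and only encode the examples from the
comment above (casts to void* and from T* to const T* are useless, the reverse
directions are not).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-type universe, for illustration only.  */
enum toy_type { TOY_INT_PTR, TOY_CONST_INT_PTR, TOY_VOID_PTR, TOY_NUM_TYPES };

/* Return true if converting INNER to OUTER is "useless" in the toy model.  */
static bool
toy_useless_conversion (enum toy_type inner, enum toy_type outer)
{
  assert (inner < TOY_NUM_TYPES && outer < TOY_NUM_TYPES);
  if (inner == outer)
    return true;                 /* identity: makes the relation reflexive */
  if (outer == TOY_VOID_PTR)
    return true;                 /* any pointer -> void* */
  /* T* -> const T* is useless; the reverse is not.  */
  return inner == TOY_INT_PTR && outer == TOY_CONST_INT_PTR;
}

/* a R a for every type.  */
static bool
toy_is_reflexive (void)
{
  for (int a = 0; a < TOY_NUM_TYPES; a++)
    if (!toy_useless_conversion (a, a))
      return false;
  return true;
}

/* a R b and b R c imply a R c.  */
static bool
toy_is_transitive (void)
{
  for (int a = 0; a < TOY_NUM_TYPES; a++)
    for (int b = 0; b < TOY_NUM_TYPES; b++)
      for (int c = 0; c < TOY_NUM_TYPES; c++)
        if (toy_useless_conversion (a, b) && toy_useless_conversion (b, c)
            && !toy_useless_conversion (a, c))
          return false;
  return true;
}

/* a R b implies b R a.  */
static bool
toy_is_symmetric (void)
{
  for (int a = 0; a < TOY_NUM_TYPES; a++)
    for (int b = 0; b < TOY_NUM_TYPES; b++)
      if (toy_useless_conversion (a, b) && !toy_useless_conversion (b, a))
        return false;
  return true;
}
```

In this model the relation is reflexive and transitive but not symmetric:
int* to void* is useless, while void* to int* is not.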

-- 
Eric Botcazou

Re: [PATCH][RFC] Re-structure tree_ssa_useless_type_conversion_1 to work towards a middle-end type system

2007-06-20 Thread Michael Matz

Hi,

On Wed, 20 Jun 2007, Eric Botcazou wrote:

  !  2) tree_ssa_useless_type_conversion_1 is not commutative.
  !   From a < b does not follow a > b.
  !
  !  3) Conversions are useless only if the resulting type
  !   can be used in a subset of the operations the original type
  !   can be applied to.  For example casts to void* are useless,
  !   casts from void* not.  Casts to const T* are useless, casts
  !   from const T* to T* not.
 
 My understanding is that it's a relation, not an operation, so the 
 proper term for 2 would be symmetric.

Correct.

 For the sake of completeness, 
 you could also add that it's reflexive: a < a holds.
 
 Which would suggest finding a better symbol than '<' for it. :-)

a # p ?  ;-)


Ciao,
Michael.

Re: How to supress a specific kind of ansi-aliasing rules?

2007-06-20 Thread Mike Stump


On Jun 20, 2007, at 4:57 AM, Bokhanko, Andrey S wrote:
 Actually, I'm interested in how to force conservative analysis *without*
 source code modifications (only with compiler's options).

While we'd recommend using a language called C, you might be able to
use -O0 or older compilers (3.3 and older I think) with the -fvolatile
flag.

Re: m68k bootstrap problem

Roman Zippel wrote:
 Hi,

 On Wed, 20 Jun 2007, Kenneth Zadeck wrote:

   
 I don't understand, wouldn't the conservative approach be that the value 
 simply doesn't survive?

   
 No, the conservative is that we do not know anything.  it could be
 destroyed and it could not be destroyed. 
 

 What is the value of this? If we don't know anything, we can't use the 
 value anymore, since it may be destroyed, so effectively we have to 
 assume the value is destroyed.

   
If we add the dead note there we are asserting that the value is
modified by the caller. however it might not be and someone could write
a piece of asm right after the call to use that reg if the person knew
that the reg was not modified by that particular call.

having the dead note there is asserting to the register allocator that
they are free to use that reg after the call in any way that it wants
and there is a (small) possibility that is wrong.

Re: GCC 4.3.0 Status Report (2007-06-07)

2007-06-20 Thread Michael Meissner

On Tue, Jun 19, 2007 at 08:47:36PM -0700, Mark Mitchell wrote:
 I think we want to avoid making the same mistake we did last time:
 mixing these changes up with LTO.  They will help LTO (by reducing
 memory use), but they're logically independent.  So, if we're not
 comfortable putting the changes on the mainline, they should go on some
 new branch.

See below.

 I agree that introducing the abstractions first, and then switching the
 implementation afterwards, is a good idea.  That's what Sandra did for
 CALL_EXPRs and it worked well.  However:
 
  For the third case, it is fairly simple to switch the code to use
  num_parm_types and nth_parm_types.  This will mean a slight degradation in
  the code that handles arguments (for handling argument n, you need to do
  n-1 pointer chases).
 
 I don't think we can do this on mainline.  That's introducing
 quadraticity, and someone will have a 100-argument function, and then
 we'll be sad.  So, I think we need to do something different.  One
 possibility is:
 
    FOR_EACH_ARG_TYPE(fn_type, arg_type)
      {
      }

Yep, though I suspect in practice that the back ends are not performance
critical code.  I think we can replace most of the front/back end usages with
an iterator function.  The front ends need the ability to create/modify the
arguments, while the back ends only need to get the next argument.
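
To illustrate why an iterator avoids the quadratic cost mentioned above, here
is a small C sketch.  It assumes the argument types live on a singly linked
chain; struct toy_arg, toy_nth_arg and TOY_FOR_EACH_ARG are hypothetical names
for this example and are not GCC's tree API.  A counter records how many
pointer chases each style of traversal performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in a chain of argument types.  */
struct toy_arg
{
  int type_id;
  struct toy_arg *next;
};

/* Number of ->next dereferences performed so far.  */
static unsigned long toy_pointer_chases;

/* Link NODES[0..COUNT-1] into a chain and give each a type id.  */
static void
toy_link_chain (struct toy_arg *nodes, size_t count)
{
  for (size_t i = 0; i < count; i++)
    {
      nodes[i].type_id = (int) i;
      nodes[i].next = (i + 1 < count) ? &nodes[i + 1] : NULL;
    }
}

/* Return the N-th entry (0-based).  Costs N pointer chases per call.  */
static struct toy_arg *
toy_nth_arg (struct toy_arg *head, size_t n)
{
  struct toy_arg *p = head;
  while (n-- > 0)
    {
      assert (p != NULL);
      p = p->next;
      toy_pointer_chases++;
    }
  return p;
}

/* Iterator macro: one pointer chase per entry visited.  */
#define TOY_FOR_EACH_ARG(head, arg) \
  for ((arg) = (head); (arg) != NULL; \
       (arg) = (arg)->next, toy_pointer_chases++)

/* Visit every entry by index: quadratic in the number of entries.  */
static long
toy_sum_by_index (struct toy_arg *head, size_t count)
{
  long sum = 0;
  for (size_t i = 0; i < count; i++)
    sum += toy_nth_arg (head, i)->type_id;
  return sum;
}

/* Visit every entry with the iterator: linear.  */
static long
toy_sum_by_iterator (struct toy_arg *head)
{
  long sum = 0;
  struct toy_arg *arg;
  TOY_FOR_EACH_ARG (head, arg)
    sum += arg->type_id;
  return sum;
}
```

For a 100-entry chain, the indexed loop performs 0 + 1 + ... + 99 = 4950
pointer chases, while the iterator performs 100.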

I think a gradual approach is the right way.  I think this can be done in the
stage 2 time frame, but it could be pushed to gcc 4.4 (but we will have the
same problem as we have now).  The way I see it, the steps would be:

1) Add the basic infrastructure, iterator macros, stdarg_p, prototype_p,
   etc. to the tree.

2) Change the back ends, 1 by 1 to use the new infrastructure support.

3) Change the front ends, 1 by 1 to use the new infrastructure support.

4) Remove/rename TYPE_ARG_TYPES, and fix any random breakage.

5) Switch the infrastructure underneath to use vectors.

Until #4, you are only changing one thing at a time, and can easily verify that
the change works.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
[EMAIL PROTECTED]

Re: I'm sorry, but this is unacceptable (union members and ctors)

michael.a wrote:

So, I really appreciate all of your patience in helping to get me through
the build process. I guess I'll post something about how the hacking
effort / reprogramming experiments work out. In the meantime I hope this
discussion (and the relevance of a proper extension) still has some legs.

I had to knockout this error too:

member with constructor not allowed in anonymous aggregate

GCC is way too restrictive in these regards.

I've started with building the simplest shared library I have on hand...
there were a few gcc weird behaviors and quirks (you can't 'friend'
typedefs???) ...but everything was eventually ironed out.

Now I'm stuck in the linking phase though...

I set up another thread on gcc-help, where I'm having trouble understanding
the logic of the crt.o files (where they come from, how they should be used,
etc.), but no one has paid any mind to it so far:

http://www.nabble.com/building-installing-GCC-and-crtx.o%27s-confusion-tf3949291.html#a11205247

I would appreciate any assistance possible. I'm really banging my head
against the wall at this point.

I should probably just find that Debian patch and install into the system
directories, but I still don't understand if there are any factors outside
of gcc necessary for a successful build (could glibc be related to the crt.o
files -- and are the crt.o files tied to the system, or do they just link to
shared libraries)

sincerely,

michael

PS: please don't just leave me hanging at this point -michael
--
View this message in context:
http://www.nabble.com/I%27m-sorry%2C-but-this-is-unacceptable-%28union-members-and-ctors%29-tf3930964.html#a11216839
Sent from the gcc - Dev mailing list archive at Nabble.com.

Re: m68k bootstrap problem

Hi,

On Wed, 20 Jun 2007, Kenneth Zadeck wrote:

 If we add the dead note there we are asserting that the value is
 modified by the caller. however it might not be and someone could write
 a piece of asm right after the call to use that reg if the person knew
 that the reg was not modified by that particular call.

I have big problems to see this as a valid example, this sounds just 
broken. First off the user had to know the register was alive before and 
then the user had to magically know the register isn't clobbered by the 
call (e.g. loading the address of the function into that register).
You could declare the variable as asm register variable, then it might 
work, but then the register wouldn't be available for allocation anyway 
and the whole problem changes.

 having the dead note there is asserting to the register allocator that
 they are free to use that reg after the call in any way that it wants
 and there is a (small) possibility that is wrong.

IMO there is nothing wrong with this.

bye, Roman

Re: m68k bootstrap problem

2007-06-20 Thread Paolo Bonzini




having the dead note there is asserting to the register allocator that
they are free to use that reg after the call in any way that it wants
and there is a (small) possibility that is wrong.


IMO there is nothing wrong with this.


I agree with Roman.  You can always put your call into an asm and make 
it clobber all other caller-save registers (if EH is not a problem etc. 
etc.).


Paolo

Re: m68k bootstrap problem

Roman Zippel wrote:
 Hi,

 On Wed, 20 Jun 2007, Kenneth Zadeck wrote:

   
 If we add the dead note there we are asserting that the value is
 modified by the caller. however it might not be and someone could write
 a piece of asm right after the call to use that reg if the person knew
 that the reg was not modified by that particular call.
 

 I have big problems to see this as a valid example, this sounds just 
 broken. First off the user had to know the register was alive before and 
 then the user had to magically know the register isn't clobbered by the 
 call (e.g. loading the address of the function into that register).
 You could declare the variable as asm register variable, then it might 
 work, but then the register wouldn't be available for allocation anyway 
 and the whole problem changes.

   
 having the dead note there is asserting to the register allocator that
 they are free to use that reg after the call in any way that it wants
 and there is a (small) possibility that is wrong.
 

 IMO there is nothing wrong with this.

 bye, Roman
   
The whole reason for bugzilla is to remind compiler developers that rare
cases cannot be ignored.

kenny

Re: m68k bootstrap problem

Paolo Bonzini wrote:

 having the dead note there is asserting to the register allocator that
 they are free to use that reg after the call in any way that it wants
 and there is a (small) possibility that is wrong.

 IMO there is nothing wrong with this.

 I agree with Roman.  You can always put your call into an asm and make
 it clobber all other caller-save registers (if EH is not a problem
 etc. etc.).

 Paolo
This is one of the places where i slavishly copied what flow did.  if
you want to change this, go test it on at least 7 platforms and fix all
of the problems that it causes.

Kenny

Re: m68k bootstrap problem

2007-06-20 Thread Paolo Bonzini




This is one of the places where i slavishly copied what flow did.  if
you want to change this, go test it on at least 7 platforms and fix all
of the problems that it causes.


I see. Can one of you recap how it relates to the m68k problem, though? :-)

Paolo

Re: GCC 4.3.0 Status Report (2007-06-07)

2007-06-20 Thread Mark Mitchell

Michael Meissner wrote:

 I think a gradual approach is the right way.  I think this can be done in the
 stage 2 time frame, but it could be pushed to gcc 4.4 (but we will have the
 same problem as we have now).  The way I see it, the steps would be:
 
 1) Add the basic infrastructure, iterator macros, stdarg_p, prototype_p,
etc. to the tree.
 
 2) Change the back ends, 1 by 1 to use the new infrastructure support.
 
 3) Change the front ends, 1 by 1 to use the new infrastructure support.
 
 4) Remove/rename TYPE_ARG_TYPES, and fix any random breakage.
 
 5) Switch the infrastructure underneath to use vectors.
 
 Until #4, you are only changing one thing at a time, and can easily verify
 that the change works.

Yes, I think that this is a good plan.  We can evaluate whether #4 has
to happen on a branch and wait for 4.4 when we get to it.  But, 1, 2, 3
I think are non-controversial, and can go into mainline in real time,
which is nice.

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713

Re: RFC: Make dllimport/dllexport imply default visibility

2007-06-20 Thread Chris Lattner



On Jun 19, 2007, at 7:49 AM, Richard Earnshaw wrote:


On Mon, 2007-06-18 at 10:04 -0700, Mark Mitchell wrote:

I suspect that the realview compiler accepts
this as an oversight or a bug, not as an intentional feature.


Let's ask.

Richard E., is the fact that RealView 3.0SP1 accepts:

  class __declspec(notshared) S {
    __declspec(dllimport) void f();
  };

a bug or a feature?  If this is considered a bug, is it something that
RealView is likely to change in a future release, or will it be
preserved for the foreseeable future for backwards compatibility?


This is well beyond my sphere of expertise, so I've asked one of the
original developers of the spec.  He asserts that the above is supported
and intentional.  Hopefully I've correctly represented his position
below.

His key point is that 'notshared' on a class is not the same as making
the whole class hidden: only the class impedimenta (vtables, rtti) is
hidden, but the rest of the class can be exported as normal.  And that
since it can be exported, there's no reason why definitions of member
functions can't be imported.


This description also makes sense, but is different than what was  
described before.  To me, this description/implementation is  
extremely problematic, because the extension cannot be described  
without describing the implementation (specifically presence of  
vtables etc), which is unlike any standard C++ feature.


Some more specific questions:

1. If a class is hidden, does that default all the members (not just  
the metadata) to be notshared?
2. If a class with vtable is hidden, what visibility constraints  
exist on virtual methods?
3. What does 'notshared' on a class without a vtable mean, what  
effect does it have?
4. If classes with vtables have different behavior than those  
without, is this something we want?

5. How does this impact the ODR?

-Chris

Re: m68k bootstrap problem

2007-06-20 Thread Bernd Schmidt


Kenneth Zadeck wrote:

Paolo Bonzini wrote:

having the dead note there is asserting to the register allocator that
they are free to use that reg after the call in any way that it wants
and there is a (small) possibility that is wrong.

IMO there is nothing wrong with this.

I agree with Roman.  You can always put your call into an asm and make
it clobber all other caller-save registers (if EH is not a problem
etc. etc.).

Paolo

This is one of the places where i slavishly copied what flow did.


I agree with Roman and Paolo that this doesn't sound right.  Where in 
flow did it do this?



Bernd
--
This footer brought to you by insane German lawmakers.
Analog Devices GmbH  Wilhelm-Wagenfeld-Str. 6  80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

Re: m68k bootstrap problem

Bernd Schmidt wrote:
 Kenneth Zadeck wrote:
 Paolo Bonzini wrote:
 having the dead note there is asserting to the register allocator that
 they are free to use that reg after the call in any way that it wants
 and there is a (small) possibility that is wrong.
 IMO there is nothing wrong with this.
 I agree with Roman.  You can always put your call into an asm and make
 it clobber all other caller-save registers (if EH is not a problem
 etc. etc.).

 Paolo
 This is one of the places where i slavishly copied what flow did.

 I agree with Roman and Paolo that this doesn't sound right.  Where in
 flow did it do this?


 Bernd
i do not know the code well enough.  i did a/b testing until i got the
same answers.
If you think that this is wrong,  i can easily generate a patch to
change this and we can see if all life as we know it comes to a halt.

I have no real love of this, except the fear of a thousand more bugzillas.

kenny

[tuples] Accessors for RHS of assignments


So, I think I am still not convinced which way we want to access the RHS
of a GS_ASSIGN.

Since GS_ASSIGN can have various types of RHS, we originally had:

gs_assign_unary_rhs (gs)    - Access the only operand on RHS
gs_assign_binary_rhs1 (gs)  - Access the 1st RHS operand
gs_assign_binary_rhs2 (gs)  - Access the 2nd RHS operand

And the corresponding _set functions.

I then managed to half convince myself that it'd be better to have a
single gs_assign_rhs() accessor with a 'which' parameter.  After
implementing that, I think I hate it.  Particularly since this 'which'
parameter is just a number (0 or 1).  It could be a mnemonic, but it
would still be annoying.

So, I'm thinking of going back to the way it was before, but it is not
optimal.  Do people feel strongly over one or the other?

Re: RFC: Make dllimport/dllexport imply default visibility

2007-06-20 Thread Mark Mitchell

Chris Lattner wrote:

 This description also makes sense, but is different than what was
 described before.  To me, this description/implementation is extremely
 problematic, because the extension cannot be described without
 describing the implementation (specifically presence of vtables etc),
 which is unlike any standard C++ feature.

That's because it's an ELF-level attribute.  It's like dirty tricks you
can play with the alias attribute (means two functions can have the
same address) or making things weak (means addresses of objects with
static storage duration can be NULL).  These things are all unappealing
from a language theory point of view; they explicitly go behind the back
of the language to do odd things.  This is the ugly reality of C/C++,
and also part of why the languages are so useful.

 Some more specific questions:

I'll answer with what I think should happen and with what I know about
what G++ does.  I'm not sure what RealView does in all of these cases.

 1. If a class is hidden, does that default all the members (not just the
 metadata) to be notshared?

Yes.  (This is what G++ does.)

 2. If a class with vtable is hidden, what visibility constraints exist
 on virtual methods?

None.  In particular, the virtual methods may be imported from another
shared object.  (However, if a virtual function is hidden, then the
vtable must also be defined in the same shared object, as otherwise you
will get a link error.)

 3. What does 'notshared' on a class without a vtable mean, what effect
 does it have?

It means that all the members have hidden visibility, unless otherwise
specified.  (This is what G++ does.)

 4. If classes with vtables have different behavior than those without,
 is this something we want?

There's no difference.

The notshared attribute means that all of the members,
compiler-generated or otherwise, have hidden visibility, unless
otherwise explicitly specified.  It doesn't say anything about the type
per se, just about the various members.

 5. How does this impact the ODR?

It's beyond the scope of the ODR.  Even for two hidden functions f,
each defined in different shared objects, the name of both is ::f --
but is that an ODR violation?  We're way outside the standard at this point.

In practice, people use all of these facilities to do things that
explicitly violate the ODR.  For example, in an implementation DLL:

  struct S {
    __attribute__((visibility("hidden"))) void f();
    __declspec(dllexport) void g();
  };

In other DLLs, the header for S looks like:

  struct S {
    __declspec(dllimport) void g();
  };

and f is not even declared.  Certainly, the intent is that these are
the same types, but as a literal reading of the standard that's an ODR
violation.  (Even if you left f in, it would be an ODR violation in a
literal sense due to the change from dllexport to dllimport, unless you
modify the ODR rules to be more structural than literal.  They actually
require token-for-token identity.)

That's why the right way to understand these attributes is purely in
terms of what they do to object files.  Yes, that's crossing an
abstraction barrier, and yes, it means that as a programmer, you need to
understand that there's some vtables and stuff out there, but that's
the way people use these bits in practice.

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713

Re: [tuples] Accessors for RHS of assignments

2007-06-20 Thread Andrew MacLeod

On Wed, 2007-06-20 at 14:19 -0400, Diego Novillo wrote:

 
 gs_assign_unary_rhs (gs)    - Access the only operand on RHS
 gs_assign_binary_rhs1 (gs)  - Access the 1st RHS operand
 gs_assign_binary_rhs2 (gs)  - Access the 2nd RHS operand
 
 And the corresponding _set functions.
 
 I then managed to half convince myself that it'd be better to have a
 single gs_assign_rhs() accessor with a 'which' parameter.  After
 implementing that, I think I hate it.  Particularly since this 'which'
 parameter is just a number (0 or 1).  It could be a mnemonic, but it
 would still be annoying.
 
 So, I'm thinking of going back to the way it was before, but it is not
 optimal.  Do people feel strongly over one or the other?

I prefer something like the original mechanism, no flags or parameters.
I find that more natural in an interface. flags and parameters are old
school :-)

Andrew

Re: m68k bootstrap problem

Hi,

On Wed, 20 Jun 2007, Paolo Bonzini wrote:

  This is one of the places where i slavishly copied what flow did.  if
  you want to change this, go test it on at least 7 platforms and fix all
  of the problems that it causes.
 
 I see. Can one of you recap how it relates to the m68k problem, though? :-)

I have basically this sequence of instructions:

(call_insn 1 (set (%d0) (call ...)))

(insn 2 (set (x) (%d0)))

(call_insn 3 (set (%a0) (call ...)))

... 

(call_insn/j 5 (set (%a0) (call ...)))

The immediate problem is that there is no REG_DEAD note generated for insn 
2. reload calls the code in caller-save.c, because some other call 
clobbered register is associated with a pseudo and needs to be preserved 
across a call. It builds live information based on the REG_DEAD notes and 
is confused that %d0 doesn't die in insn 3.

There is one problem that the live out information in this block is 
incorrect, as it includes the return register, but the function is exited 
via sibcall and there is no code to set the return register. This is made 
a little more complex, as on m68k the return value here is returned in 
%d0/%a0, so %d0 is live at the end of the block.
One possible fix is to clear the uses at the edge to the exit (what the 
small patch previously posted does).

The second problem (and the one in discussion right now) is that I think 
%d0 should have been clobbered already by any of the later calls, which is 
at least what reload and many other passes assume, they had to manually 
add this information, if they wanted to use the dataflow information.
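
The role of call clobbers in the second problem can be seen in a toy backward
liveness scan.  This is a sketch under simplifying assumptions, not GCC's
dataflow code: registers are bits in a mask, the four-instruction sequence
only mirrors the shape of the one above, and all names (toy_insn,
toy_compute_deaths, toy_d0_dies_at_insn2) are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy register masks; the numbering is arbitrary.  */
enum { TOY_D0 = 1u << 0, TOY_A0 = 1u << 1, TOY_X = 1u << 2 };

struct toy_insn
{
  unsigned uses;     /* registers read by the instruction */
  unsigned defs;     /* registers written by the instruction */
  bool is_call;      /* calls additionally kill the call-clobbered set */
};

/* Scan INSNS backwards starting from LIVE_OUT.  DIES[i] receives the set of
   registers that are used by insn i and not live after it, i.e. where a
   "dead" note would go in this toy model.  */
static void
toy_compute_deaths (const struct toy_insn *insns, size_t n,
                    unsigned live_out, unsigned call_clobbers,
                    unsigned *dies)
{
  assert (insns != NULL && dies != NULL);
  unsigned live = live_out;
  for (size_t i = n; i-- > 0; )
    {
      unsigned killed = insns[i].defs;
      if (insns[i].is_call)
        killed |= call_clobbers;
      dies[i] = insns[i].uses & ~live;          /* last use is here */
      live = (live & ~killed) | insns[i].uses;  /* live before insn i */
    }
}

/* Does %d0 die at the copy (the second instruction) in the toy sequence?  */
static bool
toy_d0_dies_at_insn2 (unsigned live_out, unsigned call_clobbers)
{
  static const struct toy_insn seq[] = {
    { 0,      TOY_D0, true  },   /* call, result in %d0 */
    { TOY_D0, TOY_X,  false },   /* x = %d0 */
    { 0,      TOY_A0, true  },   /* call, result in %a0 */
    { 0,      TOY_A0, true  },   /* sibling call, result in %a0 */
  };
  unsigned dies[4];
  toy_compute_deaths (seq, 4, live_out, call_clobbers, dies);
  return (dies[1] & TOY_D0) != 0;
}
```

With live-out {%d0, %a0} and no clobber modelling, %d0 stays live across the
later calls and no death is recorded at the copy.  Treating %d0 as
call-clobbered, or dropping it from the live-out set of the sibcall block,
makes it die at the copy in this model.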

bye, Roman

Re: m68k bootstrap problem

Roman Zippel wrote:
 Hi,

 On Wed, 20 Jun 2007, Paolo Bonzini wrote:

   
 This is one of the places where i slavishly copied what flow did.  if
 you want to change this, go test it on at least 7 platforms and fix all
 of the problems that it causes.
   
 I see. Can one of you recap how it relates to the m68k problem, though? :-)
 

 I have basically this sequence of instructions:

 (call_insn 1 (set (%d0) (call ...)))

 (insn 2 (set (x) (%d0)))

 (call_insn 3 (set (%a0) (call ...)))

 ... 

 (call_insn/j 5 (set (%a0) (call ...)))

 The immediate problem is that there is no REG_DEAD note generated for insn 
 2. reload calls the code in caller-save.c, because some other call 
 clobbered register is associated with a pseudo and needs to be preserved 
 across a call. It builds live information based on the REG_DEAD notes and 
 is confused that %d0 doesn't die in insn 3.

 There is one problem that the live out information in this block is 
 incorrect, as it includes the return register, but the function is exited 
 via sibcall and there is no code to set the return register. This is made 
 a little more complex, as on m68k the return value here is returned in 
 %d0/%a0, so %d0 is live at the end of the block.
 One possible fix is to clear the uses at the edge to the exit (what the 
 small patch previously posted does).

 The second problem (and the one in discussion right now) is that I think 
 %d0 should have been clobbered already by any of the later calls, which is 
 at least what reload and many other passes assume, they had to manually 
 add this information, if they wanted to use the dataflow information.

 bye, Roman
   
I am going to plead insanity here and throw myself on the mercy of the
court.
there is a bug in generating reg dead notes. 

i will get a fix out soon.

kenny

Re: GCC 4.3.0 Status Report (2007-06-07)

2007-06-20 Thread Michael Meissner

On Wed, Jun 20, 2007 at 10:47:00AM -0700, Mark Mitchell wrote:
 Michael Meissner wrote:
 
  I think a gradual approach is the right way.  I think this can be done in
  the stage 2 time frame, but it could be pushed to gcc 4.4 (but we will have
  the same problem as we have now).  The way I see it, the steps would be:
  
  1) Add the basic infrastructure, iterator macros, stdarg_p, prototype_p,
 etc. to the tree.
  
  2) Change the back ends, 1 by 1 to use the new infrastructure support.
  
  3) Change the front ends, 1 by 1 to use the new infrastructure support.
  
  4) Remove/rename TYPE_ARG_TYPES, and fix any random breakage.
  
  5) Switch the infrastructure underneath to use vectors.
  
  Until #4, you are only changing one thing at a time, and can easily verify
  that the change works.
 
 Yes, I think that this is a good plan.  We can evaluate whether #4 has
 to happen on a branch and wait for 4.4 when we get to it.  But, 1, 2, 3
 I think are non-controversial, and can go into mainline in real time,
 which is nice.
 
 Thanks,

One minor note: one usage that I didn't notice on the first go-around is that
rs6000 and spu both have handlers, called from the function
resolve_overloaded_builtin, that take two argument lists and need them to
validate which overloaded function to use.  It isn't a major issue, but it is
something that presumably the front ends also do.

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
[EMAIL PROTECTED]

gcc-4.2-20070620 is now available

2007-06-20 Thread gccadmin

Snapshot gcc-4.2-20070620 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20070620/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch 
revision 125898

You'll find:

gcc-4.2-20070620.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.2-20070620.tar.bz2 C front end and core compiler

gcc-ada-4.2-20070620.tar.bz2  Ada front end and runtime

gcc-fortran-4.2-20070620.tar.bz2  Fortran front end and runtime

gcc-g++-4.2-20070620.tar.bz2  C++ front end and runtime

gcc-java-4.2-20070620.tar.bz2 Java front end and runtime

gcc-objc-4.2-20070620.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.2-20070620.tar.bz2The GCC testsuite

Diffs from 4.2-20070613 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

Re: [tuples] Accessors for RHS of assignments

2007-06-20 Thread Zdenek Dvorak

Hello,

 So, I think I am still not convinced which way we want to access the RHS
 of a GS_ASSIGN.
 
 Since GS_ASSIGN can have various types of RHS, we originally had:
 
 gs_assign_unary_rhs (gs)    - Access the only operand on RHS
 gs_assign_binary_rhs1 (gs)  - Access the 1st RHS operand
 gs_assign_binary_rhs2 (gs)  - Access the 2nd RHS operand
 
 And the corresponding _set functions.
 
 I then managed to half convince myself that it'd be better to have a
 single gs_assign_rhs() accessor with a 'which' parameter.  After
 implementing that, I think I hate it.  Particularly since this 'which'
 parameter is just a number (0 or 1).  It could be a mnemonic, but it
 would still be annoying.
 
 So, I'm thinking of going back to the way it was before, but it is not
 optimal.  Do people feel strongly over one or the other?

I may be missing something, but surely having the accessors uniform
would be better?  So that I can write things like

/* Process all operands.  */
for (i = 0; i < n_operands (gs); i++)
  process (gs_assign_rhs (gs, i));

rather than

if (is_unary (gs))
  process (gs_assign_unary_rhs (gs));
else if (is_binary (gs))
  {
    process (gs_assign_binary_rhs1 (gs));
    process (gs_assign_binary_rhs2 (gs));
  }
else if (is_ternary (gs))
  ...

Anyway, you can always

#define gs_assign_unary_rhs(X) gs_assign_rhs(X, 0)
#define gs_assign_binary_rhs1(X) gs_assign_rhs(X, 0)
#define gs_assign_binary_rhs2(X) gs_assign_rhs(X, 1)

as well, and use these in cases where you know with which arity you are
working.
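
A minimal sketch of that combination in C is shown below.  The struct layout
and the toy_ names are assumptions made for this example (the real tuple
representation is not shown in this thread); the point is only that one
indexed accessor can back both the generic loop and the arity-specific names.

```c
#include <assert.h>

/* Hypothetical statement layout: only the RHS operands are modelled.  */
struct toy_assign
{
  int num_rhs;     /* 1 for a unary RHS, 2 for a binary RHS */
  int rhs[2];      /* operand values stand in for operand trees */
};

/* Uniform accessor: WHICH selects the RHS operand, starting at 0.  */
static int
toy_assign_rhs (const struct toy_assign *gs, int which)
{
  assert (which >= 0 && which < gs->num_rhs);
  return gs->rhs[which];
}

/* Arity-specific names layered on top of the uniform accessor.  */
#define toy_assign_unary_rhs(GS)   toy_assign_rhs ((GS), 0)
#define toy_assign_binary_rhs1(GS) toy_assign_rhs ((GS), 0)
#define toy_assign_binary_rhs2(GS) toy_assign_rhs ((GS), 1)

/* Generic code can loop over all operands without caring about arity.  */
static int
toy_sum_operands (const struct toy_assign *gs)
{
  int sum = 0;
  for (int i = 0; i < gs->num_rhs; i++)
    sum += toy_assign_rhs (gs, i);
  return sum;
}
```

Code that knows the arity uses the named macros; passes that treat all
operands alike use the loop.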

Zdenek

Re: libgcc fails to compile if DItype is not supported [bswapdi2]

2007-06-20 Thread Eric Christopher



On Jun 19, 2007, at 10:50 PM, Pompapathi V Gadad wrote:


Hello,
Current function declaration of __bswapdi2 in libgcc2.h is:
DItype __bswapdi2 (DItype u)

Since this declaration does not check if DItype is supported, it is  
bound for compilation failure for targets that do not support  
DItype. Would it be ok to change the DItype to DWtype as in:


DWtype __bswapdi2 (DWtype u)

Is the above declaration safer for all targets?


No, probably best would be a check to make sure DItype is defined. In
this case it's not a double-word operation; it is explicitly a 64-bit
operation.
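
To show what "explicitly a 64-bit operation" means, here is a portable
byte-swap sketch in C.  It uses uint64_t from <stdint.h> rather than libgcc's
DItype, and toy_bswap64 is a name chosen for this example; it is not the
libgcc implementation of __bswapdi2.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the order of the eight bytes in U.  The width is fixed at 64 bits
   regardless of the target's word size.  */
static uint64_t
toy_bswap64 (uint64_t u)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      r = (r << 8) | (u & 0xff);   /* move the lowest byte of U into R */
      u >>= 8;
    }
  return r;
}

/* Small self-check: a known value, and swapping twice gives the input back.  */
static void
toy_bswap64_selftest (void)
{
  assert (toy_bswap64 (UINT64_C (0x0102030405060708))
          == UINT64_C (0x0807060504030201));
  assert (toy_bswap64 (toy_bswap64 (UINT64_C (0xdeadbeef)))
          == UINT64_C (0xdeadbeef));
}
```

A double-word type would change size with the target's word size, which is
why swapping DItype for DWtype would alter the meaning of the function on
targets where a double word is not 64 bits.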


-eric

Re: [tuples] Accessors for RHS of assignments

On 6/20/07 7:13 PM, Zdenek Dvorak wrote:

 I may be missing something, but surely having the accessors uniform
 would be better?  So that I can write things like
 
 /* Process all operands.  */
 for (i = 0; i < n_operands (gs); i++)
   process (gs_assign_rhs (gs, i));

Yeah, we already have that.  It's just that 'i' is never >= 3.  operand 0
is the LHS, operand 1 and operand 2 are the RHS.  My issue was with the
functions that are more commonly invoked to deal with each RHS operand.

Re: I'm sorry, but this is unacceptable (union members and ctors)

michael.a wrote:

I should probably just find that Debian patch and install into the system
directories, but I still don't understand if there are any factors outside
of gcc necessary for a successful build (could glibc be related to the
crt.o files -- and are the crt.o files tied to the system, or do they just
link to shared libraries)

I think I have maybe the proper debian patches for gcc from here:

http://archive.ubuntu.com/ubuntu/pool/main/g/gcc-defaults/gcc-defaults_1.32.tar.gz

But even if so, I have no idea what to do with it. Lots of files that look
like preinst postinst prerm postrm... of course the README file provided is
absolutely zero help, nor is it particularly possible to not get a swamp of
junk responses from google. Debian doesn't appear to me to actually provide
GCC sources from the repository, but does provide Debian patches.

I've never understood why linux has to be so damn opaque about
documentation... makes for a lot of unnecessary help requests.

Any advice please? I'd basically like to be able to rebuild a proper Debian
compliant GCC, preferably without knocking on more than a handful of mailing
lists in the process.

If I'm nagging at this point please let me know... I'd just like to be able to
add to the Linux community. If only it wasn't always so trying to figure out
what the hell is going on.

sincerely,

michael
--
View this message in context:
http://www.nabble.com/I%27m-sorry%2C-but-this-is-unacceptable-%28union-members-and-ctors%29-tf3930964.html#a11225709
Sent from the gcc - Dev mailing list archive at Nabble.com.

libjava is broken

2007-06-20 Thread H. J. Lu

This patch

http://gcc.gnu.org/ml/java-patches/2007-q2/msg00322.html

breaks libjava build:

/net/gnu-13/export/gnu/src/gcc/gcc/libjava/gnu/classpath/jdwp/natVMVirtualMachine.cc:700:
error: prototype for 'gnu::classpath::jdwp::util::MethodResult*
gnu::classpath::jdwp::VMVirtualMachine::executeMethod(java::lang::Object*,
java::lang::Thread*, java::lang::Class*,
gnu::classpath::jdwp::VMMethod*,
JArray<gnu::classpath::jdwp::value::Value*>*, jint)' does not match any
in class 'gnu::classpath::jdwp::VMVirtualMachine'
/net/gnu-13/export/gnu/src/gcc/gcc/libjava/gnu/classpath/jdwp/VMVirtualMachine.h:57:
error: candidate is: static gnu::classpath::jdwp::util::MethodResult*
gnu::classpath::jdwp::VMVirtualMachine::executeMethod(java::lang::Object*,
java::lang::Thread*, java::lang::Class*, java::lang::reflect::Method*,
JArray<java::lang::Object*>*, jboolean)
make[7]: *** [gnu/classpath/jdwp/natVMVirtualMachine.lo] Error 1

Keith, did you forget to update VMVirtualMachine.h?


H.J.

Re: I'm sorry, but this is unacceptable (union members and ctors)

2007-06-20 Thread CaT

On Wed, Jun 20, 2007 at 07:31:44PM -0700, michael.a wrote:
> michael.a wrote:
> > I should probably just find that Debian patch and install into the system
> > directories, but I still don't understand if there are any factors outside
> > of gcc necessary for a successful build (could glibc be related to the
> > crt.o files -- and are the crt.o files tied to the system, or do they just
> > link to shared libraries)
> >
>
> I think I have maybe the proper debian patches for gcc from here:
>
> http://archive.ubuntu.com/ubuntu/pool/main/g/gcc-defaults/gcc-defaults_1.32.tar.gz

That's Ubuntu, not debian. Similar, yet different.

> But even if so, I have no idea what to do with it. Lots of files that look
> like preinst postinst prerm postrm... of course the README file provided is
> absolutely zero help, nor is it particularly possible to not get a swamp of

*looks* Looks helpful to me:

The Debian GNU Compiler Collection Setup


Abstract


Debian uses a default version of GCC for most packages; however, some
packages require another version.  So, Debian allows several versions
of GCC to coexist on the same system, and selects the default version
by means of the gcc-defaults package, which creates symbolic links as
appropriate.

Versions of GCC present in Debian Etch
--------------------------------------
...

The default compiler versions for Debian GNU/Linux on i386 are
(minor version numbers omitted):

cpp : cpp-4.1
gcc : gcc-4.1
g++ : g++-4.1
...

> junk responses from google. Debian doesn't appear to me to actually provide
> GCC sources from the repository, but does provide Debian patches.

Sure it does. Helps to get the right package:

$ apt-get source gcc-4.1 
...
$ ls -lad gcc*
    4 drwxr-xr-x 3 root root     4096 2007-06-21 12:35 gcc-4.1-4.1.1ds2
 6956 -rw------- 1 root root  7109677 2006-12-11 06:02 gcc-4.1_4.1.1ds2-21.diff.gz
    4 -rw------- 1 root root     2407 2006-12-11 06:02 gcc-4.1_4.1.1ds2-21.dsc
36156 -rw------- 1 root root 36982690 2006-10-21 23:17 gcc-4.1_4.1.1ds2.orig.tar.gz

Which gives us:

gcc-4.1_4.1.1ds2.orig.tar.gz - the original source
gcc-4.1_4.1.1ds2-21.dsc      - a description of the package
gcc-4.1_4.1.1ds2-21.diff.gz  - Debian's modifications to the original

> I've never understood why linux has to be so damn opaque about
> documentation... makes for a lot of unnecessary help requests.

Helps to read the right documentation. Start with the stuff on its
package management tools and move on from there.

-- 
To the extent that we overreact, we proffer the terrorists the
greatest tribute.
- High Court Judge Michael Kirby

Re: libjava is broken

2007-06-20 Thread David Daney


H. J. Lu wrote:

This patch

http://gcc.gnu.org/ml/java-patches/2007-q2/msg00322.html

breaks libjava build:

/net/gnu-13/export/gnu/src/gcc/gcc/libjava/gnu/classpath/jdwp/natVMVirtualMachine.cc:700:
error: prototype for 'gnu::classpath::jdwp::util::MethodResult*
gnu::classpath::jdwp::VMVirtualMachine::executeMethod(java::lang::Object*,
java::lang::Thread*, java::lang::Class*,
gnu::classpath::jdwp::VMMethod*,
JArray<gnu::classpath::jdwp::value::Value*>*, jint)' does not match any
in class 'gnu::classpath::jdwp::VMVirtualMachine'
/net/gnu-13/export/gnu/src/gcc/gcc/libjava/gnu/classpath/jdwp/VMVirtualMachine.h:57:
error: candidate is: static gnu::classpath::jdwp::util::MethodResult*
gnu::classpath::jdwp::VMVirtualMachine::executeMethod(java::lang::Object*,
java::lang::Thread*, java::lang::Class*, java::lang::reflect::Method*,
JArray<java::lang::Object*>*, jboolean)
make[7]: *** [gnu/classpath/jdwp/natVMVirtualMachine.lo] Error 1

Keith, did you forget to update VMVirtualMachine.h?

  


It would appear so.

If you are in a hurry and have the requisite tools installed, configure
with --enable-java-maintainer-mode to regenerate the file yourself.


David Daney

H.J.

Re: libgcc fails to compile if DItype is not supported [bswapdi2]

2007-06-20 Thread Pompapathi V Gadad


Hello Eric,
The target I am working on is 16-bit target and cannot support 64-bit 
data types (DI mode).


How about conditionally declaring the function?
#if LONG_LONG_TYPE_SIZE > 32
extern DItype __bswapdi2 (DItype);
#endif

Thanks,
Pompa

Eric Christopher wrote:


On Jun 19, 2007, at 10:50 PM, Pompapathi V Gadad wrote:


Hello,
Current function declaration of __bswapdi2 in libgcc2.h is:
DItype __bswapdi2 (DItype u)

Since this declaration does not check if DItype is supported, it is 
bound for compilation failure for targets that do not support DItype. 
Would it be ok to change the DItype to DWtype as in:


DWtype __bswapdi2 (DWtype u)

Is the above declaration safer for all targets?


No, probably best would be a check to make sure DItype is defined. In 
this case it's not a double word operation it is explicitly a 64-bit 
operation.


-eric
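
For illustration only, here is a minimal sketch of the kind of guard being discussed: the 64-bit byte swap is only provided when the compiler actually has a type wider than 32 bits. This is not the libgcc2.h code. The names demo_di, demo_bswapdi2 and DEMO_HAVE_DI are made up for the example, and the check uses ULLONG_MAX from <limits.h> as a stand-in, on the assumption that it plays a role similar to LONG_LONG_TYPE_SIZE inside GCC itself.

```c
#include <limits.h>

/* Assumption for this sketch: ULLONG_MAX stands in for the target macro
   LONG_LONG_TYPE_SIZE that libgcc would consult.  */
#if defined(ULLONG_MAX) && (ULLONG_MAX > 0xFFFFFFFFUL)
#define DEMO_HAVE_DI 1

/* Illustrative name, not libgcc's DItype.  */
typedef unsigned long long demo_di;

/* Reverse the eight low-order bytes of u, one byte per iteration.  */
static demo_di demo_bswapdi2(demo_di u)
{
    demo_di r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (u & 0xFF);  /* append the lowest byte of u to r */
        u >>= 8;                    /* move on to the next byte */
    }
    return r;
}
#else
/* 16-bit style target without a 64-bit type: nothing is declared.  */
#define DEMO_HAVE_DI 0
#endif
```

On a target where the guard fails, callers would see DEMO_HAVE_DI as 0 and no demo_bswapdi2 at all. That matches the point made above: the operation is explicitly 64-bit, not "double word".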

Re: I'm sorry, but this is unacceptable (union members and ctors)




Cat-4 wrote:
 
> $ ls -lad gcc*
>     4 drwxr-xr-x 3 root root     4096 2007-06-21 12:35 gcc-4.1-4.1.1ds2
>  6956 -rw------- 1 root root  7109677 2006-12-11 06:02 gcc-4.1_4.1.1ds2-21.diff.gz
>     4 -rw------- 1 root root     2407 2006-12-11 06:02 gcc-4.1_4.1.1ds2-21.dsc
> 36156 -rw------- 1 root root 36982690 2006-10-21 23:17 gcc-4.1_4.1.1ds2.orig.tar.gz
>
> Which gives us:
>
> gcc-4.1_4.1.1ds2.orig.tar.gz - the original source
> gcc-4.1_4.1.1ds2-21.dsc      - a description of the package
> gcc-4.1_4.1.1ds2-21.diff.gz  - Debian's modifications to the original
 

I get something like this. I assumed that the patches were preapplied, or do
you have to apply the diff file to the extracted orig archive yourself?

I ask, because I'm getting this error again:

/usr/bin/ld: skipping incompatible /usr/lib/../lib/libc.so when searching
for -lc
/usr/bin/ld: skipping incompatible /usr/lib/../lib/libc.a when searching for
-lc
/usr/bin/ld: skipping incompatible /usr/bin/../lib/libc.so when searching
for -lc
/usr/bin/ld: skipping incompatible /usr/bin/../lib/libc.a when searching for
-lc
/usr/bin/ld: skipping incompatible /usr/lib/libc.so when searching for -lc
/usr/bin/ld: skipping incompatible /usr/lib/libc.a when searching for -lc
/usr/bin/ld: cannot find -lc
collect2: ld returned 1 exit status

Which I believe disappeared before when using the --disable-multilib option,
but the main reason I'm building this way is so I don't have to use that
flag (my impression is using that flag disables the ability to compile
different architectures/platforms on the local machine -- I assumed one of
the patches would rectify this matter, or is the previously mentioned patch
not used by [some] debian distros?) 

If nothing else, if someone can answer, I would like a confirmation that the
provided packages should be preapplied to the source package.

I guess in the meantime I'll go ahead and install it and see if I can use it
or not. I'd appreciate it if someone can point me to the Debian amd64
multilib patch that was said to exist if it shouldn't be included in the
repository patches.

Re: I'm sorry, but this is unacceptable (union members and ctors)




michael.a wrote:
 
 I guess in the meantime I'll go ahead and install it and see if I can use
 it or not.
 

Success! 

It will likely be a good while before I can report whether simply knocking out
the errors causes any run-time issues.

In the meantime, if anyone can clue me in on squaring multilib support with
amd64 debian, I'd appreciate it very much :)


[Bug middle-end/32285] [4.1 Regression] Miscompilation with pure _Complex returning call inside another fn's argument list

--- Comment #10 from jakub at gcc dot gnu dot org  2007-06-20 06:36 ---
Subject: Bug 32285

Author: jakub
Date: Wed Jun 20 06:35:55 2007
New Revision: 125873

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125873
Log:
PR middle-end/32285
* calls.c (precompute_arguments): Also precompute CALL_EXPR arguments
if ACCUMULATE_OUTGOING_ARGS.

* gcc.c-torture/execute/20070614-1.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/20070614-1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/calls.c
trunk/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32285

[Bug inline-asm/32109] [4.1/4.2/4.3 regression] ICE with inline-asm and class with destructor

--- Comment #2 from jakub at gcc dot gnu dot org  2007-06-20 06:37 ---
Subject: Bug 32109

Author: jakub
Date: Wed Jun 20 06:37:17 2007
New Revision: 125874

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125874
Log:
PR inline-asm/32109
* gimplify.c (gimplify_asm_expr): Issue error if type is addressable
and !allows_mem.

* g++.dg/ext/asm10.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/ext/asm10.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/gimplify.c
trunk/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32109

[Bug middle-end/31959] [4.3 Regression] ICE in expand_builtin_expect, at builtins.c:5112

--- Comment #4 from jakub at gcc dot gnu dot org  2007-06-20 06:40 ---
Subject: Bug 31959

Author: jakub
Date: Wed Jun 20 06:39:53 2007
New Revision: 125875

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125875
Log:
PR middle-end/31959
* builtins.c: Include diagnostic.h.
(expand_builtin_expect): Make gcc_assert more permissive.
* Makefile.in (builtins.o): Depend on $(DIAGNOSTIC_H).

* gcc.dg/pr31959.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr31959.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/Makefile.in
trunk/gcc/builtins.c
trunk/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31959

[Bug middle-end/32285] [4.1 Regression] Miscompilation with pure _Complex returning call inside another fn's argument list

--- Comment #11 from jakub at gcc dot gnu dot org  2007-06-20 06:44 ---
Subject: Bug 32285

Author: jakub
Date: Wed Jun 20 06:44:26 2007
New Revision: 125877

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125877
Log:
PR middle-end/32285
* calls.c (precompute_arguments): Also precompute CALL_EXPR arguments
if ACCUMULATE_OUTGOING_ARGS.

* gcc.c-torture/execute/20070614-1.c: New test.

Added:
branches/gcc-4_2-branch/gcc/testsuite/gcc.c-torture/execute/20070614-1.c
Modified:
branches/gcc-4_2-branch/gcc/ChangeLog
branches/gcc-4_2-branch/gcc/calls.c
branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32285

[Bug inline-asm/32109] [4.1/4.2/4.3 regression] ICE with inline-asm and class with destructor

--- Comment #3 from jakub at gcc dot gnu dot org  2007-06-20 06:46 ---
Subject: Bug 32109

Author: jakub
Date: Wed Jun 20 06:46:31 2007
New Revision: 125878

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125878
Log:
PR inline-asm/32109
* gimplify.c (gimplify_asm_expr): Issue error if type is addressable
and !allows_mem.

* g++.dg/ext/asm10.C: New test.

Added:
branches/gcc-4_2-branch/gcc/testsuite/g++.dg/ext/asm10.C
Modified:
branches/gcc-4_2-branch/gcc/ChangeLog
branches/gcc-4_2-branch/gcc/gimplify.c
branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32109

[Bug middle-end/32285] [4.1 Regression] Miscompilation with pure _Complex returning call inside another fn's argument list

--- Comment #12 from jakub at gcc dot gnu dot org  2007-06-20 06:50 ---
Subject: Bug 32285

Author: jakub
Date: Wed Jun 20 06:50:23 2007
New Revision: 125879

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125879
Log:
PR middle-end/32285
* calls.c (precompute_arguments): Also precompute CALL_EXPR arguments
if ACCUMULATE_OUTGOING_ARGS.

* gcc.c-torture/execute/20070614-1.c: New test.

Added:
branches/gcc-4_1-branch/gcc/testsuite/gcc.c-torture/execute/20070614-1.c
Modified:
branches/gcc-4_1-branch/gcc/ChangeLog
branches/gcc-4_1-branch/gcc/calls.c
branches/gcc-4_1-branch/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32285

[Bug inline-asm/32109] [4.1/4.2/4.3 regression] ICE with inline-asm and class with destructor

--- Comment #4 from jakub at gcc dot gnu dot org  2007-06-20 06:51 ---
Subject: Bug 32109

Author: jakub
Date: Wed Jun 20 06:51:47 2007
New Revision: 125880

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125880
Log:
PR inline-asm/32109
* gimplify.c (gimplify_asm_expr): Issue error if type is addressable
and !allows_mem.

* g++.dg/ext/asm10.C: New test.

Added:
branches/gcc-4_1-branch/gcc/testsuite/g++.dg/ext/asm10.C
Modified:
branches/gcc-4_1-branch/gcc/ChangeLog
branches/gcc-4_1-branch/gcc/gimplify.c
branches/gcc-4_1-branch/gcc/testsuite/ChangeLog

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32109

[Bug rtl-optimization/32405] assertion failure in loop-iv.c; probable dataflow regression

2007-06-20 Thread rakdver at gcc dot gnu dot org

--- Comment #2 from rakdver at gcc dot gnu dot org  2007-06-20 06:56 ---
Subject: Bug 32405

Author: rakdver
Date: Wed Jun 20 06:56:26 2007
New Revision: 125881

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125881
Log:
PR rtl-optimization/32405
* loop-iv.c (iv_get_reaching_def): Fail for partial defs.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/loop-iv.c

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32405

[Bug rtl-optimization/32405] assertion failure in loop-iv.c; probable dataflow regression

2007-06-20 Thread rakdver at gcc dot gnu dot org



--- Comment #3 from rakdver at gcc dot gnu dot org  2007-06-20 06:57 ---
Should be fixed now.


-- 

rakdver at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32405

[Bug target/32406] [4.3 Regression] MIPS: FAIL in nestfunc-6.c at -O3

2007-06-20 Thread daney at gcc dot gnu dot org



--- Comment #2 from daney at gcc dot gnu dot org  2007-06-20 07:11 ---
Created an attachment (id=13739)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13739&action=view)
First patch attempt.

I think this patch fixes this bug.  The test case looks better from my
cross-compiler.  I will bootstrap it to be sure.

I don't really like the patch that much though.  It forces $gp to be loaded in
a nonlocal_goto_receiver, which fixes the bug in cases where $gp is needed.  If
$gp is not needed, it would be nice not to force it to be restored.

In vain I tried to mark $gp as clobbered in hope that it would be magically
restored if needed.  I guess I need a bit more RTL foo.  If there are two
function calls in the nonlocal goto target (two uses of $gp with a clobber
between), the second call has $gp restored.  I think there should be a way to
make the first use of $gp to cause $gp to be restored, but I don't know what it
is.

Thanks to Hans-Peter Nilsson for the pointer.


-- 

daney at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |daney at gcc dot gnu dot org
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32406

[Bug bootstrap/32272] make exit because build/genmodes.exe doesn't exist

2007-06-20 Thread boris at phidani dot be



--- Comment #1 from boris at phidani dot be  2007-06-20 07:18 ---
got same problem with gcc 4.2.0 on suse linux 9.0:

I did:
 /opt/gcc-4.2.0/configure --program-suffix=-4.2.0

then:
 make CFLAGS='-O' LIBCFLAGS='-g -O2' LIBCXXFLAGS='-g -O2 -fno-implicit-templates' bootstrap

Here are the last lines of the output:
snip
ar cru libdecnumber.a decNumber.o decContext.o decUtility.o decimal32.o
decimal64.o decimal128.o
ranlib libdecnumber.a
make[3]: Leaving directory `/opt/gcc-4.2.0-obj/libdecnumber'
make[3]: Entering directory `/opt/gcc-4.2.0-obj/gcc'
TARGET_CPU_DEFAULT= \
HEADERS=auto-host.h ansidecl.h DEFINES= \
/bin/sh /opt/gcc-4.2.0/gcc/mkconfig.sh config.h
TARGET_CPU_DEFAULT= \
HEADERS=options.h config/i386/i386.h config/i386/unix.h config/i386/att.h
config/dbxelf.h config/elfos.h config/svr4.h config/linux.h config/i386/linux.h
defaults.h DEFINES=UCLIBC_DEFAULT=0 \
/bin/sh /opt/gcc-4.2.0/gcc/mkconfig.sh tm.h
gawk -f /opt/gcc-4.2.0/gcc/opt-gather.awk /opt/gcc-4.2.0/gcc/treelang/lang.opt
/opt/gcc-4.2.0/gcc/c.opt /opt/gcc-4.2.0/gcc/common.opt
/opt/gcc-4.2.0/gcc/config/i386/i386.opt /opt/gcc-4.2.0/gcc/config/linux.opt >
tmp-optionlist
/bin/sh /opt/gcc-4.2.0/gcc/../move-if-change tmp-optionlist optionlist
echo timestamp > s-options
gawk -f /opt/gcc-4.2.0/gcc/opt-functions.awk -f /opt/gcc-4.2.0/gcc/opth-gen.awk
\
< optionlist > tmp-options.h
/bin/sh /opt/gcc-4.2.0/gcc/../move-if-change tmp-options.h options.h
echo timestamp > s-options-h
TARGET_CPU_DEFAULT= \
HEADERS=auto-host.h ansidecl.h DEFINES= \
/bin/sh /opt/gcc-4.2.0/gcc/mkconfig.sh bconfig.h
gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition
-Wmissing-format-attribute -fno-common   -DHAVE_CONFIG_H -DGENERATOR_FILE -I.
-Ibuild -I/opt/gcc-4.2.0/gcc -I/opt/gcc-4.2.0/gcc/build
-I/opt/gcc-4.2.0/gcc/../include -I/opt/gcc-4.2.0/gcc/../libcpp/include 
-I/opt/gcc-4.2.0/gcc/../libdecnumber -I../libdecnumber -o build/errors.o
/opt/gcc-4.2.0/gcc/errors.c
build/genmodes -h > tmp-modes.h
/bin/sh: line 1: build/genmodes: No such file or directory
make[3]: *** [s-modes-h] Error 127
make[3]: Leaving directory `/opt/gcc-4.2.0-obj/gcc'
make[2]: *** [all-stage1-gcc] Error 2
make[2]: Leaving directory `/opt/gcc-4.2.0-obj'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/opt/gcc-4.2.0-obj'
make: *** [bootstrap] Error 2


-- 

boris at phidani dot be changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |boris at phidani dot be


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32272

[Bug bootstrap/32024] ICE - libgcc2.c:557: internal compiler error: in fold_checksum_tree, at fold-const.c:12652



--- Comment #10 from ubizjak at gmail dot com  2007-06-20 08:23 ---
Confirmed; configure gcc with --enable-checking=fold

--cut here--
typedef union
{
  struct {int low, high;} s;
  long long ll;
} DWunion;

long long
__muldi3 (long long u, long long v)
{
  const DWunion uu = {.ll = u};
  const DWunion vv = {.ll = v};
  DWunion w = {.ll = 0 };

  w.s.high += ((unsigned int) uu.s.low * (unsigned int) vv.s.high
+ (unsigned int) uu.s.high * (unsigned int) vv.s.low);

  return w.ll;
}
--cut here--

gcc -O2:

mul.c: In function '__muldi3':
mul.c:9: internal compiler error: in fold_checksum_tree, at fold-const.c:12775
Please submit a full bug report,


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
          Component|middle-end                  |bootstrap
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-06-20 08:23:06
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32024

[Bug c++/32412] New: Passing struct as parameter breaks SRA for stack-allocated struct inside called function

2007-06-20 Thread scovich at gmail dot com

sra-bug.C (below) contains a function which stack-allocates a local struct
containing two small arrays. The function depends on SRA to eliminate repeated
memory accesses to the two arrays as it streams over a large, third array.

The performance of the executables resulting from
g++ -Wall -O3 -msse3 -fpeel-loops sra-bug.C
and
g++ -Wall -O3 -msse3 -fpeel-loops sra-bug.C -DTRIGGER_BUG
differs by exactly 2x on my machine (a 2.66GHz Core2 quad Xeon), with the
runtime increasing from .395 ns/value/entry to .790 ns/value/entry. 

The only difference between the two versions is whether the array pointer and
count are passed as separate arguments (fast) or wrapped in a struct (slow),
even though the latter gets copied into local variables before use. Use of the
__restrict keyword didn't seem to make a difference. The assembler output shows
that excessive loads and stores nearly double the instruction count of the
unrolled inner loop for the slower case.

FYI gcc-4.2.0 shows similar behavior, though its output is slower than 4.1 for
both cases (.420ns vs 1.10ns). gcc-4.3-20070617 performs equally badly on both
versions of the code (.690 ns/value/entry). 

sra-bug.C:
===
#include <emmintrin.h>
#include <stdint.h>
#include <cassert>
#include <cstdio>
#include <sys/time.h>

struct stopwatch_t {
    struct timeval tv; long long mark;
    stopwatch_t() { reset(); }
    double time_ns() {
        long long old_mark = mark; reset(); return 1e3*(mark - old_mark);
    }
    void reset() {
        gettimeofday(&tv, NULL); mark = tv.tv_usec + tv.tv_sec*1000000ll;
    }
};

template<int N, class T, class Action>
inline void unrolled_loop(T* entries, Action& action) {
  for(int i=0; i < N; i++) action(entries[i]);
}

static __m128i const ALL_ZEROS = {0ull, 0ull};
static __m128i const ALL_ONES = {~0ull, ~0ull};
static int const COUNT=4;

struct Action16 {
  __m128i _results[COUNT];
  __m128i _values[COUNT];
  __m128i* _dest;
  Action16(__m128i* dest, uint64_t const* values) : _dest(dest) {
    for(int i=0; i < COUNT; i++) {
      _results[i] = ALL_ZEROS;
      _values[i] = _mm_set1_epi16((short) values[i]);
    }
  }
  void operator()(__m128i const entry) {
    for(int i=0; i < COUNT; i++)
      _results[i] |= _mm_cmpeq_epi16(_values[i], entry);
  }
  ~Action16() {
    for(int i=0; i < COUNT; i++)
      _dest[i] = _mm_movemask_epi8(_results[i])? ALL_ONES : ALL_ZEROS;
  }
};

struct wrapper {
  __m128i const* entries;
  int count;
};

#ifdef TRIGGER_BUG
void foo(__m128i* dest, uint64_t const* values, wrapper const w) {
  __m128i const* entries = w.entries;  int count = w.count;
#else
void foo(__m128i* dest, uint64_t const* values, __m128i const* entries, int count) {
#endif
  static int const unroll_count=16;
  Action16 action(dest, values);
  assert((count % unroll_count) == 0);
  for(int i=0; i+unroll_count < count; i+=unroll_count)
    unrolled_loop<unroll_count>(&entries[i], action);
}

int main() {
  int VALUE_COUNT = 100;
  int LIST_SIZE = 2048;
  uint64_t* values = new uint64_t[VALUE_COUNT];
  __m128i* dest = (__m128i*) _mm_malloc(16*VALUE_COUNT, 16);
  __m128i entries[LIST_SIZE];
  wrapper w = {entries, LIST_SIZE};
  stopwatch_t timer;
  for(int j=0; j < 5; j++) {
    for(int i=0; i < VALUE_COUNT; i+= COUNT) {
#ifdef TRIGGER_BUG
      foo(dest+i, values+i, w);
#else
      foo(dest+i, values+i, entries, LIST_SIZE);
#endif
    }
    printf("%.3lf ns/value/entry\n", timer.time_ns()/LIST_SIZE/VALUE_COUNT);
  }
}
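
Stripped of the SSE intrinsics and the templates, the two calling shapes compared in this report look roughly like the plain C sketch below. It only illustrates the shapes and is not expected to reproduce the slowdown. The names demo_span, demo_sum_separate and demo_sum_wrapped are invented for the example, and the small local array acc stands in for the _results/_values arrays that SRA is supposed to scalarize.

```c
#include <stddef.h>

/* Hypothetical stand-in for the report's "wrapper" struct.  */
struct demo_span {
    const int *entries;
    size_t count;
};

/* "Fast" shape: pointer and count passed as separate arguments.  */
static long demo_sum_separate(const int *entries, size_t count)
{
    long acc[2] = {0, 0};   /* small local aggregate, an SRA candidate */
    size_t i;

    for (i = 0; i < count; i++)
        acc[i & 1] += entries[i];
    return acc[0] + acc[1];
}

/* "Slow" shape: the same two values wrapped in a by-value struct and
   copied into locals before use, as in the TRIGGER_BUG variant.  */
static long demo_sum_wrapped(const struct demo_span w)
{
    const int *entries = w.entries;
    size_t count = w.count;
    long acc[2] = {0, 0};
    size_t i;

    for (i = 0; i < count; i++)
        acc[i & 1] += entries[i];
    return acc[0] + acc[1];
}
```

Both functions compute the same result. The complaint in this report is that, in the real test case, the second shape keeps the local arrays in memory instead of registers.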


-- 
   Summary: Passing struct as parameter breaks SRA for stack-
allocated struct inside called function
   Product: gcc
   Version: 4.1.2
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: scovich at gmail dot com
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32412

[Bug c++/32413] New: [4.3 Regression] internal compiler error: in reload_cse_simplify_operands, at postreload.c:396

2007-06-20 Thread jojelino at gmail dot com

svn revision:125876
cc -c -mno-cygwin -mdll -fno-rtti -mthreads -pipe -D_WINGDI_ -DUCLIBCPP
-D_GLIBCPP_HAVE_MBSTATE_T -D_WIN32_IE=0x0500 -msse4a -DARCH_IS_IA32
-DARCH_IS_32BIT -DHAVE_MMX -w -DNDEBUG -UDEBUG -DFFDEBUG=0 -I. -I.. -Iuclibc++
-Ibaseclasses -I../baseclasses -IimgFilters -I../imgFilters -Implayer
-I../mplayer -Isettings -I../settings -Isettings/filters -I../settings/filters
-Icodecs -I../codecs -Isubtitles -I../subtitles -Iconvert -I../convert
-Idialog -I../dialog -IaudioFilters -I../audioFilters -Icygwin -I../cygwin
-Iffmpeg -I../ffmpeg -Iacm -I../acm -Ixiph -I../xiph -Ifilters -I../filters
-Imuxers -I../muxers -O4 -march=core2 -mtune=core2 -fomit-frame-pointer
-finline-functions -finline -frename-registers -fweb -funit-at-a-time
-o ffdshow_all.o ffdshow_all.cpp

In file included from ffdshow_all.cpp:37:
DeCSSInputPin.cpp: In member function 'virtual long int
CDeCSSInputPin::Set(const GUID, ULONG, void*, ULONG, void*, ULONG)':
DeCSSInputPin.cpp:254: error: insn does not satisfy its constraints:
(insn 1194 592 593 19 CSSscramble.cpp:174 (set (reg:QI 22 xmm1 [orig:545 t5.70361 ] [545])
        (reg:QI 0 ax)) 43 {*movqi_1} (nil))
DeCSSInputPin.cpp:254: internal compiler error: in
reload_cse_simplify_operands, at postreload.c:396
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make: *** [ffdshow_all.o] Error 1
cc: warning: -pipe ignored because -save-temps specified
In file included from ffdshow_all.cpp:38:
DeCSSInputPin.cpp: In member function 'virtual long int
CDeCSSInputPin::Set(const GUID, ULONG, void*, ULONG, void*, ULONG)':
DeCSSInputPin.cpp:254: error: insn does not satisfy its constraints:
(insn 1194 592 593 19 CSSscramble.cpp:174 (set (reg:QI 22 xmm1 [orig:545 t5.70361 ] [545])
        (reg:QI 0 ax)) 43 {*movqi_1} (nil))
DeCSSInputPin.cpp:254: internal compiler error: in
reload_cse_simplify_operands, at postreload.c:396
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.


-- 
   Summary: [4.3 Regression] internal compiler error: in
reload_cse_simplify_operands, at postreload.c:396
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jojelino at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32413

[Bug c++/32413] [4.3 Regression] internal compiler error: in reload_cse_simplify_operands, at postreload.c:396

2007-06-20 Thread jojelino at gmail dot com



--- Comment #1 from jojelino at gmail dot com  2007-06-20 08:39 ---
Created an attachment (id=13740)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13740&action=view)
preprocessed file


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32413

[Bug fortran/32140] [4.3 Regression] Miscompilation with -O1

2007-06-20 Thread jv244 at cam dot ac dot uk



--- Comment #19 from jv244 at cam dot ac dot uk  2007-06-20 08:48 ---
(In reply to comment #17)
> Here is the fix which I am testing, basically instead of creating
> (typeof(array[0] *)array we create array[lb]:

did this fix test OK ? Since it fixes the CP2K issue, I would hope that it
could be posted for review soon.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32140

[Bug fortran/32140] [4.3 Regression] Miscompilation with -O1

2007-06-20 Thread fxcoudert at gcc dot gnu dot org



--- Comment #20 from fxcoudert at gcc dot gnu dot org  2007-06-20 08:52 
---
(In reply to comment #19)
> did this fix test OK ? Since it fixes the CP2K issue, I would hope that it
> could be posted for review soon.

The patch is OK to commit along with a testcase from this PR.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32140

[Bug bootstrap/32024] ICE - libgcc2.c:557: internal compiler error: in fold_checksum_tree, at fold-const.c:12652



--- Comment #11 from ubizjak at gmail dot com  2007-06-20 08:55 ---
backtrace:

(gdb) bt
#0  fancy_abort (file=0x8a02980 "../../gcc-svn/trunk/gcc/fold-const.c",
line=12775, function=0x8a021be "fold_checksum_tree") at
../../gcc-svn/trunk/gcc/diagnostic.c:656
#1  0x08207fc1 in fold_checksum_tree (expr=0xb7f55de4, ctx=0xbfab4df0,
ht=0x92ce1e8) at ../../gcc-svn/trunk/gcc/fold-const.c:12775
#2  0x082077bf in fold_checksum_tree (expr=0xb7f5b138, ctx=0xbfab4df0,
ht=0x92ce1e8) at ../../gcc-svn/trunk/gcc/fold-const.c:12779
#3  0x0823b69e in fold_build1_stat (code=NOP_EXPR, type=0xb7eb5360,
op0=0xb7f5b138) at ../../gcc-svn/trunk/gcc/fold-const.c:12892
#4  0x0823b9f7 in fold_convert (type=0xb7eb5360, arg=0xb7f5b138) at
../../gcc-svn/trunk/gcc/fold-const.c:2281
#5  0x0894307c in chrec_convert_1 (type=0xb7eb5360, chrec=0xb7f5b138,
at_stmt=0xb7f55e00, use_overflow_semantics=1 '\001') at
../../gcc-svn/trunk/gcc/tree-chrec.c:1308
#6  0x0842482b in interpret_rhs_modify_stmt (loop=0xb7f57c38,
at_stmt=0xb7f55e00, opnd1=0xb7f4e1a0, type=0xb7eb5360) at 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32024

[Bug bootstrap/32024] ICE - libgcc2.c:557: internal compiler error: in fold_checksum_tree, at fold-const.c:12652



--- Comment #12 from ubizjak at gmail dot com  2007-06-20 08:59 ---
(In reply to comment #11)
> backtrace:

(gdb) frame 4
#4  0x0823b9f7 in fold_convert (type=0xb7eb5360, arg=0xb7f5b138) at
../../gcc-svn/trunk/gcc/fold-const.c:2281
(gdb) p debug_tree (arg)
 <ssa_name 0xb7f5b138
    type <integer_type 0xb7eb52f4 int sizes-gimplified public SI
        size <integer_cst 0xb7ea6658 constant invariant 32>
        unit size <integer_cst 0xb7ea6444 constant invariant 4>
        align 32 symtab 0 alias set 4 canonical type 0xb7eb52f4 precision 32
        min <integer_cst 0xb7ea6604 -2147483648> max <integer_cst 0xb7ea6620 2147483647>
        pointer_to_this <pointer_type 0xb7ebb798>>
    visited var <var_decl 0xb7f57114 D.1650> def_stmt <gimple_modify_stmt 0xb7f55de4>
    version 3>

So, chrec_convert_1() is sending ssa_name into fold_convert().


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32024

[Bug bootstrap/32024] ICE - libgcc2.c:557: internal compiler error: in fold_checksum_tree, at fold-const.c:12652



--- Comment #13 from ubizjak at gmail dot com  2007-06-20 09:03 ---
svn blame of tree-chrec.c

114057    rakdver   /* If we cannot propagate the cast inside the chrec, just keep the cast.  */
114057    rakdver keep_cast:
100718       spop   res = fold_convert (type, chrec);
 97607  ebotcazou


-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spop at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32024

[Bug target/32414] New: [4.1/4.2 Regression] Poor code for inner loop on i386

/* { dg-do compile } */
/* { dg-options "-O2 -m32 -mtune=generic" } */

typedef unsigned short int uint16_t;
typedef unsigned int uint32_t;

extern int get_src_stride(void);
extern int get_dst_stride(void);

void
foo (uint32_t *pSrc, uint32_t *pDst, uint16_t width, uint16_t height)
{
  uint32_t *dstLine;
  register uint32_t *dst;
  uint32_t *srcLine;
  register uint32_t *src;
  int dstStride, srcStride;
  uint16_t w;

  srcStride = get_src_stride ();
  dstStride = get_dst_stride ();
  dstLine = pDst;
  srcLine = pSrc;

  while (height--)
    {
      dst = dstLine;
      dstLine += dstStride;
      src = srcLine;
      srcLine += srcStride;
      w = width;

      while (w--)
        *dst++ = *src++ | 0xFF000000;
    }
}

generates extremely poor code for the inner loop in 4.1 and 4.2:
.L6:
        movl    -16(%ebp), %eax         # src,
        subw    $1, -34(%ebp)           #, w
        addl    $4, -16(%ebp)           #, src
        movl    (%eax), %ecx            #,
        movl    -20(%ebp), %eax         # dst,
        orl     $-16777216, %ecx        #,
        movl    %ecx, (%eax)            #,
        addl    $4, %eax                #,
        cmpw    $-1, -34(%ebp)          #, w
        movl    %eax, -20(%ebp)         #, dst
        je      .L4                     #,
        jmp     .L6                     #

I believe this has been introduced by the
http://gcc.gnu.org/ml/gcc-patches/2005-07/msg02021.html
patch and fixed by
http://gcc.gnu.org/ml/gcc-patches/2007-01/msg02095.html
on the trunk.  The generated loop isn't perfect on the trunk:
.L4:
        movl    (%ebx), %eax            #* src, tmp82
        addl    $4, %ebx                #, src
        subw    $1, -14(%ebp)           #, w
        orl     $-16777216, %eax        #, tmp82
        movl    %eax, (%edi)            # tmp82,* dst
        addl    $4, %edi                #, dst
        cmpw    $-1, -14(%ebp)          #, w
        je      .L3                     #,
        jmp     .L4                     #
but still far better than what 4.1 and 4.2 generate.

Slightly modified:
typedef unsigned short int uint16_t;
typedef unsigned int uint32_t;

extern int get_src_stride(void);
extern int get_dst_stride(void);

void
foo (uint32_t *pSrc, uint32_t *pDst, uint16_t width, uint16_t height)
{
  uint32_t *dstLine;
  register uint32_t *dst;
  uint32_t *srcLine;
  register uint32_t *src;
  int dstStride, srcStride;
  uint32_t w;

  srcStride = get_src_stride ();
  dstStride = get_dst_stride ();
  dstLine = pDst;
  srcLine = pSrc;

  while (height--)
    {
      dst = dstLine;
      dstLine += dstStride;
      src = srcLine;
      srcLine += srcStride;
      for (w = 0; w < width; w++)
        dst[w] = src[w] | 0xFF000000;
    }
}
generates more compact code:
.L4:
        movl    (%edx,%ecx,4), %eax     #* srcLine, tmp79
        orl     $-16777216, %eax        #, tmp79
        movl    %eax, (%ebx,%ecx,4)     # tmp79,* dstLine
        addl    $1, %ecx                #, w
        cmpl    %esi, %ecx              # width, w
        jae     .L3                     #,
        jmp     .L4                     #


-- 
           Summary: [4.1/4.2 Regression] Poor code for inner loop on i386
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jakub at gcc dot gnu dot org
GCC target triplet: i386-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32414

[Bug middle-end/32285] [4.1 Regression] Miscompilation with pure _Complex returning call inside another fn's argument list



--- Comment #13 from jakub at gcc dot gnu dot org  2007-06-20 09:24 ---
Fixed in SVN; the performance regression caused by the PR25550 patch is still
present, though.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32285

[Bug fortran/32404] Wrong-code with sbdart (valgrind errors, different output)

2007-06-20 Thread fxcoudert at gcc dot gnu dot org



--- Comment #1 from fxcoudert at gcc dot gnu dot org  2007-06-20 09:41 ---
I can't reproduce what you see, nor make any sense of the instructions for
comparing the outputs: when I do what you indicate, all I see in the output
is a namelist-type list of scalar values:

INPUT
 IDATM=  4,
 AMIX= -1.00 ,
 ISAT=  0,
 WLINF= 0.55011920929 ,
 WLSUP= 0.55011920929 ,
 WLINC=  0.00 ,
 SZA=  0.00 ,
 CSZA= -1.00 ,
 SOLFAC=  1.00 ,
 NF=  2,
 IDAY=  0,
 TIME=  16.0 ,
 ALAT= -64.7669982910156 , 
 ALON= -64.0670013427734 , 
 ZPRES= -1.00 ,  
 PBAR= -1.00 ,
 SCLH2O= -1.00 ,  
 UW= -1.00 ,
 UO3= -1.00 ,
 O3TRP= -1.00 ,
 ZTRP=  0.00 , 
 XRSC=  1.00 , 
 XN2= -1.00 ,
 XO2= -1.00 ,
 XCO2= -1.00 ,  
 XCH4= -1.00 ,
 XN2O= -1.00 ,
 XCO= -1.00 , 
 XNO2= -1.00 ,
 XSO2= -1.00 ,
 XNH3= -1.00 ,   
 XNO= -1.00 ,
 XHNO3= -1.00 ,
 XO4=  1.00 ,   
 ISALB=  0,
 ALBCON=  0.00 ,
 SC= 5*3.402823466385289E+038 ,
 ZCLOUD= 5*0.00   ,
 TCLOUD= 5*0.00   ,
 LWP= 5*0.00   ,
 NRE= 5*8.00   ,
 RHCLD= -1.00 ,
 KRHCLR=  0,
 JAER= 5*0  ,
 ZAER= 5*0.00   ,
 TAERST= 5*0.00   ,
 IAER=  0,
 VIS= -1.00 ,
 RHAER= -1.00 ,
 TBAER= -1.00 ,
 WLBAER= 150*-1.00  ,
 QBAER= 150*-1.00  ,
 ABAER=  0.00 ,
 WBAER= 150*-1.00  ,
 GBAER= 150*-1.00  ,
 PMAER= 44850*-1.00  ,
 ZBAER= 65*-1.00  ,
 DBAER= 65*-1.00  ,
 NOTHRM= -1,
 NOSCT=  0,
 KDIST=  3,
 ZGRID1=  1.00 ,
 ZGRID2=  30.0 ,
 NGRID=  0,
 IDB= 20*0  ,
 ZOUT=  0.00 ,  100. ,
 IOUT= 10,
 PRNT= 7*F,
 TEMIS=  0.00 ,
 NSTR=  0,
 NZEN=  0,
 UZEN= 40*-1.00  ,
 VZEN= 40*90.0   ,
 NPHI=  0,
 PHI= 40*-1.00  ,
 SAZA=  180. ,
 IMOMC=  3,
 IMOMA=  3,
 TTEMP= -1.00 ,
 BTEMP= -1.00 ,
 CORINT=F,
 SPOWDER=F,  /

What do you call columns in this output?


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
           Keywords|wrong-code                  |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32404

[Bug tree-optimization/32199] jc1: out of memory allocating 4072 bytes after a total of 805021000 bytes

2007-06-20 Thread ro at gcc dot gnu dot org



--- Comment #3 from ro at gcc dot gnu dot org  2007-06-20 09:44 ---
I observe the same problem (also affecting gnu-xml.lo) on
alpha-dec-osf{4.0f,5.1b}.
VM consumption for org-omg.lo is at 1.5 GB now; a machine with 768 MB physical
memory crawls along for hours compiling the file.  I had to double swap space
from 2 GB to 4 GB to be able to compile at all, while there was no such problem
in gcc 4.2.0 as of 20070506.


-- 

ro at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ro at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32199

[Bug inline-asm/32109] [4.1/4.2/4.3 regression] ICE with inline-asm and class with destructor



--- Comment #5 from jakub at gcc dot gnu dot org  2007-06-20 09:46 ---
Fixed.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32109

[Bug middle-end/31959] [4.3 Regression] ICE in expand_builtin_expect, at builtins.c:5112



--- Comment #5 from jakub at gcc dot gnu dot org  2007-06-20 09:48 ---
Fixed.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31959

[Bug target/32414] [4.1/4.2 Regression] Poor code for inner loop on i386

2007-06-20 Thread rguenth at gcc dot gnu dot org



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org
           Keywords|                            |missed-optimization
   Target Milestone|---                         |4.1.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32414

[Bug bootstrap/12019] check for working C compiler on newlib targets fails due to missing crt0.o

2007-06-20 Thread rask at sygehus dot dk



--- Comment #2 from rask at sygehus dot dk  2007-06-20 11:00 ---
Does it work for you if you apply patch 3 from bug other/32154?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12019

[Bug other/32154] sim-crt0.o/crt0.o isn't found during configure due to missing -L or -B

2007-06-20 Thread bonzini at gnu dot org



--- Comment #10 from bonzini at gnu dot org  2007-06-20 11:13 ---
DJ, do you think this patch is ok?


-- 

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32154

[Bug tree-optimization/31866] [4.3 Regression] ICE with tree check error: expected ssa_name, have var_decl in create_outofssa_var_map