Re: missing optimization - don't compute return value not used?

2007-09-26 Thread Richard Li
Right, page 211 of the C++ standard (2003) explains when copy-ctor and
dtor are allowed to be optimized away. But the two circumstances are
both like this:
A is constructed; A is copy-constructed to B; A is destructed
Here A is a temporary object in some sense, and the standard allows
for directly constructing B.
However, Neal expected the compiler to optimize "A is constructed; A
is destructed" away. I find nowhere in the standard that allows this.
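
A minimal sketch of that distinction follows. The Tracked type, its
counters, and the helper functions are hypothetical names made up for
illustration; none of them come from Neal's code. The sketch assumes only
that the copy constructor and destructor have a visible side effect.

```cpp
// Hypothetical type whose copy constructor and destructor have
// observable side effects (they bump global counters).
struct Tracked {
    static int copies;
    static int destroyed;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    ~Tracked() { ++destroyed; }
};
int Tracked::copies = 0;
int Tracked::destroyed = 0;

// Returns by value from a reference parameter. `t` is not a local
// object of this function, so the copy into the return value is not
// one of the cases where elision is permitted.
Tracked by_value(Tracked& t) { return t; }

// Returns a reference: no new object is created at all.
Tracked& by_reference(Tracked& t) { return t; }

// Calls one of the helpers, discards the result, and reports how many
// copy constructions that call caused.
int copies_when_result_discarded(bool use_value) {
    Tracked t;
    Tracked::copies = 0;
    if (use_value)
        by_value(t);      // a temporary is copy-constructed, then destroyed
    else
        by_reference(t);  // nothing is constructed
    return Tracked::copies;
}
```

Even though the caller ignores the result, the by-value helper still
performs one copy construction and one matching destruction.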


ps, Did you forget to put [EMAIL PROTECTED] in the cc list?

On 9/26/07, Michael Veksler <[EMAIL PROTECTED]> wrote:
> But, according to the C++ standard, the compiler is allowed to optimize
> copy construction away. GCC does that on many occasions. For example try:
> |
> #include <iostream>
> using namespace std;
> struct T   {
> T() { }
> T(const T&) { cout << "!!! copy ctor !!! \n"; }
> };
> T f()  { T t; return t;}
> int main()
> {
>   cout << "No copy\n";
>   T no_copy= f();
>
>   cout << "Expecting copy\n";
>   T copy= no_copy;
> }
> |


Re: [libobjc] Shouldn't "version" take a long?

2007-09-26 Thread Jose Quinteiro
I wrote a simple test program that works just fine on my 64 bit system.
 The problem must lie somewhere in the GNUStep libraries.  Sorry about
the waste of bandwidth.

Thanks,
Jose.

Jose Quinteiro wrote:
> Please forgive me if I'm being dense, I'm very new to Objective-C.
> 
> The problem was that a class in GNUMail (PGPController) implemented a
> method thusly:
> 
> - (NSString *) version
> {
>   return @"v0.9.1";
> }
> 
> That method is declared in the GNUMailBundle protocol.  GNUMail would
> segfault when the pointer returned by that method was accessed in any
> way.  I looked at it in gdb and figured out that the pointer was being
> truncated to 4 bytes (I have an LP64 system.)  Renaming the method made
> the truncation and crash go away.
> 
> Object.m declares a "version" class method that returns an int. NSObject
> in the GNUStep base library also has a "version" class method that
> returns an int.   I don't know what the rules are in Objective-C for
> overloading by return type, but gcc did not complain when it compiled
> PGPController. (The output from gcc 4.1.2 for that class is at the end
> of this message.)
> 
> Incidentally, what are the rules for overloading by return type?  I
> looked around and all I found was this:
> 
> "Although identically named class methods and instance methods are
> represented by the same selector, they can have different argument and
> return types."
> http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/Articles/chapter_4_section_6.html
> 
> 
> Which makes me think that what GNUMail does should be allowed.  I don't
> know where the confusion between pointer and int is coming in.
> 
> Thanks,
> Jose.
> 
> 
> 
> 
> x86_64-pc-linux-gnu-gcc PGPController.m -c \
>   -MMD -MP -DGNUSTEP -DGNUSTEP_BASE_LIBRARY=1
> -DGNU_GUI_LIBRARY=1 -DGNU_RUNTIME=1 -DGNUSTEP_BASE_LIBRARY=1
> -D_REENTRANT -fPIC -g -Wall -DDEBUG -fno-omit-frame-pointer -DGSWARN
> -DGSDIAGNOSE -Wno-import -march=nocona -pipe -fno-strict-aliasing
> -fexceptions -fobjc-exceptions -D_NATIVE_OBJC_EXCEPTIONS -fgnu-runtime
> -Wall -Wno-import -fconstant-string-class=NSConstantString
> -I../../Framework/GNUMail -I.
> -I/var/tmp/portage/gnustep-apps/gnumail-1.2.0_pre3/temp/GNUstep/Library/Headers
> -I/usr/GNUstep/Local/Library/Headers
> -I/usr/GNUstep/System/Library/Headers \
>-o obj/PGPController.o
> PGPController.m:743: warning: incomplete implementation of class
> 'PGPController'
> PGPController.m:743: warning: method definition for
> '-bodyWasDecoded:forMessage:' not found
> PGPController.m:743: warning: method definition for
> '-bodyWillBeDecoded:forMessage:' not found
> PGPController.m:743: warning: method definition for
> '-bodyWasEncoded:forMessage:' not found
> PGPController.m:743: warning: method definition for
> '-bodyWillBeEncoded:forMessage:' not found
> PGPController.m:743: warning: method definition for
> '-composeViewAccessoryWillBeRemovedFromSuperview:' not found
> PGPController.m:743: warning: class 'PGPController' does not fully
> implement the 'GNUMailBundle' protocol
> 
> 
> Andrew Pinski wrote:
>> On 9/26/07, Jose Quinteiro <[EMAIL PROTECTED]> wrote:
>>> Hello,
>>>
>>> The getter/setter for version in Object.M gets/takes an int, and they
>>> eventually get/set the "version" field in struct objc_class.   This
>>> field is declared as a long in objc/objc.h.
>>
>> Why?  Any change here will change the ABI, so it is the incorrect thing
>> to do.  So the GNUMail issue seems like people are ignoring warnings that
>> GCC generates about methods that are not declared in the class.
>>
>> -- Pinski
#import <objc/Object.h>
#import <objc/NXConstStr.h>
#import <stdio.h>

@interface Version: Object

- (NXConstantString *) version;
- (NXConstantString *) myVersion;
+ (id) singleInstance;

@end

static Version *s_instance = nil;

@implementation Version;
- (NXConstantString *) version
{
  return @"BOOM";
}

- (NXConstantString *) myVersion
{
  return @"No worries";
}

+ (id) singleInstance
{
  if (!s_instance)
{
  s_instance = [[Version alloc] init];
}

  return s_instance;
}   


@end

int main (int argc, char *argv[])
{
  Version *l_version = [Version singleInstance];
  
  printf("%s\n", [[l_version myVersion] cString]);
  printf("%s\n", [[l_version version] cString]);

  return 0;
}


Re: [libobjc] Shouldn't "version" take a long?

2007-09-26 Thread Jose Quinteiro

Please forgive me if I'm being dense, I'm very new to Objective-C.

The problem was that a class in GNUMail (PGPController) implemented a 
method thusly:


- (NSString *) version
{
  return @"v0.9.1";
}

That method is declared in the GNUMailBundle protocol.  GNUMail would 
segfault when the pointer returned by that method was accessed in any 
way.  I looked at it in gdb and figured out that the pointer was being 
truncated to 4 bytes (I have an LP64 system.)  Renaming the method made 
the truncation and crash go away.
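
To illustrate what a truncation to 4 bytes does to a 64-bit pointer, here
is a small sketch. It is written in C++ rather than Objective-C, the
function names are hypothetical, and the address values are made up. It
only models the arithmetic; it does not claim to reproduce how the
compiler or runtime arrived at the truncation.

```cpp
#include <cstdint>

// Keeps only the low 32 bits of a 64-bit address-sized value, which is
// the effect of squeezing a pointer through a 4-byte `int` on an LP64
// system. (Sign extension on widening back is ignored in this model.)
std::uint64_t truncate_to_int_width(std::uint64_t address) {
    return static_cast<std::uint32_t>(address);
}

// True if the address still refers to the same location afterwards,
// i.e. if its upper 32 bits were zero to begin with.
bool survives_truncation(std::uint64_t address) {
    return truncate_to_int_width(address) == address;
}
```

An address that happens to fit in 32 bits passes through unchanged, which
may be why such a bug can go unnoticed on some systems and crash on
others.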


Object.m declares a "version" class method that returns an int. 
NSObject in the GNUStep base library also has a "version" class method 
that returns an int.   I don't know what the rules are in Objective-C 
for overloading by return type, but gcc did not complain when it 
compiled PGPController. (The output from gcc 4.1.2 for that class is at 
the end of this message.)


Incidentally, what are the rules for overloading by return type?  I 
looked around and all I found was this:


"Although identically named class methods and instance methods are 
represented by the same selector, they can have different argument and 
return types."

http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/Articles/chapter_4_section_6.html

Which makes me think that what GNUMail does should be allowed.  I don't 
know where the confusion between pointer and int is coming in.


Thanks,
Jose.




x86_64-pc-linux-gnu-gcc PGPController.m -c \
  -MMD -MP -DGNUSTEP -DGNUSTEP_BASE_LIBRARY=1 
-DGNU_GUI_LIBRARY=1 -DGNU_RUNTIME=1 -DGNUSTEP_BASE_LIBRARY=1 
-D_REENTRANT -fPIC -g -Wall -DDEBUG -fno-omit-frame-pointer -DGSWARN 
-DGSDIAGNOSE -Wno-import -march=nocona -pipe -fno-strict-aliasing 
-fexceptions -fobjc-exceptions -D_NATIVE_OBJC_EXCEPTIONS -fgnu-runtime 
-Wall -Wno-import -fconstant-string-class=NSConstantString 
-I../../Framework/GNUMail -I. 
-I/var/tmp/portage/gnustep-apps/gnumail-1.2.0_pre3/temp/GNUstep/Library/Headers 
-I/usr/GNUstep/Local/Library/Headers -I/usr/GNUstep/System/Library/Headers \

   -o obj/PGPController.o
PGPController.m:743: warning: incomplete implementation of class 
'PGPController'
PGPController.m:743: warning: method definition for 
'-bodyWasDecoded:forMessage:' not found
PGPController.m:743: warning: method definition for 
'-bodyWillBeDecoded:forMessage:' not found
PGPController.m:743: warning: method definition for 
'-bodyWasEncoded:forMessage:' not found
PGPController.m:743: warning: method definition for 
'-bodyWillBeEncoded:forMessage:' not found
PGPController.m:743: warning: method definition for 
'-composeViewAccessoryWillBeRemovedFromSuperview:' not found
PGPController.m:743: warning: class 'PGPController' does not fully 
implement the 'GNUMailBundle' protocol



Andrew Pinski wrote:
> On 9/26/07, Jose Quinteiro <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> The getter/setter for version in Object.M gets/takes an int, and they
>> eventually get/set the "version" field in struct objc_class.   This
>> field is declared as a long in objc/objc.h.
>
> Why?  Any change here will change the ABI, so it is the incorrect thing
> to do.  So the GNUMail issue seems like people are ignoring warnings that
> GCC generates about methods that are not declared in the class.
>
> -- Pinski


Re: deadline extension for debug info project into GCC 4.3 stage3?

2007-09-26 Thread Alexandre Oliva
On Sep 11, 2007, Mark Mitchell <[EMAIL PROTECTED]> wrote:

> That's a possibility, but I don't want to commit at this point.  We can
> have a look at it when you submit it and decide.  However, in general,
> introducing churn for the sake of a feature that will be off by default
> isn't something that I would want to do.  The more compartmentalized you
> make it, the better your chances are.

It's nearly impossible to make the patch compartmentalized, but pretty
much all of the changes would be clearly disabled and no-ops unless
the flag was given in the command line.

That said, I found out there's still a long way to go before this is
actually a no-op in terms of generated code (other than debug info,
that is), as far as testsuite results, target libraries and other
ports are concerned, so I'm thinking this is very unlikely to make 4.3
:-(

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}


Re: [libobjc] Shouldn't "version" take a long?

2007-09-26 Thread Andrew Pinski
On 9/26/07, Jose Quinteiro <[EMAIL PROTECTED]> wrote:
> Hello,
>
> The getter/setter for version in Object.M gets/takes an int, and they
> eventually get/set the "version" field in struct objc_class.   This
> field is declared as a long in objc/objc.h.

Why?  Any change here will change the ABI, so it is the incorrect thing
to do.  So the GNUMail issue seems like people are ignoring warnings that
GCC generates about methods that are not declared in the class.

-- Pinski


Re: Require help with the backend in gcc

2007-09-26 Thread V. Karthik Kumar
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Thank you for a quick reply! :)

I want to do dynamic CPU dispatching code (which could also benefit
autovectorization). I'm checking out the trunk now.

I've already implemented a userland library to do CPU detection, and
the initialization hook goes into expand_main; I'm looking forward to
using this framework.

Thank you once again!

Karthik

"V. Karthik Kumar" <[EMAIL PROTECTED]> writes:

> I require help with some work i've been doing lately on gcc (4.2.1
> tree). I have managed to put some code in.
>
> Now, for function compilation, i require to invoke the i386 (as of
> now) backend multiple times and with different switches; one time with
> 3dnow!, another with mmx, another with sse2. And floating point code
> will require usage of both 387 and sse.
>
> It is intended that multiple archs are passed as part of the -march
> option. (-m3dnow,sse2,sse3  or -march=k6,nocona or maybe
> -march=sse2,sse3)
>
> Eventually, some of this code may also get used in the PowerPC backend
> (one time with AltiVec and one time without)..
>
> it is intended that multiple versions of functions would go into the
> assembly file; The global data segment would be created only once,
> aligned to targeted instruction sets' requirement for efficient
> accesses (if mmx and sse, alignment might be 8 bytes than 4).
>
> I have started working, but not sure which parts I should be touching
> by now.
>
> Please give me your suggestions. Any help in this regard would be
> greatly appreciated.

The first thing you need to do is move to mainline, and take advantage
of the patch by Sandra Loosemore, David Ung, and Nigel Stephens to add
TARGET_SET_CURRENT_FUNCTION.  That is the framework you'll need to
change the architecture on a per-function basis.

Then you'll need to figure out how to clone the function the way the
inliner does, and presumably give each copy different names or
something--you didn't mention how you plan to handle that issue.

Then you can wind up compiling each function with different
parameters.

This is a nontrivial exercise, but an interesting one.  Good luck.

Ian

- --
- -BEGIN PGP PUBLIC KEY BLOCK-
Version: GnuPG v1.4.7 (GNU/Linux)

mQGiBEa0y88RBACpSIuwbUvraagYtkWKMlwe+KI6Sh2UU2vipE8Fotkrq/iTnRiK
pu2dJcP+jTNvbatcLGedWQOHiCvGfadZD/SxmYsJpQXazL/CORGvdzZwq4eBsDVV
94E/pibIT6ouaOFVMsvARPOyk+Q6N8T/tsvtCxFYrx/NnUIoMdb1DCXEZwCgs90U
9xQExo7OfJYyafTYLyXSzbsD/jqNhMJwnNsT+/GOqDeod98s54IImpgVA/bGyOQi
ek+l2SGlrZ6LmZzGO/zVRqsPISAm7Wa5xbVe6qL+hUr1XIFOQoj+08yOCYPDrPoh
m4QtFQHKlr5E0u6ev188wI6uIyz6jpzt6C/Aq3Q4irCj3Graeg9xGnHgsjMujubR
WebABACgJzTS2mfEu5Rb75+KlgGgnA8zkTpf/Qqdwk/eo1WZPbcIijROEP4MNhVS
IWacQXt4Ng8aWviFTZvysAc4k4hxnmFJgyRcUOSOmYd3uWkQI0OV1+cS5FoXmiQ2
Oucsw4iBC3VHqQmNhtuCNZ4Nx1v0kexqfBQCRBSB3HGXGBKjQ7REVi4gS2FydGhp
ayBLdW1hciAoaHR0cDovL2d1aWx0LmJhZnNvZnQubmV0KSA8a2FydGhpa2t1bWFy
QGdtYWlsLmNvbT6IZgQTEQIAJgUCRrTLzwIbAwUJAeEzgAYLCQgHAwIEFQIIAwQW
AgMBAh4BAheAAAoJEEc052Xw3SBP+dkAoI5xfNw/7M7OVpmquFAwRb0k9KbYAJ0e
IOypL+F8bUsxqISUIw3GFeb60LkEDQRGtMvPEBAA4SptM/eorjFWmC1S7xBfvKMF
UMyFQvkwiWtDsWIrD0AMU4acT7fjYlMEKmVsaymXppxyvK6e/4jOX72UcsJZ4LL/
jtm4SGfknC6yEXdeyYz5Mmd6CN52LC/KfS4b771zO9yMDAl79/FxHIR7AvoSWb14
sbc7yKiF7OwfEFeZNtOYsZwDsQabnuFd5mzIMev/W2hgs55DF4ZJnmaVYb/PQbbw
X5g7OwsN17OESPF/syaCzqKJ0GuzhnGHYgwY/84eeWkzqnGTxG52HH6Y1sYwKEmJ
32XLkUEHxHKoCvMW8C6E/s72Aw/WrBzq2yHhqW5npBrCIBCYWC70wzkew2DaOG/j
WYtRP2ahJKxV9598D86w97M+6kNX2efMdSgJyLlFyyXlqX95sobE8BJRxjXqkiJy
uaLgXv3CQZ1+kizhnkZeInA85NNahb3f2j/jA03eVoAhRq64fqN8W1kfvQwv
YF31G7dsLLI2gx2ui1ouj6phIIlZPzypoDkoYZiXn3qXMDiyxJb+4wT5MZz2hjTU
w51Z5WPe2ylPXKPqmiDw6zMNQW8OWVIXFljxLcRAhY9DQC9MIgnw7wCz9Bdu8sUs
kkZjSsLo6Mc1SPCwjcuD8bDuvc7JIugNn/QFrLtV0o/BVpxMX0ujm2gC8/y7ruBJ
cGPvx99e7lj7cmgac7cAAwUP/2h7MDCA3o1Bt1mInBlC+LHdJIaipToVc72lF8nN
H3InjMppUkgvHQ+D/4r5hcWtskkRY+YG1iG45RbWMQlprfONOWEYfjkc/WDRj+PO
lFszhcOSc2IlgCYsY1yEIF6HfE2MZpFWjM0z0hjotEULxlvi9meMV0OZRqwDdhEp
871jk1+3WkdjGMcZI3AO3wGRwb60eYW0cVNMv1umH0Cgh2pgU/vTbCqB7P5DaNHf
BxflFAWumm7P70qJMoCa9SRNKh7vitlLBLGnSuhgT22aE/N/zslcprS7tFM3JFAl
Jvr9V3pXzMmkk4zGwzpfvA8LpCPNVqqABrkGsduTsTyoPjLDmPH/CuFMu6RZCUHL
sSKKhTpbE3zTgmyGja8DiJFKWmtojFPDEnPDSQweJYItkfnGSbHQVx5wkKhjABQ6
bCraNqgem0C+tKnDoRk/NlhKBCpGVdt8kIRNZ+iTA+4VB+R1usUY3ZpvrHYZFDX5
RxJ4jYLnhlKspSYvKkLg4IP7KnGr9dC16XJCa2wqR68EJa0u5XxigV4zscaawGYA
Mx56+PoouaWI24+9JUPTMkV3UvF5xU2BumOW/IsKqs2qYEkG3QdczVwTnNuAZFQa
1WJAKOT7elDsrYsrdGWZpge2d/uoIFKjDobz7eZnkSLiX2rIzbkDniDD+aZd8G60
NVJliE8EGBECAA8FAka0y88CGwwFCQHhM4AACgkQRzTnZfDdIE91ngCgpgLiwwXQ
MbyOCWjuWGY+phmYeagAnj7nMffLNWLpfVmKtA4yrtOHkSAM
=RuU8
- -END PGP PUBLIC KEY BLOCK-
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG+qbnRzTnZfDdIE8RAttoAJ43zPb6TvQ32exV0O2oIER1E7jLegCfWcIW
1WHocs+ml0VSyx/q2g8PmMU=
=wBtI
-END PGP SIGNATURE-



[libobjc] Shouldn't "version" take a long?

2007-09-26 Thread Jose Quinteiro

Hello,

The getter/setter for version in Object.M gets/takes an int, and they 
eventually get/set the "version" field in struct objc_class.   This 
field is declared as a long in objc/objc.h.


I'm asking because I think this was causing a crash in GNUMail on 64-bit 
systems.  More detail:


https://bugs.gentoo.org/show_bug.cgi?id=193806

Thanks,
Jose.


Re: [Bug target/33479] SyncTest Intermittent failing on MIPS

2007-09-26 Thread David Daney

Boehm, Hans wrote:

> David -
>
> If I understand this correctly, you added a sync instruction to most of
> the __sync implementations on MIPS,


Correct.  These primitives are new in GCC-4.3 for MIPS.  My first 
attempt was not entirely successful.  I hope the revised version yields 
better results.



> largely because we don't understand
> what the memory ordering rules actually are?


It is not purely due to laziness on my part, although I don't discount 
this totally.  There are so many different MIPS implementations with 
differing behavior (some lacking sequential memory ordering) that the 
only safe thing to do in the general case is to assume that 'sync' is 
required as a memory barrier *and* after the atomic ll/sc loops.
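
For readers unfamiliar with the primitives under discussion, here is a
minimal sketch of how the __sync builtins are called from user code. The
wrapper function names are hypothetical. The builtins themselves are the
GCC ones; how they expand (for example an ll/sc loop plus 'sync' on MIPS)
is target dependent and not shown here.

```cpp
// Atomically adds `delta` to `*counter` and returns the value it held
// before the addition. GCC documents this builtin as a full barrier.
int fetch_and_add(int* counter, int delta) {
    return __sync_fetch_and_add(counter, delta);
}

// Atomically replaces `*slot` with `desired` only if it currently holds
// `expected`. Returns true when the swap happened.
bool compare_and_swap(int* slot, int expected, int desired) {
    return __sync_bool_compare_and_swap(slot, expected, desired);
}

// Issues a stand-alone full memory barrier.
void full_barrier() {
    __sync_synchronize();
}
```

The cost question raised in this thread is about what the compiler has to
emit for each of these calls on a given MIPS implementation, not about the
source-level interface.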




> I would guess this
> involves a significant performance hit?  (On other architectures, fences
> tend to be between 10 and 200 cycles, best case.  I haven't checked on
> MIPS.)


Again, it depends on the MIPS implementation.



> This may well be the best we can do, but it doesn't strike me as good.
> It's also a problem for the C++ memory model work.


If it ends up being anything like the Java memory model, the MIPS 
implementation will likely need to emit 'sync' instructions with 
accesses to volatile variables.





> If my understanding is correct, and you don't mind, I would like to
> relay this to the MIPS architects.  It might motivate them to document
> some things sooner rather than later.



That is fine with me.



> Was this checked in?



It was, thus the commit message in the PR.

David Daney





Re: Require help with the backend in gcc

2007-09-26 Thread Ian Lance Taylor
"V. Karthik Kumar" <[EMAIL PROTECTED]> writes:

> I require help with some work i've been doing lately on gcc (4.2.1
> tree). I have managed to put some code in.
> 
> Now, for function compilation, i require to invoke the i386 (as of
> now) backend multiple times and with different switches; one time with
> 3dnow!, another with mmx, another with sse2. And floating point code
> will require usage of both 387 and sse.
> 
> It is intended that multiple archs are passed as part of the -march
> option. (-m3dnow,sse2,sse3  or -march=k6,nocona or maybe
> -march=sse2,sse3)
> 
> Eventually, some of this code may also get used in the PowerPC backend
> (one time with AltiVec and one time without)..
> 
> it is intended that multiple versions of functions would go into the
> assembly file; The global data segment would be created only once,
> aligned to targeted instruction sets' requirement for efficient
> accesses (if mmx and sse, alignment might be 8 bytes than 4).
> 
> I have started working, but not sure which parts I should be touching
> by now.
> 
> Please give me your suggestions. Any help in this regard would be
> greatly appreciated.

The first thing you need to do is move to mainline, and take advantage
of the patch by Sandra Loosemore, David Ung, and Nigel Stephens to add
TARGET_SET_CURRENT_FUNCTION.  That is the framework you'll need to
change the architecture on a per-function basis.

Then you'll need to figure out how to clone the function the way the
inliner does, and presumably give each copy different names or
something--you didn't mention how you plan to handle that issue.

Then you can wind up compiling each function with different
parameters.
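
The end result of those steps could look roughly like the following
sketch at the source level. Everything here is hypothetical: the
CpuFeature enum, the function names, and the idea that the caller passes
in the detected feature. Both clones are plain C++ in this sketch; in the
real project each would come from the same source compiled with different
target flags.

```cpp
#include <cstddef>

// Hypothetical result of a run-time CPU detection step.
enum CpuFeature { kGeneric, kSse2 };

typedef long (*sum_fn)(const int*, std::size_t);

// Baseline clone.
long sum_generic(const int* v, std::size_t n) {
    long s = 0;
    for (std::size_t i = 0; i < n; ++i) s += v[i];
    return s;
}

// Stand-in for a clone built for a different instruction set. It is
// merely unrolled by two here so that it differs from the baseline.
long sum_sse2(const int* v, std::size_t n) {
    long s0 = 0, s1 = 0;
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) { s0 += v[i]; s1 += v[i + 1]; }
    if (i < n) s0 += v[i];
    return s0 + s1;
}

// Picks the clone matching the detected feature. A real dispatcher
// would do this once at startup and cache the pointer.
sum_fn select_sum(CpuFeature f) {
    return f == kSse2 ? sum_sse2 : sum_generic;
}
```

Giving each clone its own name, as above, is one possible answer to the
naming question; it keeps the dispatcher an ordinary function-pointer
selection.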

This is a nontrivial exercise, but an interesting one.  Good luck.

Ian


Re: support single predicate set instructions in GCC-4.1.1

2007-09-26 Thread Jim Wilson
On Wed, 2007-09-26 at 23:35 +0800, 吴曦 wrote:
> Thanks, it's the problem of pass_stack_adjustments.

pass_stack_adjustments isn't in gcc-4.2.x; it is only on mainline.  But
the flow stuff you are using isn't on mainline anymore since the
dataflow merge.  Maybe you are using a month or two old snapshot of
mainline?  This will limit the help I can provide, since I only have
copies of mainline and gcc-4.2.x to look at, neither of which matches
what you are working with.
-- 
Jim Wilson, GNU Tools Support, http://www.specifix.com




Require help with the backend in gcc

2007-09-26 Thread V. Karthik Kumar
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

I require help with some work i've been doing lately on gcc (4.2.1
tree). I have managed to put some code in.

Now, for function compilation, i require to invoke the i386 (as of
now) backend multiple times and with different switches; one time with
3dnow!, another with mmx, another with sse2. And floating point code
will require usage of both 387 and sse.

It is intended that multiple archs are passed as part of the -march
option. (-m3dnow,sse2,sse3  or -march=k6,nocona or maybe
-march=sse2,sse3)

Eventually, some of this code may also get used in the PowerPC backend
(one time with AltiVec and one time without)..

it is intended that multiple versions of functions would go into the
assembly file; The global data segment would be created only once,
aligned to targeted instruction sets' requirement for efficient
accesses (if mmx and sse, alignment might be 8 bytes than 4).

I have started working, but not sure which parts I should be touching
by now.

Please give me your suggestions. Any help in this regard would be
greatly appreciated.

Regards,
Karthik

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG+pp+RzTnZfDdIE8RArIGAJ9XtceYSNA7hZU8GUjkJkXviXqsJQCgnBYS
5Y+CCpTarGE8ojPiNCb4PEM=
=HpQ2
-END PGP SIGNATURE-



Re: Poor pow() / floating point performance of on x86_64

2007-09-26 Thread Tim Prince

Richard Guenther wrote:
> On 9/26/07, Ralf Lübben <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> maybe this is the better list to post the problem (see below).
>
> This is off-topic here, gcc-help would be a more appropriate list.

True, but it appears to be a glibc problem, rather than one which can be
dealt with in gcc.


Re: Poor pow() / floating point performance of on x86_64

2007-09-26 Thread Richard Guenther
On 9/26/07, Ralf Lübben <[EMAIL PROTECTED]> wrote:
> Hello,
>
> maybe this is the better list to post the problem (see below).

This is off-topic here, gcc-help would be a more appropriate list.

> Regards
> Ralf
>
> On Wednesday, 26. September 2007 18:23:34 Ralf Lübben wrote:
> > Ok,
> >
> > the problem seems to be the pow() function. If I use instead the function
> > gsl_pow_int(double x, int n) from the gsl library the performance on the
> > x86_64 machine is much faster.
> > I call the pow function with the following values:
> >
> > pow(5.0,-3.0);
> > pow(10.0,-3.0);
> > pow(15.0,-3.0);
> > pow(20.0,-3.0);

pow and gsl_pow_int don't compute the same thing.  Use -ffast-math and gcc
will do equivalent stuff.
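
To show the kind of difference meant here, below is a sketch of an
integer-exponent power routine next to the general pow(). The name
pow_int and the helper difference_from_pow are hypothetical; this is the
textbook square-and-multiply method, not a copy of the gsl source.

```cpp
#include <cmath>

// Raises x to an integer power using repeated squaring. Only
// multiplications (and one division for negative exponents) are used,
// so the result may differ from pow() in the last few bits.
double pow_int(double x, int n) {
    long long e = n;            // widened so that -INT_MIN is representable
    if (e < 0) {
        x = 1.0 / x;
        e = -e;
    }
    double result = 1.0;
    while (e > 0) {
        if (e & 1) result *= x; // include this power of two
        x *= x;
        e >>= 1;
    }
    return result;
}

// Absolute difference between the integer routine and the general one.
double difference_from_pow(double x, int n) {
    return std::fabs(pow_int(x, n) - std::pow(x, static_cast<double>(n)));
}
```

The general pow() has to handle arbitrary real exponents and special
cases, which is a plausible reason for it to be slower than a short
multiplication loop.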

Richard.


Re: Poor pow() / floating point performance of on x86_64

2007-09-26 Thread Ralf Lübben
Hello,

maybe this is the better list to post the problem (see below).

Regards
Ralf

On Wednesday, 26. September 2007 18:23:34 Ralf Lübben wrote:
> Ok,
>
> the problem seems to be the pow() function. If I use instead the function
> gsl_pow_int(double x, int n) from the gsl library the performance on the
> x86_64 machine is much faster.
> I call the pow function with the following values:
>
> pow(5.0,-3.0);
> pow(10.0,-3.0);
> pow(15.0,-3.0);
> pow(20.0,-3.0);
>
> The problem also occurs with gcc 4.2.1, but not with the x86 Ubuntu Feisty
> Fawn distribution on the x86_64 machine.
> Sorry for this misinformation before.
>
> Ralf
>
> On Wednesday, 26. September 2007 16:35:50 Ralf Lübben wrote:
> > Hi,
> >
> > I just have tried two other setups on the x86_64 machine:
> >
> > 1. Ubuntu Feisty Fawn (gcc 4.1.2) server x86:
> > - Expected performance: about two times faster than on my notebook
> >
> > 2. Ubuntu Gutsy Gibbon (gcc 4.2.1) server x86:
> > - nearly the same performance as "Ubuntu Feisty Fawn (gcc 4.1.2) server
> > x86"
> > - Expected performance: about two times faster than on my notebook
> >
> > Was there a change from gcc 4.1.2 to gcc 4.2.1 which could explain that?
> > Or is there anything else which could explain that?
> >
> > Ralf
> >
> > On Wednesday, 26. September 2007 10:35:20 Ralf Lübben wrote:
> > > Hello,
> > >
> > > in the last days I ran a simulation on a x86_64 architecture:
> > > ###
> > > processor   : 0
> > > vendor_id   : GenuineIntel
> > > cpu family  : 15
> > > model   : 6
> > > model name  :Genuine Intel(R) CPU 3.20GHz
> > > stepping: 8
> > > cpu MHz : 3192.081
> > > cache size  : 8192 KB
> > > physical id : 0
> > > siblings: 2
> > > core id : 0
> > > cpu cores   : 2
> > > fpu : yes
> > > fpu_exception   : yes
> > > cpuid level : 6
> > > wp  : yes
> > > flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> > > mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> > > nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cid cx16 xtpr lahf_lm
> > > bogomips: 6390.34
> > > clflush size: 64
> > > cache_alignment : 128
> > > address sizes   : 40 bits physical, 48 bits virtual
> > > power management:
> > > #
> > >
> > > with very poor performance.
> > >
> > > I ran the same simulations on my notebook:
> > >
> > > ##
> > > processor   : 0
> > > vendor_id   : AuthenticAMD
> > > cpu family  : 6
> > > model   : 8
> > > model name  : mobile AMD Athlon(tm) XP 2000+
> > > stepping: 1
> > > cpu MHz : 797.820
> > > cache size  : 256 KB
> > > fdiv_bug: no
> > > hlt_bug : no
> > > f00f_bug: no
> > > coma_bug: no
> > > fpu : yes
> > > fpu_exception   : yes
> > > cpuid level : 1
> > > wp  : yes
> > > flags   : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca
> > > cmov pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow ts fid vid
> > > bogomips: 1596.37
> > > clflush size: 32
> > > ###
> > >
> > > The same simulation is about 10 times faster on my notebook.
> > > The simulation was compiled with "-O3 -ffast-math", without
> > > "-ffast-math" the performance of the x86_64 architecture is much worse.
> > > I used gcc 4.1.2 on Ubuntu, the simulator is Omnet++.
> > >
> > > There was already a post about the topic:
> > > http://gcc.gnu.org/ml/gcc-help/2006-05/msg00185.html
> > > on AMD machines.
> > >
> > > I could also figure out that one problem is the pow() function; maybe
> > > there are more functions with poor performance on x86_64 machines.
> > >
> > > Has anyone an idea about the reasons or how to improve the performance
> > > on x86_64 machines?
> > >
> > > Thanks.
> > >
> > > Regards,
> > > Ralf




$RANLIB not passed to libiberty

2007-09-26 Thread Rask Ingemann Lambertsen
   I'm having a look at building GCC with OpenWatcom to reduce build
times. There seems to be something wrong with the build machinery:

$ diff -u {,build-i686-pc-linux-gnuaout/}libiberty/Makefile
--- libiberty/Makefile  2007-09-26 17:02:58.0 +0200
+++ build-i686-pc-linux-gnuaout/libiberty/Makefile  2007-09-26 17:04:00.0 +0200
@@ -56,7 +56,7 @@
 CC = owcc -O2 -mtune=i686
 CFLAGS = -g
 LIBCFLAGS = $(CFLAGS)
-RANLIB = owranlib
+RANLIB = ranlib
 MAKEINFO = makeinfo --split-size=500
 PERL = perl

The use of the wrong ranlib causes the build of fixincludes to fail. The
fixincludes directories don't have that problem.

My configure command was this:
CC='owcc -O2 -mtune=i686' RANLIB=owranlib ~/cvssrc/ia16-gcc/configure
--target=ia16-unknown-elf --enable-languages=c --with-newlib
--enable-checking=yes,rtl --prefix=$HOME/openwatcom --disable-shared
--with-{gmp,mpfr}=$HOME/openwatcom

-- 
Rask Ingemann Lambertsen
Danish law requires addresses in e-mail to be logged and stored for a year


Re: support single predicate set instructions in GCC-4.1.1

2007-09-26 Thread 吴曦
2007/9/26, Jim Wilson <[EMAIL PROTECTED]>:
> On Tue, 2007-09-25 at 15:13 +0800, 吴曦 wrote:
> > propagate_one_insn), I don't understand why GCC fails the computation
> > of liveness if there is no optimization flag :-(.
>
> There is probably something else happening with -O that is recomputing
> some liveness or CFG info.  For instance, the flow2 pass will call
> split_all_insns and cleanup_cfg, but only with -O.  You could try
> selectively disabling other optimization passes to determine which one
> is necessary in order for your code to work.  Actually, looking closer,
> I see several of them call update_life_info.  regrename for instance has
> two update_life_info calls.
>
> Another possibility here is to try calling recompute_reg_usage instead
> of doing it yourself.  Or maybe calling just update_life_info directly,
> if you need different flags set.
>
> FYI This stuff is all different on mainline since the dataflow merge.
> I'm assuming you are using gcc-4.2.x.
> --
> Jim Wilson, GNU Tools Support, http://www.specifix.com
>
>
>
Thanks, it's the problem of pass_stack_adjustments.


Re: missing optimization - don't compute return value not used?

2007-09-26 Thread Richard Li
In version 1, the return type is "a_t", so a class construction is
required there (the caller will then destruct the returned object).
Construction and destruction can have side effects, so the compiler
would not drop them. For the following code,

template <typename a_t, typename b_t>
inline a_t& append (a_t & a, b_t const& b) {
 a.insert (a.end(), b.begin(), b.end());
 return a;
}

it does not require a construction, and would be as fast as version 2.
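
A rough way to see the cost difference is to count element copies. The
sketch below assumes a hypothetical Elem type with a counting copy
constructor and hypothetical names append_copy / append_ref for the
by-value and by-reference variants; the list sizes are arbitrary.

```cpp
#include <list>

// Hypothetical element type that counts how often it is copied.
struct Elem {
    static long copies;
    int value;
    explicit Elem(int v = 0) : value(v) {}
    Elem(const Elem& o) : value(o.value) { ++copies; }
};
long Elem::copies = 0;

// Returns the container by value (like version 1).
template <typename a_t, typename b_t>
a_t append_copy(a_t& a, b_t const& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;   // copies every element of `a` into the return value
}

// Returns a reference to the container.
template <typename a_t, typename b_t>
a_t& append_ref(a_t& a, b_t const& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;   // no container is constructed
}

// Appends a 2-element list to a 3-element list, ignores the result,
// and reports how many element copies the call performed.
long element_copies(bool by_value) {
    std::list<Elem> a, b;
    for (int i = 0; i < 3; ++i) a.push_back(Elem(i));
    for (int i = 0; i < 2; ++i) b.push_back(Elem(i));
    Elem::copies = 0;   // ignore copies made during setup
    if (by_value)
        append_copy(a, b);
    else
        append_ref(a, b);
    return Elem::copies;
}
```

With these sizes the reference version copies only the 2 inserted
elements, while the by-value version additionally copies all 5 elements
of the resulting list, and that extra work grows with the size of a.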

On 9/26/07, Neal Becker <[EMAIL PROTECTED]> wrote:
> gcc version 4.1.2 20070502 (Red Hat 4.1.2-12)
> I noticed the following code
>
> === version 1:
> template <typename a_t, typename b_t>
> inline a_t append (a_t & a, b_t const& b) {
>   a.insert (a.end(), b.begin(), b.end());
>   return a;
> }
>
> === version 2:
> template <typename a_t, typename b_t>
> inline void append (a_t & a, b_t const& b) {
>   a.insert (a.end(), b.begin(), b.end());
> }
>
> When instantiated for a_t, b_t std::list.  When called by code that _did
> not use the return value_, I had assumed that since the returned value is
> not used, the 2 versions would be equivalent.  Instead, (compiling
> with -O3), version 2 runs very fast, but version 1 is extremely slow.  Is
> it really necessary to construct the returned value even when it is seen
> that it is not used?
>
>


missing optimization - don't compute return value not used?

2007-09-26 Thread Neal Becker
gcc version 4.1.2 20070502 (Red Hat 4.1.2-12)
I noticed the following code

=== version 1:
template <typename a_t, typename b_t>
inline a_t append (a_t & a, b_t const& b) {
  a.insert (a.end(), b.begin(), b.end());
  return a;
}

=== version 2:
template <typename a_t, typename b_t>
inline void append (a_t & a, b_t const& b) {
  a.insert (a.end(), b.begin(), b.end());
}

When instantiated for a_t, b_t std::list.  When called by code that _did
not use the return value_, I had assumed that since the returned value is
not used, the 2 versions would be equivalent.  Instead, (compiling
with -O3), version 2 runs very fast, but version 1 is extremely slow.  Is
it really necessary to construct the returned value even when it is seen
that it is not used?