Re: Issue with LTO/-fwhole-program

2010-06-14 Thread David Brown

On 14/06/2010 06:43, Ian Lance Taylor wrote:
> David Brown  writes:
>
>> After doing a bit more reading and thinking, it seems to me that
>> -fwhole-program will be used in most cases where LTO is used.  You use
>> -flto when compiling each source file, then link them with gcc with
>> -flto and -fwhole-program.  Except in the case of libraries or other
>> files which need external symbols, you will want that combination to
>> generate optimal code.  So if this combination alone, without common
>> symbols, is going to cause problems, then this would be a much bigger
>> issue than if it is only triggered by common symbols.
>
> That scenario is fine.
>
> You can look back to see the problematic case posted earlier.  It was
> a case where one file was compiled with -flto, one file was compiled
> without -flto, both files defined a common symbol with the same name,
> the object files were linked together using -flto -fwhole-program, and
> the gold plugin was not used.  All elements are essential to recreate
> the problem.
>
> Ian



So as far as I understand it, the only problem with issuing accurate 
warnings or errors is that at link-time you don't know if common symbols 
have come from both LTO and non-LTO object files?  Surely then the best 
solution for now, erring on the side of caution, is to issue a warning 
if the compiler/linker sees common symbols of any kind while -flto and 
-fwhole-program are active but the gold plugin is not.  This will, I 
think, only affect a small number of cases (at least for C), and as more 
systems start using gold, it will be even less of an issue.
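
For readers unfamiliar with the term, here is a minimal sketch of what produces a common symbol in C. The names shared_counter and bump_shared_counter are hypothetical and are not taken from the original test case; the sketch also assumes a GCC that defaults to -fcommon, as the releases of this era did.

```c
/* Tentative definition: no initializer and no 'extern'.
   Under -fcommon (assumed here) GCC emits this as a common symbol,
   which shows up as 'C' in nm output for the object file. */
int shared_counter;

/* Hypothetical helper that touches the common symbol. */
int bump_shared_counter(void)
{
    shared_counter += 1;
    return shared_counter;
}
```

If a second source file contains the same "int shared_counter;" line, the linker merges the two definitions. As Ian describes above, the trouble only starts when one of those files is built with -flto, the other without, and the link uses -flto -fwhole-program.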



A side-note of thanks:

LTO is a huge step forward for gcc.  Someone else here posted that it 
had reduced their program run-time by 2.75%.  I believe it has a much 
bigger potential than that - not necessarily because the resulting 
programs will be smaller or faster, but because you no longer have to 
compromise between structure and speed.  As an embedded programmer, 
speed and size are often critical - this means that gory implementation 
details are often exposed in headers (to allow optimal inlining) rather 
than being tucked away in implementation files.  C++ programs should see 
the benefits here immediately - their "setters" and "getters" can be 
moved out of the headers entirely.  Some of the other IPA and 
cross-module optimisations introduced in gcc 4.5, such as re-arranging 
function parameters (-fipa-sra) and interprocedural copy propagation, 
mean that far more general libraries can be written.  Consider something 
as simple as a "setBaudRate(115200)" call.  On an x86, calculating a 
baud rate divisor here is just a few instructions.  But on an 8-bit avr, 
doing a 32-bit division is a long and large process.  Typically the 
embedded programmer will set the baud rate using a #define so that the 
compiler can pre-calculate the divisor.  With gcc 4.5, this will no 
longer be necessary, and the setBaudRate function becomes independent of 
the code that uses it.  Wonderful!
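
A rough sketch of the kind of function described here, assuming an AVR-style UART whose divisor is clock / (16 * baud) - 1, rounded to nearest. F_CPU_HZ, baud_divisor and the 16 MHz clock are illustrative assumptions, not part of the original post or of any particular library.

```c
#include <stdint.h>

/* Assumed CPU clock for the sketch: 16 MHz. */
#define F_CPU_HZ 16000000UL

/* Hypothetical library routine: compute the UART divisor for a baud rate.
   Adding 8 * baud before dividing by 16 * baud rounds to nearest.
   On an 8-bit target the 32-bit division is expensive at run time. */
uint16_t baud_divisor(uint32_t baud)
{
    return (uint16_t)((F_CPU_HZ + 8UL * baud) / (16UL * baud) - 1UL);
}
```

When the call site and baud_divisor live in different source files, a constant argument such as 115200 can only be folded at compile time if the optimizer sees both files together, which is the cross-module case LTO is expected to cover.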


For years embedded gcc fans have had to contend with claims of gcc being 
old-fashioned, and inferior to the big-name commercial compilers.  gcc 
4.5 will go a long way to redressing that.


Many thanks to everyone who has worked on this (and the rest of gcc and 
friends, of course).  You might not have thought about embedded devices 
like the ColdFire, ARM Cortex M3 and the like when you wrote this code.
You might not even have /heard/ of the 8-bit AVR or its gcc port.  But 
it is a testament to the power of the gcc development model that these 
"small" ports benefit from the hard work done here for the "big" targets.


David







Re: Issue with LTO/-fwhole-program

2010-06-14 Thread Dave Korn
On 14/06/2010 05:43, Ian Lance Taylor wrote:
> David Brown  writes:
> 
>> After doing a bit more reading and thinking, it seems to me that
>> -fwhole-program will be used in most cases where LTO is used.  You use
>> -flto when compiling each source file, then link them with gcc with
>> -flto and -fwhole-program.  Except in the case of libraries or other
>> files which need external symbols, you will want that combination to
>> generate optimal code.  So if this combination alone, without common
>> symbols, is going to cause problems, then this would be a much bigger
>> issue than if it is only triggered by common symbols.
> 
> That scenario is fine.
> 
> You can look back to see the problematic case posted earlier.  It was
> a case where one file was compiled with -flto, one file was compiled
> without -flto, both files defined a common symbol with the same name,
> the object files were linked together using -flto -fwhole-program, and
> the gold plugin was not used.  All elements are essential to recreate
> the problem.

  Given how many standard libraries export common symbols, I wonder if it
won't actually happen quite often.  "nm /usr/lib/*.a | grep ' C '" gives a
fair few hits for me.

cheers,
  DaveK



Re: how to get instruction codes in gcc passes?

2010-06-14 Thread Ilya K
On 13/06/2010 20:57, Ian Lance Taylor wrote:
> Take a look at http://code.google.com/p/mao/ .
> Ian

Thanks, Ian! This project looks very interesting.
I will try to play with it.



On Mon, Jun 14, 2010 at 2:28 AM, Dave Korn  wrote:
>  Or in binutils, LD's relaxation infrastructure might be usable to this end.
>  But I think if you want something so platform dependent as to care about the
> bitpatterns of opcodes, it almost certainly ought to live in the assembler or
> linker rather than then compiler.
>
>    cheers,
>      DaveK

Yes, I agree that the compiler may not be the best place for my
optimizations. I will see whether binutils or mao provides a better
environment for this development.
Thanks for pointing out the places where low-level optimizations
are more suitable.

Ilya K


RE: Issue with LTO/-fwhole-program

2010-06-14 Thread Bingfeng Mei


> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Ian Lance Taylor
> Sent: 14 June 2010 05:43
> To: David Brown
> Cc: gcc@gcc.gnu.org
> Subject: Re: Issue with LTO/-fwhole-program
> 
> David Brown  writes:
> 
> > After doing a bit more reading and thinking, it seems to me that
> > -fwhole-program will be used in most cases where LTO is used.  You use
> > -flto when compiling each source file, then link them with gcc with
> > -flto and -fwhole-program.  Except in the case of libraries or other
> > files which need external symbols, you will want that combination to
> > generate optimal code.  So if this combination alone, without common
> > symbols, is going to cause problems, then this would be a much bigger
> > issue than if it is only triggered by common symbols.
> 
> That scenario is fine.
> 
> You can look back to see the problematic case posted earlier.  It was
> a case where one file was compiled with -flto, one file was compiled
> without -flto, both files defined a common symbol with the same name,
> the object files were linked together using -flto -fwhole-program, and
> the gold plugin was not used.  All elements are essential to recreate
> the problem.

Actually, the gold plugin is used in the original example. However, the
resolution produced by the plugin is bypassed due to a bug fix by Richard.
Do you have any comment on that:
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01116.html

Bingfeng




Re: Issue with LTO/-fwhole-program

2010-06-14 Thread David Brown

On 14/06/2010 11:22, Dave Korn wrote:
> On 14/06/2010 05:43, Ian Lance Taylor wrote:
>> David Brown  writes:
>>
>>> After doing a bit more reading and thinking, it seems to me that
>>> -fwhole-program will be used in most cases where LTO is used.  You use
>>> -flto when compiling each source file, then link them with gcc with
>>> -flto and -fwhole-program.  Except in the case of libraries or other
>>> files which need external symbols, you will want that combination to
>>> generate optimal code.  So if this combination alone, without common
>>> symbols, is going to cause problems, then this would be a much bigger
>>> issue than if it is only triggered by common symbols.
>>
>> That scenario is fine.
>>
>> You can look back to see the problematic case posted earlier.  It was
>> a case where one file was compiled with -flto, one file was compiled
>> without -flto, both files defined a common symbol with the same name,
>> the object files were linked together using -flto -fwhole-program, and
>> the gold plugin was not used.  All elements are essential to recreate
>> the problem.
>
>   Given how many standard libraries export common symbols, I wonder if it
> won't actually happen quite often.  "nm /usr/lib/*.a | grep ' C '" gives a
> fair few hits for me.



It seems unlikely that common symbols in a library will be a problem, 
since it is very unlikely that the same common symbols will be defined 
in the library /and/ in user code, and it is very unlikely that 
different parts of the library would be compiled with different -flto 
settings.


However, such libraries would give false (hopefully false!) positives in 
a warning message triggered by the use of common symbols.  Is it 
possible for such a warning message to distinguish between common 
symbols from library files and common symbols from object files?  Would 
that be too risky?




Re: how to get instruction codes in gcc passes?

2010-06-14 Thread Ilya K
On Mon, Jun 14, 2010 at 12:25 AM, Basile Starynkevitch
 wrote:
>
> Why do you want to optimize the generated assembly code? AFAIK, all
> optimization passes in GCC work on some intermediate representation
> which is not the assembly code, and many of them work on Gimple.
> ...
> ... GCC emit textual assembly code. The
> assembler (that is binutils, not GCC) know that nop is 0f 1f.
>
> Cheers.
>
> PS. I might have some details wrong; I am not very familiar with GCC
> back-ends & RTL passes.
>
> --
> Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mines, sont seulement les miennes} ***
>
>

Yes, thanks. I have already seen that GCC does not have the
instruction codes. Nevertheless, I have to work at the low level, in
the back end. That is the nature of my work :).
So maybe I'll just switch to inserting my code into binutils or the mao project.

Ilya K


subreg against register allocation?

2010-06-14 Thread Amker.Cheng
Hi,
I am studying IRA right now (GCC 4.4.1, mips32 target).
For the following piece of code:

long long func(int a, int b)
{
  long long r = (long long)a * (long long)b;

  return r;
}

the asm generated on mips is like:

mult    $5,$4
mfhi    $5
mflo    $2
j       $31
move    $3,$5   <--unnecessary move insn

Please note the unnecessary move insn.

RTL list before subreg1 and IRA pass are like:

before subreg1
(insn 7 4 8 2 mult-problem.c:2 (set (reg:DI 196)
        (mult:DI (sign_extend:DI (reg/v:SI 195 [ b ]))
            (sign_extend:DI (reg/v:SI 194 [ a ])))) 50 {mulsidi3_32bit} (nil))

(insn 8 7 12 2 mult-problem.c:2 (set (reg:DI 193 [  ])
        (reg:DI 196)) 282 {*movdi_32bit} (nil))

(insn 12 8 18 2 mult-problem.c:6 (set (reg/i:DI 2 $2)
        (reg:DI 193 [  ])) 282 {*movdi_32bit} (nil))

before IRA
(insn 7 4 25 2 mult-problem.c:2 (set (reg:DI 196)
        (mult:DI (sign_extend:DI (reg:SI 5 $5 [ b ]))
            (sign_extend:DI (reg:SI 4 $4 [ a ])))) 50 {mulsidi3_32bit}
     (expr_list:REG_DEAD (reg:SI 5 $5 [ b ])
        (expr_list:REG_DEAD (reg:SI 4 $4 [ a ])
            (nil))))

(insn 25 7 26 2 mult-problem.c:6 (set (reg:SI 2 $2)
        (subreg:SI (reg:DI 196) 0)) 287 {*movsi_internal} (nil))

(insn 26 25 18 2 mult-problem.c:6 (set (reg:SI 3 $3 [+4 ])
        (subreg:SI (reg:DI 196) 4)) 287 {*movsi_internal}
     (expr_list:REG_DEAD (reg:DI 196)
        (nil)))
---end


It seems the DImode split prevents IRA from allocating $2/$3 directly,
by introducing conflicts between reg 196 and $2/$3 in insns 25 and 26.

I wonder whether it is possible to handle multi-word modes more accurately,
in either the subreg or the IRA pass?

Thanks in advance.

-- 
Best Regards.


Re: Issue with LTO/-fwhole-program

2010-06-14 Thread Richard Guenther
On Mon, Jun 14, 2010 at 9:26 AM, David Brown  wrote:
> On 14/06/2010 06:43, Ian Lance Taylor wrote:
>>
>> David Brown  writes:
>>
>>> After doing a bit more reading and thinking, it seems to me that
>>> -fwhole-program will be used in most cases where LTO is used.  You use
>>> -flto when compiling each source file, then link them with gcc with
>>> -flto and -fwhole-program.  Except in the case of libraries or other
>>> files which need external symbols, you will want that combination to
>>> generate optimal code.  So if this combination alone, without common
>>> symbols, is going to cause problems, then this would be a much bigger
>>> issue than if it is only triggered by common symbols.
>>
>> That scenario is fine.
>>
>> You can look back to see the problematic case posted earlier.  It was
>> a case where one file was compiled with -flto, one file was compiled
>> without -flto, both files defined a common symbol with the same name,
>> the object files were linked together using -flto -fwhole-program, and
>> the gold plugin was not used.  All elements are essential to recreate
>> the problem.
>>
>> Ian
>>
>
> So as far as I understand it, the only problem with issuing accurate
> warnings or errors is that at link-time you don't know if common symbols
> have come from both LTO and non-LTO object files?  Surely then the best
> solution for now, erring on the side of caution, is to issue a warning if
> the compiler/linker sees common symbols of any kind while -flto and
> -fwhole-program are active but the gold plugin is not.  This will, I think,
> only affect a small number of cases (at least for C), and as more systems
> start using gold, it will be even less of an issue.

Well, commons are only one issue - you also have the issue that
for functions GCC thinks are not referenced from outside it will
change the ABI of that function as it suits.  Which will also give
interesting errors if you still reference that from outside.

So there is really no way around getting accurate symbol resolution
information if you want to make -fwhole-program work in all
cases (that it was not designed for).  But then we wouldn't need
-fwhole-program at all but could just automagically bring symbols
local.

Thus - -fwhole-program was exactly designed to give GCC information
about symbols _it can't get access to otherwise_.  And it has (or should
have) a big warning in its documentation.
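
A small sketch of the escape hatch the GCC manual documents alongside -fwhole-program: the externally_visible attribute. The function names scale_sample and driver_entry are hypothetical; only the attribute itself comes from GCC. Other compilers may ignore it with a warning.

```c
/* Internal helper (hypothetical). Under -fwhole-program GCC may treat
   this as local to the program: inline it, drop it, or change how it
   is called, because it assumes nothing outside can reference it. */
int scale_sample(int x)
{
    return x * 3;
}

/* Hypothetical entry point that a non-LTO object or hand-written
   assembly might still call. The attribute asks GCC to keep the symbol
   externally visible instead of localizing it. */
__attribute__((externally_visible))
int driver_entry(int x)
{
    return scale_sample(x) + 1;
}
```

Without the attribute, a reference to driver_entry from code GCC cannot see is exactly the "still reference that from outside" situation described above.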

Richard.

>
> A side-note of thanks:
>
> LTO is a huge step forward for gcc.  Someone else here posted that it had
> reduced their program run-time by 2.75%.  I believe it has a much bigger
> potential than that - not necessarily because the resulting programs will be
> smaller or faster, but because you no longer have to compromise between
> structure and speed.  As an embedded programmer, speed and size are often
> critical - this means that gory implementation details are often exposed in
> headers (to allow optimal inlining) rather than being tucked away in
> implementation files.  C++ programs should see the benefits here immediately
> - their "setters" and "getters" can be moved out of the headers entirely.
>  Some of the other IPA and cross-module optimisations introduced in gcc 4.5,
> such as re-arranging function parameters (-fipa-sra) and interprocedural
> copy propagation, mean that far more general libraries can be written.
>  Consider something as simple as a "setBaudRate(115200)" call.  On an x86,
> calculating a baud rate divisor here is just a few instructions.  But on an
> 8-bit avr, doing a 32-bit division is a long and large process.  Typically
> the embedded programmer will set the baud rate using a #define so that the
> compiler can pre-calculate the divisor.  With gcc 4.5, this will no longer
> be necessary, and the setBaudRate function becomes independent of the code
> that uses it.  Wonderful!
>
> For years embedded gcc fans have had to contend with claims of gcc being
> old-fashioned, and inferior to the big-name commercial compilers.  gcc 4.5
> will go a long way to redressing that.
>
> Many thanks to everyone who has worked on this (and the rest of gcc and
> friends, of course).  You might not have thought about embedded devices like
> the ColdFire, ARM Cortex M3 and the like when you wrote this code.  You
> might not even have /heard/ of the 8-bit AVR or its gcc port.  But it is a
> testament to power of the gcc development model that these "small" ports
> benefit from the hard work done here for the "big" targets.
>
> David
>
>
>


Re: Issue with LTO/-fwhole-program

2010-06-14 Thread Richard Guenther
On Mon, Jun 14, 2010 at 11:27 AM, David Brown  wrote:
> On 14/06/2010 11:22, Dave Korn wrote:
>>
>> On 14/06/2010 05:43, Ian Lance Taylor wrote:
>>>
>>> David Brown  writes:
>>>
>>>> After doing a bit more reading and thinking, it seems to me that
>>>> -fwhole-program will be used in most cases where LTO is used.  You use
>>>> -flto when compiling each source file, then link them with gcc with
>>>> -flto and -fwhole-program.  Except in the case of libraries or other
>>>> files which need external symbols, you will want that combination to
>>>> generate optimal code.  So if this combination alone, without common
>>>> symbols, is going to cause problems, then this would be a much bigger
>>>> issue than if it is only triggered by common symbols.
>>>
>>> That scenario is fine.
>>>
>>> You can look back to see the problematic case posted earlier.  It was
>>> a case where one file was compiled with -flto, one file was compiled
>>> without -flto, both files defined a common symbol with the same name,
>>> the object files were linked together using -flto -fwhole-program, and
>>> the gold plugin was not used.  All elements are essential to recreate
>>> the problem.
>>
>>   Given how many standard libraries export common symbols, I wonder if it
>> won't actually happen quite often.  "nm /usr/lib/*.a | grep ' C '" gives a
>> fair few hits for me.
>>
>
> It seems unlikely that common symbols in a library will be a problem, since
> it is very unlikely that the same common symbols will be defined in the
> library /and/ in user code, and it is very unlikely that different parts of
> the library would be compiled with different -flto settings.
>
> However, such libraries would give false (hopefully false!) positives in a
> warning message triggered by the use of common symbols.  Is it possible for
> such a warning message to distinguish between common symbols from library
> files and common symbols from object files?  Would that be too risky?

Commons between shared libraries and a program can't work.

Richard.


Re: [inliner] g++ -O[123] generates undefined symbol

2010-06-14 Thread Дмитрий Дьяченко
Done. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44535

Thanks.
Dmitry

2010/6/14 Ian Lance Taylor :
> Дмитрий Дьяченко  writes:
>
>> Trunk g++/x86/160655 with -O0 compile test w/o errors, but with
>> -O[123] generates undefined symbol
>
>> Need i file a PR?
>
> It certainly looks like a bug.  Please do open a bug report.  Thanks.
>
> Ian
>


using GCC to compile games 4 game consoles

2010-06-14 Thread tajiya1
Hello GCC,

I would like to inquire about an open source project of yours called GCC.

I read that it's a cross-platform compiler for a number of programs.

My question is: does it work with open source game engines for compiling
games for game console platforms (such as Xbox 360, Playstation3, Wii,
Nintendo DSi, PSPgo, etc.) and mobile platforms (like iPhone, iPad and
Android, for example)?

Thanks a lot,

My best regards,

Tajiya Baba,
Head Director of EYI Publications Company









Broken trunk (160732)

2010-06-14 Thread Bingfeng Mei
Hi,
I just updated from last week's version to 160732. It seems broken due to the 
latest changes in c-family directory.
./../trunk/gcc/c-family/c-cppbuiltin.c: In function 'void 
builtin_define_with_hex_fp_value(const char*, tree_node*, int, const char*, 
const char*, const char*)':
../../trunk/gcc/c-family/c-cppbuiltin.c:1025:45: error: invalid conversion from 
'int' to 'cpp_builtin_type'

My configuration is:
CFLAGS="-g -O0" ../trunk/configure 
--prefix=/projects/firepath/tools/work/bmei/install-x86 
--with-mpfr=/projects/firepath/tools/work/bmei/packages/mpfr/2.4.1/x86-64 
--with-gmp=/projects/firepath/tools/work/bmei/packages/gmp/4.3.0/x86-64 
--with-mpc=/projects/firepath/tools/work/bmei/packages/mpc/0.8.1/x86-64 
--with-elf=/projects/firepath/tools/work/bmei/packages/libelf/x86-64 
--enable-plugins --enable-lto --enable-gold --enable-languages=c,c++ 
--enable-build-with-cxx --disable-werror --disable-bootstrap


Cheers,
Bingfeng



Re: Scheduling x86 dispatch windows

2010-06-14 Thread Michael Matz
Hi,

On Sun, 13 Jun 2010, H.J. Lu wrote:

> > We shouldn't turn GNU x86 assembler into an optimizing assembler. Next 
> > people may ask assembler to remove redundant instructions, ...

Well, but currently nobody is asking for such thing, right?

> > Right now, when something goes wrong, people don't have to debug 
> > assembler since it is very unlikely that the problem is in assembler. 
> > When assembler starts to make changes to assembly input, we have 
> > another place where a bug may be introduced.

But that's the case also right now.  align directives are one example for 
the assembler not emitting a one-to-one mapping, jump instructions are 
another.

> >> The essence is we want to insert prefixes (as well as nops) according 
> >> to certain rules known at encoding time.  The mechanism implementing 
> >> these rules can be abstracted (table driven?) and could be applicable 
> >> to any hardware having similar features.
> >
> > Can you implement them with new directives/pseudo instructions?
> >
> 
> I think you should take a look at
> 
> http://code.google.com/p/mao/

I find the direction this discussion takes a bit bizarre.  Quentin's 
suggestions were grounded in the way GCC works.  It emits text, and 
expects the assembler to transform this into binary blobs.  Changing this 
fundamental property is so much work that it isn't sensible to suggest that 
as an alternative to the proposal.

Also GCC prefers to use GNU as.  Suggesting the use of a different assembler 
also doesn't seem realistic (or even desirable) in my book, at least not on 
platforms where other assemblers aren't supported right now.  Neither does a 
post-processing tool seem desirable, as we want to generate fast code by 
default.

Therefore the only two realistic (IMO) possibilities are to either change 
GNU as to ensure the hw constraints are observed, or to do the same change 
in GCC.

Doing the change in GNU as has the advantage that all insn lengths are 
available without any work, i.e. it will handle e.g. inline asm; and that 
relaxation also is implemented just fine (it exists already in order to 
decide which jump form it's going to use); it has a high chance of always 
emitting the correct sequences.  It has the disadvantage that GNU as would 
emit (no-op) prefixes that the asm author didn't write.

Doing the change in GCC has the advantage that it would know about this 
change in instruction size (and therefore also could calculate sizes of 
jumps more correctly).  It has at least the disadvantage of needing to do the 
tedious job of ensuring all insn lengths are correct, which by necessity 
won't be done for inline asm; even ignoring inline asm it's known to 
quickly bit-rot (despite Jakub's heroic efforts).  From that follows that 
it has a somewhat higher chance of emitting slow sequences.

I don't see realistic and desirable other options.  For completeness 
considerations (inline asm) I think changing GNU as is the better choice.


Ciao,
Michael.

Re: Scheduling x86 dispatch windows

2010-06-14 Thread Jakub Jelinek
On Mon, Jun 14, 2010 at 03:14:37PM +0200, Michael Matz wrote:
> Doing the change in GNU as has the advantage that all insn lengths are 
> available without any work, i.e. it will handle e.g. inline asm; and that 
> relaxation also is implemented just fine (it exists already in order to 
> decide which jump form it's going to use); it has a high chance of always 
> emitting the correct sequences.  It has the disadvantage that GNU as would 
> emit (no-op) prefixes that the asm author didn't write.
> 
> Doing the change in GCC has the advantage that it would know about this 
> change in instruction size (and therefore also could calculate sizes of 
> jumps more correctly).  It has at least the disadvantage of needing to do the 
> tedious job of ensuring all insn lengths are correct, which by necessity 
> won't be done for inline asm; even ignoring inline asm it's known to 
> quickly bit-rot (despite Jakub's heroic efforts).  From that follows that 
> it has a somewhat higher chance of emitting slow sequences.
> 
> I don't see realistic and desirable other options.  For completeness 
> considerations (inline asm) I think changing GNU as is the better choice.

Doing it in GNU as without the user asking for it might break code though
(e.g. when code has jmp .+16 or something similar).
So, IMHO, if it is to be done in GNU as, it should only be requested through
directives.  Say, directives that mark the start and end of an insn block and
request padding the whole block to a certain alignment, naming the CPU to
optimize for while doing that.  This wouldn't be useful just for this future AMD
CPU, but e.g. could insert a prefix or two here and there to ensure loop
start is already aligned at 16 byte (or some other) boundary without having
to add a nop there.  Of course when optimizing for some CPUs the strategy
could be not to include any prefixes in the sequence, just add normal
aligning nop at the end.
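
As a rough illustration of the arithmetic behind "pad the block to a certain alignment", here is a sketch that computes how many bytes of padding (whether emitted as prefixes or as nops) a block would need. The function name padding_to_align is hypothetical and this is not GNU as code; it assumes the alignment is a power of two.

```c
#include <stddef.h>

/* Bytes of padding needed so that offset + padding is a multiple of
   'alignment'. Assumes 'alignment' is a power of two (e.g. 16).
   Returns 0 when 'offset' is already aligned. */
size_t padding_to_align(size_t offset, size_t alignment)
{
    size_t mask = alignment - 1;
    return (alignment - (offset & mask)) & mask;
}
```

A code sequence ending at offset 0x1003 would need 13 bytes of padding before a loop that should start on a 16-byte boundary. Spreading those bytes over redundant prefixes inside the block, instead of one run of nops at the end, is the choice the proposed directives would leave to the assembler.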

Jakub


Re: Broken trunk (160732)

2010-06-14 Thread Joern Rennecke

Quoting Bingfeng Mei :


> Hi,
> I just updated from last week's version to 160732. It seems broken
> due to the latest changes in c-family directory.

> --enable-build-with-cxx --disable-werror --disable-bootstrap


That's PR 44512.


Re: Patch pinging

2010-06-14 Thread Manuel López-Ibáñez
On 10 June 2010 22:05, Quentin Neill  wrote:
> On Tue, Jun 8, 2010 at 6:30 AM, Jonathan Wakely  wrote:
>> On 7 June 2010 22:43, Ian Lance Taylor wrote:
>>>
>>> The patch tracker (http://gcc.gnu.org/wiki/GCC_Patch_Tracking) is not
>>> currently operating.
>>>
>>> Would anybody like to volunteer to get it working again?
>>
>> I'm not volunteering, but I might look into it one day
>>
>> If dberlin doesn't still have the code, it shouldn't be too hard...
>>
>> ... a script which periodically crawls the gcc-patches archive might 
>> suffice...
>
> I have a python script which crawls, caches, and parses the gcc-cvs
> (and binutils-cvs) email archive pages.  I wrote it to help another
> script that correlates patch revisions in a branch (where the
> Changelog refers to revisions on the trunk) back to the useful
> Changelog entries in the trunk.
>
> I could submit that to contrib, it could be modified to scrape most of
> the information above into a single monthly report.
>
> Any interest?

I don't think such a script would be better than
http://patchwork.ozlabs.org/project/gcc/list/

What we would need is some way to detect that patches have been
committed. Otherwise that list will grow uncontrollably very fast.

Cheers,

Manuel.


Re: Patch pinging

2010-06-14 Thread NightStrike
On Mon, Jun 14, 2010 at 12:05 PM, Manuel López-Ibáñez
 wrote:
> What we would need is some way to detect that patches have been
> committed. Otherwise that list will grow uncontrollably very fast.

Imagine that :)


Re: using GCC to compile games 4 game consoles

2010-06-14 Thread Andrew Pinski
On Mon, Jun 14, 2010 at 4:50 AM,   wrote:
> Hello GCC,
>
> I would like to inquire about an open source project of yours called GCC.
>
> I read that it's a cross platform complier for a number of programs.
>
> My question is does it work with open source game engines in complying
> games for game console platforms (such as Xbox 360, Playstation3, Wii,
> Nintendo DSi, PSPgo and etc) and mobile platforms (like iPhone, iPad and
> Android for example)

GCC is used inside Sony to compile a lot of the Playstation 3 code.
It was (and still is) __an__ official compiler for Playstation 3
games.  The majority of games released for the PS3 between 2006 and
late 2008 were compiled with GCC.

GCC is also the official compiler to compile for Android and Palm's
WebOS.  Both of those are Linux kernel based.

Thanks,
Andrew Pinski


Re: Issue with LTO/-fwhole-program

2010-06-14 Thread Ian Lance Taylor
"Bingfeng Mei"  writes:

> Actually, gold plugin is used in the original example. However, resolution
> produced by plugin is bypassed due to a bug-fix by Richard.  Do you have any
> comment on that:
> http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01116.html

Sorry, I missed that.  There is a bug in gold with reporting common
symbol resolution.  I think it's http://gcc.gnu.org/PR44149 .

Ian


GCC 4.3.4 is casting my QImode vars to SImode for libcall

2010-06-14 Thread Paulo J. Matos
Hello,

In gcc4.3.4, for my architecture: 16 BITS_PER_UNIT, 1 UNITS_PER_WORD,
with INT_TYPE_SIZE = 16 and FLOAT_TYPE_SIZE = 32, then an unsigned int
is QImode and a float is HFmode.

However with:
float uitof(unsigned int x) { return x; }

I get a call to the function __floatunsihf. Shouldn't this be __floatunqihf?
Even if I provide a floatunqihf with set_conv_libfunc this is not
used. The problem with using __floatunsihf is that it turns a
QImode value into SImode (64 bits in my case) and then converts it to
HFmode (32 bits). While QImode -> HFmode is an extension, 16 bits -> 32 bits,
SImode -> HFmode is a truncation, 64 bits -> 32 bits, and not ideal at all.

Any reason behind this or can I instruct GCC not to convert the
unsigned int from QImode to SImode?

Cheers,

-- 
PMatos


Re: Issue with LTO/-fwhole-program

2010-06-14 Thread Ian Lance Taylor
Richard Guenther  writes:

> Commons between shared libraries and a program can't work.

Technically speaking, shared libraries never have common symbols.

They can have defined symbols which are labelled as, in essence,
"formerly common," and those can be made to work correctly under
certain restrictions, namely that the run time shared library does not
have a larger version of the symbol than the link time shared library.

Ian


Re: subreg against register allocation?

2010-06-14 Thread Ian Lance Taylor
"Amker.Cheng"  writes:

> Wondering whether possible to handle multi-word mode with more accuracy,
> in either subreg or IRA pass?

Yes, it is possible.  What you need to do is to write a split which
turns the mult:DI insn into an insn which sets two separate subregs.
The values for the two subregs will be written as shifts and truncates
of the mult:DI; see, e.g., mul3_highpart in i386.md.  If you
do that, then with luck the second lower subreg pass will be able to
pull apart the values, and IRA will allocate them independently.  You
want to write it as a split so that the RTL CSE and combine passes see
the mult:DI, in case they can do anything with it.
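
The "shifts and truncates" can be pictured in plain C. This is only a sketch of the arithmetic, assuming a 32-bit word and an unsigned multiply; it is not the machine description pattern itself, and the function names are invented:

```c
#include <stdint.h>

/* Low word of a 32x32->64 multiply: a truncate of the full product.  */
uint32_t mul_lo(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;
    return (uint32_t)product;
}

/* High word: shift the full product right by one word, then truncate.  */
uint32_t mul_hi(uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;
    return (uint32_t)(product >> 32);
}
```

Each function corresponds to the value assigned to one of the two subregs: both are written in terms of the same double-word product, which is what leaves the full multiply visible to earlier passes.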

Ian


Re: GCC 4.3.4 is casting my QImode vars to SImode for libcall

2010-06-14 Thread Ian Lance Taylor
"Paulo J. Matos"  writes:

> In GCC 4.3.4, for my architecture: 16 BITS_PER_UNIT, 1 UNITS_PER_WORD,
> with INT_TYPE_SIZE = 16 and FLOAT_TYPE_SIZE = 32, then an unsigned int
> is QImode and a float is HFmode.
>
> However with:
> float uitof(unsigned int x) { return x; }
>
> I get a call to the function __floatunsihf. Shouldn't this be __floatunqihf?

Yes.  Something is wrong somewhere, but I don't know where.

Ian


Re: GCC 4.3.4 is casting my QImode vars to SImode for libcall

2010-06-14 Thread Joseph S. Myers
On Mon, 14 Jun 2010, Paulo J. Matos wrote:

> Hello,
> 
> In GCC 4.3.4, for my architecture: 16 BITS_PER_UNIT, 1 UNITS_PER_WORD,
> with INT_TYPE_SIZE = 16 and FLOAT_TYPE_SIZE = 32, then an unsigned int
> is QImode and a float is HFmode.

To attempt such a port, being an expert in GCC internals is a good idea as 
there will be many such issues you need to debug yourself, and you will 
then need to assess what the right general solutions are that do not 
adversely affect other targets but make GCC more general in terms of what 
targets it supports.

Every hardcoded reference to a mode other than QImode in the 
target-independent compiler is suspicious and needs investigating if you 
are doing such a port.  In this case, the SImode references in 
expand_float are likely the problem.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: subreg against register allocation?

2010-06-14 Thread Bernd Schmidt
On 06/14/2010 07:58 PM, Ian Lance Taylor wrote:
> "Amker.Cheng"  writes:
> 
>> Wondering whether possible to handle multi-word mode with more accuracy,
>> in either subreg or IRA pass?
> 
> Yes, it is possible.  What you need to do is to write a split which
> turns the mult:DI insn into an insn which sets two separate subregs.

Is it valid to have an insn set two different subregs of the same pseudo
in parallel?  I have the same problem on ARM, but I couldn't convince
myself that writing such a split was safe so I'm currently trying to
solve the problem in the register allocator.


Bernd


Re: Patch pinging

2010-06-14 Thread Quentin Neill
On Mon, Jun 14, 2010 at 11:05 AM, Manuel López-Ibáñez
 wrote:
> On 10 June 2010 22:05, Quentin Neill  wrote:
>> I have a python script which crawls, caches, and parses the gcc-cvs
>> (and binutils-cvs) email archive pages.  I wrote it to help another
>> script that correlates patch revisions in a branch (where the
>> Changelog refers to revisions on the trunk) back to the useful
>> Changelog entries in the trunk.
>>
>> I could submit that to contrib, it could be modified to scrape most of
>> the information above into a single monthly report.
>>
>> Any interest?
>
> I don't think such a script would be better than
> http://patchwork.ozlabs.org/project/gcc/list/
>
> What we would need is some way to detect that patches have been
> committed. Otherwise that list will grow uncontrollably very fast.
> Cheers,
> Manuel.

I guess what I was thinking was the script would crawl the patch
postings and then the patch submissions and then print a correlation
report.

Once the correlation is working well enough, the outstanding patch
postings would be the only thing in the list.

Does the patchwork client that interacts with
http://patchwork.ozlabs.org/project/gcc/list/ do any correlation?
-- 
Quentin Neill


Re: subreg against register allocation?

2010-06-14 Thread Ian Lance Taylor
Bernd Schmidt  writes:

> On 06/14/2010 07:58 PM, Ian Lance Taylor wrote:
>> "Amker.Cheng"  writes:
>> 
>>> Wondering whether possible to handle multi-word mode with more accuracy,
>>> in either subreg or IRA pass?
>> 
>> Yes, it is possible.  What you need to do is to write a split which
>> turns the mult:DI insn into an insn which sets two separate subregs.
>
> Is it valid to have an insn set two different subregs of the same pseudo
> in parallel?  I have the same problem on ARM, but I couldn't convince
> myself that writing such a split was safe so I'm currently trying to
> solve the problem in the register allocator.

Well, as you know, subregs have two meanings which look similar but
are in fact entirely different.  It's valid to set subregs of the same
pseudo in parallel if the subregs represent different hard registers.
It's not valid if the subregs represent different pieces of the same
hard register.
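
A toy model of the rule as stated, for illustration only. It assumes that a multi-word pseudo is laid out across consecutive word-sized hard registers and ignores endianness, register classes, and everything else the real allocator considers; the struct and function names are hypothetical, not GCC's:

```c
#include <stdbool.h>

/* Hypothetical description of a subreg: a byte offset and size within
   a multi-word pseudo.  This is a toy model, not GCC's representation.  */
struct subreg_piece {
    unsigned offset;  /* byte offset into the pseudo */
    unsigned size;    /* size in bytes, must be nonzero */
};

/* Index of the first word-sized hard register the piece touches.  */
static unsigned first_word(struct subreg_piece p, unsigned word_bytes)
{
    return p.offset / word_bytes;
}

/* Index of the last word-sized hard register the piece touches.  */
static unsigned last_word(struct subreg_piece p, unsigned word_bytes)
{
    return (p.offset + p.size - 1) / word_bytes;
}

/* True if the two pieces fall in disjoint sets of hard registers.  */
bool pieces_in_distinct_hard_regs(struct subreg_piece a,
                                  struct subreg_piece b,
                                  unsigned word_bytes)
{
    return last_word(a, word_bytes) < first_word(b, word_bytes)
        || last_word(b, word_bytes) < first_word(a, word_bytes);
}
```

With 4-byte words, the two halves of an 8-byte pseudo (offsets 0 and 4, size 4) land in different hard registers, while two 2-byte pieces at offsets 0 and 2 share one register and would fall under the "not valid" case.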

Ian


Re: subreg against register allocation?

2010-06-14 Thread Bernd Schmidt
On 06/15/2010 12:06 AM, Ian Lance Taylor wrote:

> Well, as you know, subregs have two meanings which look similar but
> are in fact entirely different.  It's valid to set subregs of the same
> pseudo in parallel if the subregs represent different hard registers.
> It's not valid if the subregs represent different pieces of the same
> hard register.

Are you aware of any examples of this in the compiler?  The explanation
is of course plausible, but do we know that we handle this correctly
everywhere?


Bernd


Re: subreg against register allocation?

2010-06-14 Thread Ian Lance Taylor
Bernd Schmidt  writes:

> On 06/15/2010 12:06 AM, Ian Lance Taylor wrote:
>
>> Well, as you know, subregs have two meanings which look similar but
>> are in fact entirely different.  It's valid to set subregs of the same
>> pseudo in parallel if the subregs represent different hard registers.
>> It's not valid if the subregs represent different pieces of the same
>> hard register.
>
> Are you aware of any examples of this in the compiler?  The explanation
> is of course plausible, but do we know that we handle this correctly
> everywhere?

Hmmm, you could be right.  I wrote and tested some examples when
working on lower-subreg, but I never committed them.  The current
define_splits in i386.md which can do parallel sets only run if
reload_completed is true, when simplify_gen_subreg will return
different hard registers.  While I don't know of any reason that it
wouldn't work, I guess I don't know for sure that it is safe.

Ian


Re: subreg against register allocation?

2010-06-14 Thread Mark Mitchell
Ian Lance Taylor wrote:

>>> Well, as you know, subregs have two meanings which look similar but
>>> are in fact entirely different.  It's valid to set subregs of the same
>>> pseudo in parallel if the subregs represent different hard registers.

>> Are you aware of any examples of this in the compiler?  The explanation
>> is of course plausible, but do we know that we handle this correctly
>> everywhere?
> 
> Hmmm, you could be right.  

Of course, if we think this *should* work in principle (i.e., that it
makes sense for this to be meaningful RTL), and we don't know that it
doesn't work, it's reasonable to put this into GCC, changing the
documentation to specify the semantics of this form of RTL, and then
fixing any bugs as they occur.

Thanks,

-- 
Mark Mitchell
CodeSourcery
m...@codesourcery.com
(650) 331-3385 x713


[RFC] Cleaning up the pass manager

2010-06-14 Thread Diego Novillo
I have been thinking about doing some cleanups to the pass manager.
The goal would be to have the pass manager be the central driver of
every action done by the compiler.  In particular, the front ends
should make use of it and the callgraph manager, instead of the
twisted interactions we have now.

Additionally, I would like to (at some point) incorporate some/most of
the functionality provided by ICI
(http://ctuning.org/wiki/index.php/CTools:ICI).  I'm not advocating
integrating all of ICI, but leaving enough hooks so that such
experiments are easier to do.

Initially, I'm going for some low hanging fruit:

- Fields properties_required, properties_provided and
properties_destroyed should Mean Something other than asserting
whether they exist.
- Whatever doesn't exist before a pass, needs to be computed.
- Pass scheduling can be done by simply declaring a pass and
presenting it to the pass manager.  The property sets should be enough
for the PM to know where to schedule a pass.
- dump_file and dump_flags are no longer globals.
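
To make the scheduling idea concrete, here is a minimal sketch of a property-driven scheduler. It is not the existing pass manager: the three property fields reuse the names from the list above, but `struct toy_pass`, `schedule_passes`, the greedy strategy, and the 64-pass limit are all assumptions made for the example.

```c
#include <stddef.h>

/* Toy pass descriptor.  The three property fields mirror the names
   in the list above; everything else here is hypothetical.  */
struct toy_pass {
    const char *name;
    unsigned properties_required;
    unsigned properties_provided;
    unsigned properties_destroyed;
};

/* Greedily order passes: repeatedly pick the first unscheduled pass
   whose required properties are all available.  Writes pass indices
   into `order` and returns how many passes could be scheduled.  */
size_t schedule_passes(const struct toy_pass *passes, size_t n,
                       unsigned initial_props, size_t *order)
{
    unsigned props = initial_props;
    unsigned char done[64] = {0};   /* sketch limit: at most 64 passes */
    size_t count = 0;

    if (n > 64)
        return 0;

    for (;;) {
        size_t i;
        for (i = 0; i < n; i++) {
            if (!done[i]
                && (passes[i].properties_required & ~props) == 0)
                break;
        }
        if (i == n)
            break;                  /* nothing else is schedulable */
        done[i] = 1;
        order[count++] = i;
        /* Update the live property set after running the pass.  */
        props = (props | passes[i].properties_provided)
                & ~passes[i].properties_destroyed;
    }
    return count;
}
```

A return value smaller than `n` means some pass never saw its required properties. Note that a pass which requires and provides nothing is schedulable at any point, so the property sets alone do not determine where it goes.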

Are there any particular pain points that people are currently
experiencing that fit this?


Thanks.  Diego.


Re: [RFC] Cleaning up the pass manager

2010-06-14 Thread Joern Rennecke

Quoting Diego Novillo :


> - Fields properties_required, properties_provided and
> properties_destroyed should Mean Something other than asserting
> whether they exist.
> - Whatever doesn't exist before a pass, needs to be computed.
> - Pass scheduling can be done by simply declaring a pass and
> presenting it to the pass manager.  The property sets should be enough
> for the PM to know where to schedule a pass.


That might be possible for some passes, but if you have a very generic
transformation pass that doesn't require or produce any properties,
you really have to tell the pass manager where you want it to go.

Also, for pass reordering driven by machine learning, the idea is that
you create a pass ordering and present it to the pass manager.
So here it seems useful to have query capabilities to check pass
orderings.  I.e.:
  - generate a pass list and ask the pass manager if the properties are
    consistent.
  - ask if a single pass addition / removal at a specific place is valid.
    For addition, might optionally allow other passes to be inserted to
    generate / regenerate properties, and have an option to install
    the fixed change.
  - For more sophisticated search algorithms, I suppose it is best if
    the plugin can look at the properties in the passes directly, so the
    necessary definitions should continue to be in the plugin include
    directory.
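
The first query in the list above (is a proposed ordering consistent?) could look roughly like this. It is a sketch under assumptions, not an existing plugin API: `struct pass_props` and `first_invalid_pass` are hypothetical names, and properties are modelled as plain bit sets.

```c
#include <stddef.h>

/* Hypothetical, minimal view of a pass: only its property bit sets.  */
struct pass_props {
    unsigned required;
    unsigned provided;
    unsigned destroyed;
};

/* Check a proposed ordering.  Returns n if every pass finds its
   required properties available, otherwise the index of the first
   pass that does not.  */
size_t first_invalid_pass(const struct pass_props *ordering, size_t n,
                          unsigned initial_props)
{
    unsigned props = initial_props;

    for (size_t i = 0; i < n; i++) {
        if ((ordering[i].required & ~props) != 0)
            return i;
        /* Properties available to the next pass in the list.  */
        props = (props | ordering[i].provided) & ~ordering[i].destroyed;
    }
    return n;
}
```

Checking a single addition or removal then amounts to building the modified list and calling the same function; the returned index tells a search algorithm where a property would have to be regenerated.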


Re: subreg against register allocation?

2010-06-14 Thread Amker.Cheng
Thanks for the explanation.

Here are three more questions.

1. If I understand correctly, there are two insns, "*mulsi3_1" and
   "*smulsi3_highpart_insn", which set the two parts of the DImode
   pseudo reg holding the result of the DImode mult.

   Since both parts of the result are used in the original example,
   I am not sure how to write a split pattern that handles this case
   without generating two duplicate mult insns in parallel.

2. If I could set the two parts of the result in a parallel insn, I would
   also have to handle MIPS-specific constraints in this case, i.e.,
   constraints for the HI/LO registers.
   Unfortunately, there is no "h" constraint now, according to the patch
   http://gcc.gnu.org/ml/gcc-patches/2008-05/msg01750.html

   It is currently not possible to write the HI reg without clobbering
   the LO reg.  How should I handle this?

3. Since I am studying IRA right now, I am very curious whether it is
   possible to solve this in IRA, e.g., by shrinking the live ranges
   of multi-word pseudo regs?

PS: maybe I am talking gibberish; sorry if this is not clear enough.
Thanks.
-- 
Best Regards.


Re: subreg against register allocation?

2010-06-14 Thread Jeff Law

On 06/14/10 11:58, Ian Lance Taylor wrote:
> "Amker.Cheng"  writes:
>
>> Wondering whether possible to handle multi-word mode with more accuracy,
>> in either subreg or IRA pass?
>
> Yes, it is possible.  What you need to do is to write a split which
> turns the mult:DI insn into an insn which sets two separate subregs.
> The values for the two subregs will be written as shifts and truncates
> of the mult:DI; see, e.g., mul3_highpart in i386.md.  If you
> do that, then with luck the second lower subreg pass will be able to
> pull apart the values, and IRA will allocate them independently.  You
> want to write it as a split so that the RTL CSE and combine passes see
> the mult:DI, in case they can do anything with it.
   
Note that lower-subreg is rather conservative when determining what 
subregs to lower, particularly when the pseudo appears in different 
modes (ie, some accesses are via SUBREGs, others are naked REGs).  So 
this approach may not necessarily work.


jeff


Re: subreg against register allocation?

2010-06-14 Thread Joern Rennecke

Quoting Ian Lance Taylor :


> Hmmm, you could be right.  I wrote and tested some examples when
> working on lower-subreg, but I never committed them.  The current
> define_splits in i386.md which can do parallel sets only run if
> reload_completed is true, when simplify_gen_subreg will return
> different hard registers.  While I don't know of any reason that it
> wouldn't work, I guess I don't know for sure that it is safe.


On targets with different register sizes, subregs are not useful before
reload to describe what happens with the smaller registers; because of
the uncertainty of the register allocation, the semantics are still
undefined.
It would really be useful if we replaced STRICT_LOW_PART with something
like STRICT_SUBREG to refer to parts of a REG like SUBREG, but always
leave the rest of the register alone.
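
In C terms, the "leave the rest of the register alone" guarantee looks like the sketch below. It assumes a 32-bit register with 16-bit parts; the function names are invented, and the second function only illustrates what the proposed STRICT_SUBREG might mean, since no such RTL code exists:

```c
#include <stdint.h>

/* Write the low 16 bits of a 32-bit register and keep the upper bits
   unchanged: the guarantee that STRICT_LOW_PART expresses.  */
uint32_t set_low16_preserving(uint32_t reg, uint16_t value)
{
    return (reg & 0xFFFF0000u) | value;
}

/* Same guarantee for either 16-bit part (0 = low, 1 = high): a guess
   at what a STRICT_SUBREG-style operation would allow.  */
uint32_t set_part16_preserving(uint32_t reg, unsigned part, uint16_t value)
{
    unsigned shift = (part & 1u) * 16;
    return (reg & ~(0xFFFFu << shift)) | ((uint32_t)value << shift);
}
```

A plain subreg store before reload gives no such promise about the untouched bits, which is the undefined behaviour described above.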