Re: MIPS: Changing the PC stored from a "and link" instruction

2011-02-17 Thread Ian Lance Taylor
"Brandon H. Dwiel"  writes:

> I would like to make the changes necessary so that the compiler expects the
> PC of the instruction directly after the branch to be put in the $ra register.
>
> I cannot locate where it is specified that PC+8 of an "and link" instruction
> is to be put in the $ra so that I may change it.

It's not specified in that way.  Instead, it's specified that jalr has a
delay slot.  See the uses of define_delay in gcc/config/mips/mips.md.

Ian


about auto-vectorization and run time stack align

2011-02-17 Thread WANG.Jiong

Hi all:

   Recently I found some tricky bugs in a private target which supports 
SIMD instructions.


   The cause is that auto-vectorization tries to use those SIMD
instructions, which require data addresses to be 32-byte aligned.


   So is a target-specific option that enables generating a runtime stack
check prologue, like x86's -mstackrealign, always needed if we want to
make sure -ftree-vectorize works correctly?

   Or is there something else I have missed?
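
As an illustration of the alignment requirement in question, here is a minimal C sketch of what a realigning prologue does to the stack pointer. The 32-byte figure comes from the message above; the names SIMD_ALIGN, realign_sp and is_simd_aligned are hypothetical and not taken from any GCC port.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed alignment requirement of the SIMD loads/stores, in bytes. */
#define SIMD_ALIGN 32u

/* Round a stack pointer value down to the previous SIMD_ALIGN boundary.
   On a downward-growing stack this is the adjustment a realigning
   prologue would make before allocating vector spill slots. */
static uintptr_t realign_sp(uintptr_t sp)
{
    /* The mask trick only works for power-of-two alignments. */
    assert((SIMD_ALIGN & (SIMD_ALIGN - 1)) == 0);
    return sp & ~(uintptr_t)(SIMD_ALIGN - 1);
}

/* Return nonzero if addr can be used by an aligned SIMD access. */
static int is_simd_aligned(uintptr_t addr)
{
    return (addr & (SIMD_ALIGN - 1)) == 0;
}
```

If the incoming stack pointer is only guaranteed a smaller alignment by the ABI, vectorized accesses to stack slots can fault unless something like this adjustment runs on function entry.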

   My gcc version is 4.3.5.

   Thanks in advance.

---
Best,
KwongYuan


Re: PATCH committed: 64-bit Apple Objective-C runtime support

2011-02-17 Thread Mike Stump
On Feb 17, 2011, at 4:09 PM, Nicola Pero wrote:
> This patch is not me - it's by Iain Sandoe. :-)

Thanks for chipping in and helping out.  I'm excited at having an Objective-C 
compiler that works again on darwin.

That said, if people have any Objective-C code, feel free to beat on it and 
let us know what you find.  We're interested in regressions first and foremost, 
after that functionality on 64-bit darwin.  We know about PCH not working well 
in some situations, so be prepared to turn that off.


RE: [MIPS] Test case dspr2-MULT is failed

2011-02-17 Thread Fu, Chao-Ying
Mingjie Xing wrote:
> 2011/2/18 Fu, Chao-Ying :
> > I think your analysis is correct.  We should just delete
> > mips_order_regs_for_local_alloc()
> > in mips.c and delete ADJUST_REG_ALLOC_ORDER in mips.h.
> > Then, 3 accumulators can be used in dspr2-MULT.c and
> > dspr2-MULTU.c now.  Thanks!
> 
> /* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order
>    to be rearranged based on a particular function.  On the mips16, we
>    want to allocate $24 (T_REG) before other registers for
>    instructions for which it is possible.  */
> 
> #define ADJUST_REG_ALLOC_ORDER mips_order_regs_for_local_alloc ()
> 
> I'm just wondering if it is appropriate to simply remove
> ADJUST_REG_ALLOC_ORDER considering its comment.

  Ok.  Need to test if allocating $24 first is still better in MIPS16 under IRA.
If yes, we should update mips_order_regs_for_local_alloc() for MIPS16 only
(eg: exchange $24 and $1), as the default register order is as follows.
Ex:
/* We generally want to put call-clobbered registers ahead of
   call-saved ones.  (IRA expects this.)  */

#define REG_ALLOC_ORDER \
{ /* Accumulator registers.  When GPRs and accumulators have equal  \
 cost, we generally prefer to use accumulators.  For example,   \
 a division of multiplication result is better allocated to LO, \
 so that we put the MFLO at the point of use instead of at the  \
 point of definition.  It's also needed if we're to take advantage  \
 of the extra accumulators available with -mdspr2.  In some cases,  \
 it can also help to reduce register pressure.  */  \
  64, 65,176,177,178,179,180,181,   \
  /* Call-clobbered GPRs.  */   \
  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,\
  24, 25, 31,   \
...

Regards,
Chao-ying


Re: [MIPS] Test case dspr2-MULT is failed

2011-02-17 Thread Mingjie Xing
2011/2/18 Fu, Chao-Ying :
> I think your analysis is correct.  We should just delete 
> mips_order_regs_for_local_alloc()
> in mips.c and delete ADJUST_REG_ALLOC_ORDER in mips.h.
> Then, 3 accumulators can be used in dspr2-MULT.c and dspr2-MULTU.c now.  
> Thanks!

/* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order
   to be rearranged based on a particular function.  On the mips16, we
   want to allocate $24 (T_REG) before other registers for
   instructions for which it is possible.  */

#define ADJUST_REG_ALLOC_ORDER mips_order_regs_for_local_alloc ()

I'm just wondering if it is appropriate to simply remove
ADJUST_REG_ALLOC_ORDER considering its comment.

Regards,
Mingjie


Re: RFC: A new MIPS64 ABI

2011-02-17 Thread David Daney

On 02/14/2011 12:29 PM, David Daney wrote:

Background:

Current MIPS 32-bit ABIs (both o32 and n32) are restricted to 2GB of
user virtual memory space. This is due to the way MIPS32 memory space is
segmented. Only the range from 0..2^31-1 is available. Pointer
values are always sign extended.

Because there are not already enough MIPS ABIs, I present the ...

Proposal: A new ABI to support 4GB of address space with 32-bit
pointers.

The proposed new ABI would only be available on MIPS64 platforms. It
would be identical to the current MIPS n32 ABI *except* that pointers
would be zero-extended rather than sign-extended when resident in
registers. In the remainder of this document I will call it
'n32-big'. As a result, applications would have access to a full 4GB
of virtual address space. The operating environment would be
configured such that the entire lower 4GB of the virtual address space
was available to the program.


At a low level here is how it would work:

1) Load a pointer to a register from memory:

n32:
LW $reg, offset($reg)

n32-big:
LWU $reg, offset($reg)

2) Load an address constant into a register:

n32:
LUI $reg, high_part
ORI $reg, low_part


That is not reality.  Really it is:

LUI $reg, R_MIPS_HI16
ADDIU $reg, R_MIPS_LO16




n32-big:
ORI $reg, high_part
DSLL $reg, $reg, 16
ORI $reg, low_part



This one would really be:

ORI $reg, R_MIPS_HI16
DSLL $reg, $reg, 16
ADDIU $reg, R_MIPS_LO16
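
To make the difference in item 1 concrete, the following C sketch models how the same 32-bit pointer word ends up in a 64-bit register under each scheme. The function names are hypothetical; only the LW (sign-extending) versus LWU (zero-extending) behaviour comes from the proposal. The cast through int32_t relies on the usual two's-complement conversion, which is implementation-defined in C but holds on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* n32 (LW): the 32-bit pointer word is sign-extended to 64 bits. */
static uint64_t load_pointer_n32(uint32_t word)
{
    return (uint64_t)(int64_t)(int32_t)word;
}

/* Proposed n32-big (LWU): the same word is zero-extended instead. */
static uint64_t load_pointer_n32_big(uint32_t word)
{
    return (uint64_t)word;
}

/* Addresses below 2GB produce the same register value under both ABIs. */
static void check_low_half_agrees(uint32_t word)
{
    assert(word < 0x80000000u);
    assert(load_pointer_n32(word) == load_pointer_n32_big(word));
}
```

For a word such as 0x80000000 the two functions diverge: sign extension yields an address in the upper part of the 64-bit space, while zero extension keeps it inside the lower 4GB that the proposal wants to hand to the program.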




Q: What would have to change to make this work?

o A new ELF header flag to denote the ABI.

o Linker support to use proper library search paths, and linker scripts
to set the INTERP program header, etc.

o GCC has to emit code for the new ABI.

o Could all existing n32 relocation types be used? I think so.

o Runtime libraries would have to be placed in a new location
(/lib32big, /usr/lib32big ...)

o The C library's ld.so would have to use a distinct LD_LIBRARY_PATH
for n32-big code.

o What would the Linux system call interface be? I would propose
using the existing Linux n32 system call interface. Most system
calls would just work. Some, that pass pointers in in-memory
structures, might require kernel modifications (sigaction() for
example).





Re: x32 psABI draft version 0.2

2011-02-17 Thread Jakub Jelinek
On Thu, Feb 17, 2011 at 11:49:56PM +0100, Jan Hubicka wrote:
> The blog claims
> Architecture  libxul.so size  relocations size  %
> x86           21,869,684      1,884,864         8.61%
> x86-64        29,629,040      5,751,984         19.41%
> 
> Does the REL encoding also double in size for a 64-bit target?
> 
> > REL.  There might be better ways how to get the numbers down.
> 
> These are difficult since they mostly come from vtables and we need to be
> pretty smart to optimize vtable out completely.  Mozilla recently started to
> use elfhack (in official builds) that is sort of their own dynamic linker
> handling PC relative relocations only.  Pretty ugly IMO but they claim 16%
> savings on x86-64, 6% on x86

By better ways I meant create new relocations for relative relocs that would
be even more compact (assuming we can't or don't want to change the fact
that vtables contain pointers instead of pc relative pointers and assuming
Mozilla doesn't want to change implementation language to something saner
than C++ ;) ).
On my libxul.so I see:
 0x6ff9 (RELACOUNT)  161261
Relocation section '.rela.dyn' at offset 0x75a10 contains 186467 entries:
Relocation section '.rela.plt' at offset 0x4ba358 contains 4722 entries:
so all that actually matters there are relative relocations.
So one way to cut down the size of .rela.dyn section would be a relocation
like
R_X86_64_RELATIVE_BLOCK where applying such a relocation with r_offset O and
r_addend N would be:
uint64_t *ptr = O;
for (i = 0; i < N; i++)
  ptr[i] += bias;
Then e.g.
003ec6d86008  0008 R_X86_64_RELATIVE
   003ec5aef3f3
003ec6d86010  0008 R_X86_64_RELATIVE
   003ec5af92f6
003ec6d86018  0008 R_X86_64_RELATIVE
   003ec5b06d17
003ec6d86020  0008 R_X86_64_RELATIVE
   003ec5b1dc5f
003ec6d86028  0008 R_X86_64_RELATIVE
   003ec5b1edaf
003ec6d86030  0008 R_X86_64_RELATIVE
   003ec5b27358
003ec6d86038  0008 R_X86_64_RELATIVE
   003ec5b30f9f
003ec6d86040  0008 R_X86_64_RELATIVE
   003ec5b3317d
003ec6d86048  0008 R_X86_64_RELATIVE
   003ec5b34479
could be represented as:
003ec6d86008  00MN R_X86_64_RELATIVE_BLOCK  
   0009
I see many hundreds of consecutive R_X86_64_RELATIVE relocs in libxul.so, though
of course it would need much better analysis over larger body of code.
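
The block relocation described above can be sketched in C as follows. This is only an illustration of the idea: the struct layout, the use of word indices instead of raw addresses, and the name apply_relative_block are assumptions made for the example, not part of any psABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of the proposed relocation: `offset` is the
   index of the first 64-bit word to patch (r_offset O in the text) and
   `count` is the number of consecutive words (r_addend N). */
struct relative_block {
    size_t   offset;
    uint64_t count;
};

/* Add the load bias to `count` consecutive words starting at `offset`. */
static void apply_relative_block(uint64_t *image, size_t image_words,
                                 const struct relative_block *rel,
                                 uint64_t bias)
{
    /* Refuse blocks that would run past the end of the image. */
    assert(rel->offset <= image_words);
    assert(rel->count <= image_words - rel->offset);

    for (uint64_t i = 0; i < rel->count; i++)
        image[rel->offset + i] += bias;
}
```

For the nine consecutive entries listed above, nine 24-byte Elf64_Rela records (216 bytes) would collapse into a single 24-byte record under this scheme.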

In most programs if the library is prelinked all relative relocs are skipped
and .rela.dyn for them doesn't need to be even paged in, but Mozilla is quite
special in that it is one of the most common security relevant packages and thus
wants randomization, but is linked against huge libraries, so the question is
if Mozilla is the right candidate to drive our decisions on.

Another alternative to compress relative relocations would be an indirect
relative relocation, which would give you in r_offset the address of a block of
addresses and in r_addend the size of that block, and the block would just
contain the offsets of the words that need to be += bias.  Then, instead of
changing RELA to REL to save 8 bytes from 24 you'd save 16 bytes from those 24
(well, for x32 half of that).
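
A minimal C sketch of the indirect variant, under the assumption that the side table stores one 8-byte word index per relocated location (the actual encoding is not specified in the message, and apply_indirect_relative is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: `table` holds `table_len` indices of 64-bit words
   in `image`, each of which needs the load bias added.  Every location
   costs 8 bytes in the table, versus 24 bytes for a full Elf64_Rela. */
static void apply_indirect_relative(uint64_t *image, size_t image_words,
                                    const uint64_t *table, size_t table_len,
                                    uint64_t bias)
{
    for (size_t i = 0; i < table_len; i++) {
        /* Each table entry must name a word inside the image. */
        assert(table[i] < image_words);
        image[table[i]] += bias;
    }
}
```

Unlike the block form, this does not require the relocated words to be consecutive, which is where the "save 16 bytes from those 24" figure applies to every relative relocation rather than only to runs.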

Jakub


RE: [MIPS] Test case dspr2-MULT is failed

2011-02-17 Thread Fu, Chao-Ying
Chung-Lin Tang wrote:
> I analyzed this testcase regression a while earlier; the direct cause of
> this is due to mips_order_regs_for_local_alloc(), which now serves as
> MIPS' ADJUST_REG_ALLOC_ORDER macro.
> 
> The mips_order_regs_for_local_alloc() function seems to be written for
> the old local-alloc.c, still left as the deprecated
> ORDER_REGS_FOR_LOCAL_ALLOC macro after the transition to IRA (actually
> not called at all during then), and relatively recently 'revived' after
> a patch by Bernd that created the ADJUST_REG_ALLOC_ORDER macro went in.
> 
> So you have a local-alloc.c heuristic working in IRA, which seemed to
> cause these regressions.
> 
> Removing mips_order_regs_for_local_alloc() will let this testcase pass;
> of course the real fix should be to review the MIPS reg-ordering logic,
> left for you MIPS people...

I think your analysis is correct.  We should just delete 
mips_order_regs_for_local_alloc()
in mips.c and delete ADJUST_REG_ALLOC_ORDER in mips.h.
Then, 3 accumulators can be used in dspr2-MULT.c and dspr2-MULTU.c now.  Thanks!
Ex:
/* Test MIPS32 DSP REV 2 MULT instruction.  Tune for a CPU that has
   pipelined mult.  */
/* { dg-do compile } */
/* { dg-options "-mgp32 -mdspr2 -O2 -ffixed-hi -ffixed-lo -mtune=74kc" } */

/* { dg-final { scan-assembler "\tmult\t" } } */
/* { dg-final { scan-assembler "ac1" } } */
/* { dg-final { scan-assembler "ac2" } } */
/* { dg-final { scan-assembler "ac3" } } */

typedef long long a64;

NOMIPS16 a64 test (a64 *a, int *b, int *c)
{
  a[0] = (a64) b[0] * c[0];
  a[1] = (a64) b[1] * c[1];
  a[2] = (a64) b[2] * c[2];
}

Ex:
/* Test MIPS32 DSP REV 2 MULTU instruction.  Tune for a CPU that has
   pipelined multu.  */
/* { dg-do compile } */
/* { dg-options "-mgp32 -mdspr2 -O2 -ffixed-hi -ffixed-lo -mtune=74kc" } */

/* { dg-final { scan-assembler "\tmultu\t" } } */
/* { dg-final { scan-assembler "ac1" } } */
/* { dg-final { scan-assembler "ac2" } } */
/* { dg-final { scan-assembler "ac3" } } */

typedef unsigned long long a64;

NOMIPS16 a64 test (a64 *a, unsigned int *b, unsigned int *c)
{
  a[0] = (a64) b[0] * c[0];
  a[1] = (a64) b[1] * c[1];
  a[2] = (a64) b[2] * c[2];
}

Regards,
Chao-ying


Re: x32 psABI draft version 0.2

2011-02-17 Thread H. Peter Anvin
On 02/17/2011 02:49 PM, Jan Hubicka wrote:
>> On Thu, Feb 17, 2011 at 04:44:53PM +0100, Jan Hubicka wrote:
>>>>> According to Mozilla folks however REL+RELA scheme used by EABI leads
>>>>> to significantly smaller libxul.so size
>>>>>
>>>>> According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
>>>>> (out of approximately 20-30MB shared lib)
>>>>
>>>> This is orthogonal to x32 psABI.
>>>
>>> Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
>>> problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
>>> because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
>>> would not have this problem here.
>>
>> libxul.so has < 20 relocs, so 5MB is total size of .rela section in
>> 64-bit ELF, you don't magically save those 5MB by using REL.  You save
>> just 1.5MB.  And for x32 we'd be talking about 2.5MB for RELA vs. 1.6MB for
> 
> The blog claims
> Architecture  libxul.so size  relocations size  %
> x86           21,869,684      1,884,864         8.61%
> x86-64        29,629,040      5,751,984         19.41%
> 
> Does the REL encoding also double in size for a 64-bit target?
> 
> 

REL would be twice the size for a 64-bit target (which x32 is not, from
an ELF point of view).  Keep in mind that REL cannot do error handling
very well, especially not on a 64-bit platform.

Elf32_Rel:   8 bytes
Elf32_Rela: 12 bytes
Elf64_Rel:  16 bytes
Elf64_Rela: 24 bytes

So 1,884,864 to 5,751,984 indicates a (very) small increase in
relocation count; for the same number of relocations, the exactly
equivalent sizes would be:

Elf32_Rel:  1,884,864 bytes
Elf32_Rela: 2,827,296 bytes
Elf64_Rel:  3,769,728 bytes
Elf64_Rela: 5,654,592 bytes
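
The arithmetic behind these figures can be checked with a short C sketch. The entry sizes are the ones listed above; the enum and function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Entry sizes in bytes, as listed in the message. */
enum {
    ELF32_REL_SIZE  = 8,
    ELF32_RELA_SIZE = 12,
    ELF64_REL_SIZE  = 16,
    ELF64_RELA_SIZE = 24
};

/* Number of relocation entries in a section of the given byte size. */
static uint64_t entry_count(uint64_t section_bytes, unsigned entry_size)
{
    assert(entry_size != 0 && section_bytes % entry_size == 0);
    return section_bytes / entry_size;
}

/* Byte size of a section holding `entries` relocations of `entry_size`. */
static uint64_t section_size(uint64_t entries, unsigned entry_size)
{
    return entries * entry_size;
}
```

Starting from the x86 figure, entry_count(1884864, ELF32_REL_SIZE) gives 235,608 relocations, and multiplying that count by 12, 16 and 24 reproduces the other three rows. The blog's 5,751,984 bytes for x86-64 is slightly above 5,654,592, which is the small increase in relocation count mentioned above.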

-hpa


gcc-4.5-20110217 is now available

2011-02-17 Thread gccadmin
Snapshot gcc-4.5-20110217 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110217/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch 
revision 170258

You'll find:

 gcc-4.5-20110217.tar.bz2 Complete GCC (includes all of below)

  MD5=7f76f39dea58ae2c0a7727f7c2c461fc
  SHA1=c3472729a9ac88fd145eaa2c856d061d0cf7

 gcc-core-4.5-20110217.tar.bz2    C front end and core compiler

  MD5=4a164c83af1439c0e80f273defcb4ad9
  SHA1=e5750a575078ba2eeb2b38b94467259cf079a8b7

 gcc-ada-4.5-20110217.tar.bz2 Ada front end and runtime

  MD5=89a27ff5648c0a50109df9c199de4a9b
  SHA1=57bd25a63bd153ffeccd0d75e83da33cdea28e7f

 gcc-fortran-4.5-20110217.tar.bz2 Fortran front end and runtime

  MD5=9acfe119613c639cdd958d66530f81d4
  SHA1=13454adff22d251d7972aa3e3fcca3a895507427

 gcc-g++-4.5-20110217.tar.bz2 C++ front end and runtime

  MD5=d20c95351ae9c65d7f47ddd1b0058d40
  SHA1=ea543565593a0537e4c3258d37b3756fd64c83bf

 gcc-go-4.5-20110217.tar.bz2  Go front end and runtime

  MD5=23f7f09330a117b2475de1fa183a3697
  SHA1=dc4d8c2e3020e0c7d6f47fdf3b28dd35fda5c68f

 gcc-java-4.5-20110217.tar.bz2    Java front end and runtime

  MD5=778b13b2024a2511365c13699ff428f8
  SHA1=a5eebbfc90d8f74d9e0c2312027de696225eee1c

 gcc-objc-4.5-20110217.tar.bz2    Objective-C front end and runtime

  MD5=3f63c38b48d10a5a6f07701a671c9de3
  SHA1=1237fd6ecd1e6d989eeae20637ab9e50377b2aac

 gcc-testsuite-4.5-20110217.tar.bz2   The GCC testsuite

  MD5=12e0d9eca36ad5319c3548364dbccef3
  SHA1=25625f861bcec17f089d967ce1e444d825232292

Diffs from 4.5-20110210 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: x32 psABI draft version 0.2

2011-02-17 Thread Jan Hubicka
> On Thu, Feb 17, 2011 at 04:44:53PM +0100, Jan Hubicka wrote:
> > > > According to Mozilla folks however REL+RELA scheme used by EABI leads
> > > > to significantly smaller libxul.so size
> > > >
> > > > According to http://glandium.org/blog/?p=1177 the difference is about 
> > > > 4-5MB
> > > > (out of approximately 20-30MB shared lib)
> > > 
> > > This is orthogonal to x32 psABI.
> > 
> > Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
> > problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
> > because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
> > would not have this problem here.
> 
> libxul.so has < 20 relocs, so 5MB is total size of .rela section in
> 64-bit ELF, you don't magically save those 5MB by using REL.  You save
> just 1.5MB.  And for x32 we'd be talking about 2.5MB for RELA vs. 1.6MB for

The blog claims
Architecture  libxul.so size  relocations size  %
x86           21,869,684      1,884,864         8.61%
x86-64        29,629,040      5,751,984         19.41%

Does the REL encoding also double in size for a 64-bit target?

> REL.  There might be better ways how to get the numbers down.

These are difficult since they mostly come from vtables and we need to be
pretty smart to optimize vtable out completely.  Mozilla recently started to
use elfhack (in official builds) that is sort of their own dynamic linker
handling PC relative relocations only.  Pretty ugly IMO but they claim 16%
savings on x86-64, 6% on x86

Honza
> 
>   Jakub


MIPS: Changing the PC stored from a "and link" instruction

2011-02-17 Thread Brandon H. Dwiel

Hello,

I am a student working on a project involving generating a MIPS processor.
We have decided to NOT implement logic to handle branch delay slots and instead
work on generating a compiler that will emit code without these delay slots.
The compiler tool chain versions are:

binutils:  2.15
gcc:   3.4.5
glibc: 2.3.6
linux-headers: 2.6.12

So far, I have been mostly successful in removing these delay slots (nops still
exist in some situations).
However, the offsets are still relative to the branch delay slot.
This is not a problem in most cases, because in the pipeline we just add 4 to
the offset instead of adding the PC of the delay-slot instruction to the offset.
There is one case where this solution does not work. Here is the segment of code:

  400184: 04110001  bal 40018c <__start+0xc>
  400188: 00000000  nop
  40018c: 3c1c0fc1  lui gp,0xfc1
  400190: 279cbc64  addiu gp,gp,-17308
  400194: 039fe021  addu  gp,gp,ra


The bal instruction here is expected to push the PC of the instruction after
the delay slot (0x40018c). This value is then later added to the $gp register
in the last instruction of the sequence. Because of this behavior, putting
the PC of the instruction after the branch results in incorrect execution.
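
To see why the saved PC matters here, the three-instruction sequence can be modelled in C. This is only a sketch of the arithmetic in the listing above; compute_gp is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Model of:  lui gp,0xfc1 ; addiu gp,gp,-17308 ; addu gp,gp,ra
   `ra` is whatever return address the bal instruction stored. */
static uint32_t compute_gp(uint32_t ra)
{
    /* Instruction addresses are word-aligned on MIPS32. */
    assert((ra & 3u) == 0);

    uint32_t gp = (uint32_t)0xfc1 << 16;   /* lui: load upper 16 bits    */
    gp += (uint32_t)(int32_t)-17308;       /* addiu: sign-extended imm   */
    gp += ra;                              /* addu: add the link address */
    return gp;
}
```

With the architecturally expected link value 0x40018c (PC+8) this yields 0x1000bdf0, while a pipeline that stores 0x400188 (PC+4) yields 0x1000bdec, so every $gp-relative access ends up 4 bytes off.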

Looking at another segment of code from the same program:

  400204: 0320f809  jalr  t9
  400208: 8fbc0010  lw  gp,16(sp)
  40020c: 8fbf0018  lw  ra,24(sp)


Notice here that if I push the PC of the instruction after the delay slot
(0x40020c), then I will skip the load that writes to the $gp. I realize that in
this case it is possible that 0x40020c is expected to be pushed, but I have not
noticed any repercussions of simply pushing 0x400208 instead.

I would like to make the changes necessary so that the compiler expects the PC
of the instruction directly after the branch to be put in the $ra register.

I cannot locate where it is specified that PC+8 of an "and link" instruction is
to be put in the $ra so that I may change it.

-Brandon



Re: x32 psABI draft version 0.2

2011-02-17 Thread H. Peter Anvin
On 02/17/2011 10:06 AM, Jakub Jelinek wrote:
> On Thu, Feb 17, 2011 at 04:44:53PM +0100, Jan Hubicka wrote:
>>>> According to Mozilla folks however REL+RELA scheme used by EABI leads
>>>> to significantly smaller libxul.so size
>>>>
>>>> According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
>>>> (out of approximately 20-30MB shared lib)
>>>
>>> This is orthogonal to x32 psABI.
>>
>> Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
>> problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
>> because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
>> would not have this problem here.
> 
> libxul.so has < 20 relocs, so 5MB is total size of .rela section in
> 64-bit ELF, you don't magically save those 5MB by using REL.  You save
> just 1.5MB.  And for x32 we'd be talking about 2.5MB for RELA vs. 1.6MB for
> REL.  There might be better ways how to get the numbers down.
> 

The size is, of course, half of that for the x32 ABI in the first place.

-hpa



Re: x32 psABI draft version 0.2

2011-02-17 Thread Joseph S. Myers
On Thu, 17 Feb 2011, Jan Hubicka wrote:

> > REL is horrible pain, we shouldn't ever add new REL targets.
> 
> According to Mozilla folks however REL+RELA scheme used by EABI leads
> to significantly smaller libxul.so size
> 
> According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
> (out of approximately 20-30MB shared lib)

Well - as noted there, while EABI supports both, binutils just uses REL on 
ARM (except for VxWorks, where it uses RELA).  Supporting both 
simultaneously - linking a mixture of objects with the two types, possibly 
with both mixed within a single object - is much more of a pain (at least 
for the BFD linker, I don't know about gold) than simply supporting one or 
the other for a single ABI - it's required by ARM EABI, but that bit of 
EABI isn't actually supported by binutils.  I think most of the 
target-independent BFD problems with mixing REL and RELA have been ironed 
out by now in the course of Bernd's work on linking objects from TI's C6X 
tools, but it also needs care in the target-specific BFD code (you always 
need to check for each individual relocation what sort it is, rather than 
checking any global flag).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Strange behavior with templates and G++

2011-02-17 Thread Jonathan Wakely
On 17 February 2011 17:26, Pascal Francq wrote:
> In fact, since it is related to how g++ implements its inheritance mechanism
> in C++, I thought this was the right mailing list. In fact, by adding a
> convenience method in Super2, the code compiles:
>        template inline void Test(C* ptr) {Super::Test(ptr);}
> This trick leads me to suppose that a problem may exist in the way g++ links a
> call to the corresponding method when it handles template classes.

Unless you're proposing a way to fix it (which you're not, you're
asking if there's something wrong, and there isn't) then gcc-help is
the right list.


Re: x32 psABI draft version 0.2

2011-02-17 Thread Jakub Jelinek
On Thu, Feb 17, 2011 at 04:44:53PM +0100, Jan Hubicka wrote:
> > > According to Mozilla folks however REL+RELA scheme used by EABI leads
> > > to significantly smaller libxul.so size
> > >
> > > According to http://glandium.org/blog/?p=1177 the difference is about 
> > > 4-5MB
> > > (out of approximately 20-30MB shared lib)
> > 
> > This is orthogonal to x32 psABI.
> 
> Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
> problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
> because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
> would not have this problem here.

libxul.so has < 20 relocs, so 5MB is total size of .rela section in
64-bit ELF, you don't magically save those 5MB by using REL.  You save
just 1.5MB.  And for x32 we'd be talking about 2.5MB for RELA vs. 1.6MB for
REL.  There might be better ways how to get the numbers down.

Jakub


Re: x32 psABI draft version 0.2

2011-02-17 Thread H.J. Lu
On Thu, Feb 17, 2011 at 8:11 AM, Jan Beulich  wrote:
>>>> On 17.02.11 at 16:49, "H.J. Lu"  wrote:
>> On Thu, Feb 17, 2011 at 7:44 AM, Jan Hubicka  wrote:
>>>> > According to Mozilla folks however REL+RELA scheme used by EABI leads
>>>> > to significantly smaller libxul.so size
>>>> >
>>>> > According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
>>>> > (out of approximately 20-30MB shared lib)
>>>>
>>>> This is orthogonal to x32 psABI.
>>>
>>> Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
>>> problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
>>> because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
>>> would not have this problem here.
>>>
>>
>> If people want to see REL+RELA in x32, they have to contribute code.
>
> That's exactly the wrong way round: First the specification has to allow
> for (but not require) it, and only then does it make sense to write code.
>

No, it has to be supported at least by static linker and dynamic
linker. Otherwise, no one can use it.

-- 
H.J.


Re: Strange behavior with templates and G++

2011-02-17 Thread Pascal Francq
In fact, since it is related to how g++ implements its inheritance mechanism 
in C++, I thought this was the right mailing list. In fact, by adding a 
convenience method in Super2, the code compiles:
template inline void Test(C* ptr) {Super::Test(ptr);}
This trick leads me to suppose that a problem may exist in the way g++ links a 
call to the corresponding method when it handles template classes.

On jeudi 17 février 2011, Jonathan Wakely wrote:
> On 17 February 2011 11:03, Pascal Francq wrote:
> > Is this a problem related to a misunderstood concept from me, a wrong
> > implementation or a technical problem of g++ ?
> 
> In any of those cases, such questions are off-topic on this mailing
> list, which is for development of gcc, not help using it.  Please see
> http://gcc.gnu.org/lists.html and ask the question on the gcc-help
> list, thanks.
> 
> I believe G++ is correct in this case.


-- 

Dr. Ir. Pascal Francq
BELGIUM




Re: x32 psABI draft version 0.2

2011-02-17 Thread Jan Beulich
>>> On 17.02.11 at 16:49, "H.J. Lu"  wrote:
> On Thu, Feb 17, 2011 at 7:44 AM, Jan Hubicka  wrote:
>>> > According to Mozilla folks however REL+RELA scheme used by EABI leads
>>> > to significantly smaller libxul.so size
>>> >
>>> > According to http://glandium.org/blog/?p=1177 the difference is about 
>>> > 4-5MB
>>> > (out of approximately 20-30MB shared lib)
>>>
>>> This is orthogonal to x32 psABI.
>>
>> Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
>> problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
>> because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
>> would not have this problem here.
>>
> 
> If people want to see REL+RELA in x32, they have to contribute code.

That's exactly the wrong way round: First the specification has to allow
for (but not require) it, and only then does it make sense to write code.

Jan



Re: x32 psABI draft version 0.2

2011-02-17 Thread H.J. Lu
On Thu, Feb 17, 2011 at 7:44 AM, Jan Hubicka  wrote:
>> > According to Mozilla folks however REL+RELA scheme used by EABI leads
>> > to significantly smaller libxul.so size
>> >
>> > According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
>> > (out of approximately 20-30MB shared lib)
>>
>> This is orthogonal to x32 psABI.
>
> Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
> problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
> because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
> would not have this problem here.
>

If people want to see REL+RELA in x32, they have to contribute code.

-- 
H.J.


Re: x32 psABI draft version 0.2

2011-02-17 Thread Jan Hubicka
> > According to Mozilla folks however REL+RELA scheme used by EABI leads
> > to significantly smaller libxul.so size
> >
> > According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
> > (out of approximately 20-30MB shared lib)
> 
> This is orthogonal to x32 psABI.

Understood.  I am just pointing out that x86-64 Mozilla suffers from startup
problems (extra 5MB of disk read needed) compared to both x86 and ARM EABI
because x86-64 ABI is RELA only. If x86-64 ABI was REL+RELA like EABI is, we
would not have this problem here.

Honza


Re: x32 psABI draft version 0.2

2011-02-17 Thread H.J. Lu
On Thu, Feb 17, 2011 at 7:22 AM, Jan Hubicka  wrote:
>> On Thu, Feb 17, 2011 at 08:35:26AM +0000, Jan Beulich wrote:
>> > >>> On 16.02.11 at 21:04, "H. Peter Anvin"  wrote:
>> > > On 02/16/2011 11:22 AM, H.J. Lu wrote:
>> > >> Hi,
>> > >>
>> > >> I updated  x32 psABI draft to version 0.2 to change x32 library path
>> > >> from lib32 to libx32 since lib32 is used for ia32 libraries on Debian,
>> > >> Ubuntu and other derivative distributions. The new x32 psABI is
>> > >> available from:
>> > >>
>> > >> https://sites.google.com/site/x32abi/home
>> > >>
>> > >
>> > > I'm wondering if we should define a section header flag (sh_flags)
>> > > and/or an ELF header flag (e_flags) for x32 for the people unhappy about
>> > > keying it to the ELF class...
>> >
>> > Thanks for supporting this!
>> >
>> > Besides that I also wonder why all the 64-bit relocations get
>> > marked as LP64-only. It is clear that some of them can be useful
>> > in ILP32 as well, and there's no reason to preclude future uses
>> > even if currently no-one can imagine any.
>> >
>> > Furthermore, it seems questionable to continue to require rela
>> > relocations when for all normal ones (leaving aside the 8- and 16-
>> > bit ones) the addend can fit in the relocated field.
>>
>> REL is horrible pain, we shouldn't ever add new REL targets.
>
> According to Mozilla folks however REL+RELA scheme used by EABI leads
> to significantly smaller libxul.so size
>
> According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
> (out of approximately 20-30MB shared lib)

This is orthogonal to x32 psABI.



-- 
H.J.


Re: x32 psABI draft version 0.2

2011-02-17 Thread Jan Hubicka
> On Thu, Feb 17, 2011 at 08:35:26AM +0000, Jan Beulich wrote:
> > >>> On 16.02.11 at 21:04, "H. Peter Anvin"  wrote:
> > > On 02/16/2011 11:22 AM, H.J. Lu wrote:
> > >> Hi,
> > >> 
> > >> I updated  x32 psABI draft to version 0.2 to change x32 library path
> > >> from lib32 to libx32 since lib32 is used for ia32 libraries on Debian,
> > >> Ubuntu and other derivative distributions. The new x32 psABI is
> > >> available from:
> > >> 
> > >> https://sites.google.com/site/x32abi/home 
> > >> 
> > > 
> > > I'm wondering if we should define a section header flag (sh_flags)
> > > and/or an ELF header flag (e_flags) for x32 for the people unhappy about
> > > keying it to the ELF class...
> > 
> > Thanks for supporting this!
> > 
> > Besides that I also wonder why all the 64-bit relocations get
> > marked as LP64-only. It is clear that some of them can be useful
> > in ILP32 as well, and there's no reason to preclude future uses
> > even if currently no-one can imagine any.
> > 
> > Furthermore, it seems questionable to continue to require rela
> > relocations when for all normal ones (leaving aside the 8- and 16-
> > bit ones) the addend can fit in the relocated field.
> 
> REL is horrible pain, we shouldn't ever add new REL targets.

According to Mozilla folks however REL+RELA scheme used by EABI leads
to significantly smaller libxul.so size

According to http://glandium.org/blog/?p=1177 the difference is about 4-5MB
(out of approximately 20-30MB shared lib)

Honza
> 
>   Jakub


Re: x32 psABI draft version 0.2

2011-02-17 Thread Jakub Jelinek
On Thu, Feb 17, 2011 at 08:35:26AM +0000, Jan Beulich wrote:
> >>> On 16.02.11 at 21:04, "H. Peter Anvin"  wrote:
> > On 02/16/2011 11:22 AM, H.J. Lu wrote:
> >> Hi,
> >> 
> >> I updated  x32 psABI draft to version 0.2 to change x32 library path
> >> from lib32 to libx32 since lib32 is used for ia32 libraries on Debian,
> >> Ubuntu and other derivative distributions. The new x32 psABI is
> >> available from:
> >> 
> >> https://sites.google.com/site/x32abi/home 
> >> 
> > 
> > I'm wondering if we should define a section header flag (sh_flags)
> > and/or an ELF header flag (e_flags) for x32 for the people unhappy about
> > keying it to the ELF class...
> 
> Thanks for supporting this!
> 
> Besides that I also wonder why all the 64-bit relocations get
> marked as LP64-only. It is clear that some of them can be useful
> in ILP32 as well, and there's no reason to preclude future uses
> even if currently no-one can imagine any.
> 
> Furthermore, it seems questionable to continue to require rela
> relocations when for all normal ones (leaving aside the 8- and 16-
> bit ones) the addend can fit in the relocated field.

REL is horrible pain, we shouldn't ever add new REL targets.

Jakub


Re: x32 psABI draft version 0.2

2011-02-17 Thread H.J. Lu
On Thu, Feb 17, 2011 at 12:35 AM, Jan Beulich  wrote:
>>>> On 16.02.11 at 21:04, "H. Peter Anvin"  wrote:
>> On 02/16/2011 11:22 AM, H.J. Lu wrote:
>>> Hi,
>>>
>>> I updated  x32 psABI draft to version 0.2 to change x32 library path
>>> from lib32 to libx32 since lib32 is used for ia32 libraries on Debian,
>>> Ubuntu and other derivative distributions. The new x32 psABI is
>>> available from:
>>>
>>> https://sites.google.com/site/x32abi/home
>>>
>>
>> I'm wondering if we should define a section header flag (sh_flags)
>> and/or an ELF header flag (e_flags) for x32 for the people unhappy about
>> keying it to the ELF class...
>
> Thanks for supporting this!

I am not convinced.

> Besides that I also wonder why all the 64-bit relocations get
> marked as LP64-only. It is clear that some of them can be useful
> in ILP32 as well, and there's no reason to preclude future uses
> even if currently no-one can imagine any.

We can revisit them when someone finds a use for them.

> Furthermore, it seems questionable to continue to require rela
> relocations when for all normal ones (leaving aside the 8- and 16-
> bit ones) the addend can fit in the relocated field.

Rela is much nicer to work with.

> Finally, shouldn't R_X86_64_GLOB_DAT and R_X86_64_JUMP_SLOT

Fixed in git.

> also have a field specifier of wordclass rather than word64 (though
> 'wordclass' by itself would probably be wrong if the tying of the ABI
> to the ELF class was eliminated)? And how about R_X86_64_*TP*64
> and R_X86_64_TLSDESC?

Those are 64-bit because of the code sequence generated
by gcc.

-- 
H.J.


Re: Strange behavior with templates and G++

2011-02-17 Thread Jonathan Wakely
On 17 February 2011 11:03, Pascal Francq wrote:
>
> Is this a problem related to a misunderstood concept from me, a wrong
> implementation or a technical problem of g++ ?

In any of those cases, such questions are off-topic on this mailing
list, which is for development of gcc, not help using it.  Please see
http://gcc.gnu.org/lists.html and ask the question on the gcc-help
list, thanks.

I believe G++ is correct in this case.
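
A minimal sketch of why the lookup is ambiguous and of one common workaround, assuming the class template takes a single type parameter as in the original post. Member name lookup finds `Test` in two different base classes and stops with an ambiguity before overload resolution ever looks at the argument type. Bringing both names into the derived class with using-declarations makes them overloads in one scope. The `id` enumerators and the two `call_with_*` helpers are made up here only so the result is observable.

```cpp
// Illustrative stand-ins for the classes in the question.
struct A1 { enum { id = 1 }; };
struct A2 { enum { id = 2 }; };

template <class C>
struct Super {
    // Returns the id of the instantiating type so the caller can tell
    // which instantiation was selected.
    int Test(C*) { return C::id; }
};

struct Super2 : Super<A1>, Super<A2> {
    // Without these two lines, t.Test(p) is rejected as ambiguous:
    // the name Test is found in two distinct base classes.
    using Super<A1>::Test;
    using Super<A2>::Test;
};

int call_with_a1()
{
    Super2 t;
    A1 a;
    return t.Test(&a);  // overload resolution now picks Super<A1>::Test
}

int call_with_a2()
{
    Super2 t;
    A2 a;
    return t.Test(&a);  // and here Super<A2>::Test
}
```

The explicit qualification at the call site keeps working as well; the using-declarations just move that choice into the class definition.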


Strange behavior with templates and G++

2011-02-17 Thread Pascal Francq
Hi,
While compiling the following code, I got an error :


template<class C> class Super
{
public:
Super(void) {}
void Test(C*) {}
};

class A
{
public:
A(void) {}
};

class A1 : public A
{
public:
A1(void) : A() {}
};

class A2 : public A
{
public:
A2(void) : A() {}
};

class Super2 : public Super<A1>, public Super<A2>
{
public:
Super2(void) {}
};

void Test(void)
{
Super2 T;
A1* ptr;
T.Test(ptr);
}


The compiler gives me the following error for the Test() function:
error: request for member ‘Test’ is ambiguous
error: candidates are:  void Super<C>::Test(C*) [with C = A2]
error:  void Super<C>::Test(C*) [with C = A1]
But here the call clearly refers to the second method. The error still appears
if A1 and A2 do not inherit from the same root class. If I replace the code with
an explicit call, it works:
T.Super<A1>::Test(ptr)
But this makes the code less clean.

Is this a problem related to a misunderstood concept from me, a wrong 
implementation or a technical problem of g++ ?

Thanks.
-- 

Dr. Ir. Pascal Francq
BELGIUM



%* in SPEC strings

2011-02-17 Thread Joern Rennecke

In the Specs Language documentation in gcc.c it says:

 %* substitute the variable part of a matched option.  (See below.)
Note that each comma in the substituted string is replaced by
a single space.

However, that is not quite what it does.  It actually appends a space
at the end.


For a port that we hope to contribute in the future, besides a Linux version,
we need to support multiple target board variants (a set that is still growing)
for a bare-metal newlib-based toolchain.  We want to do this by linking in
an extra board-specific file in STARTFILE_SPECS.
The board is specified as part of an -m option that also triggers common SPEC
adaptations for the newlib-based toolchain.
According to the documentation, we should be able to specify the file with
name-prefix%*.o .  However, because of the undocumented insertion of ' ',
this goes badly awry.

The emission of the extra ' ' goes back all the way to the dawn of RCS control
of GCC sources, so getting rid of it altogether is not likely to be easy.
However, most uses either have a whitespace character next - so the extra ' '
is redundant - or they have a '}' character next - in which case it seems
sensible to keep the ' ' for now.
There are two other cases in config/{,*/}*:
- LINK_SYSROOT_SPEC in config/darwin.h .  At the only place where
  LINK_SYSROOT_SPEC is used, a space follows.
- LINK_SPEC in config/m68k/uclinux.h .  The use is at one of the two
  alternative ends of LINK_SPEC; besides, the other end doesn't have a
  trailing space, either.

Thus, it seems safe to stop emitting the extra space if the next
character is not '}' .
Would a patch like the one below be considered when 4.7 phase 1 opens?

2011-02-16  Joern Rennecke  

	* gcc.c (do_spec_1) <%*>: Don't append a space unless the next
	character is '}'.

Index: gcc.c
===
--- gcc.c	(revision 1452)
+++ gcc.c	(working copy)
@@ -5868,7 +5868,12 @@ do_spec_1 (const char *spec, int inswitc
 	if (soft_matched_part)
 	  {
 		do_spec_1 (soft_matched_part, 1, NULL);
-		do_spec_1 (" ", 0, NULL);
+		/* ??? Emitting a space after soft_matched_part is
+		   undocumented and gets in the way of doing useful
+		   file name pasting; but for backward compatibility, we
+		   keep this behaviour when the next character is '}'.  */
+		if (p[1] == '}')
+		  do_spec_1 (" ", 0, NULL);
 	  }
 	else
 	  /* Catch the case where a spec string contains something like
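
To make the effect of the change concrete, here is a small stand-alone model of the %* substitution. It is not the code in gcc.c. The function name `expand_star`, the spec fragment "crt-%*.o" and the board name "boardx" are all made up for illustration, and the model neither interprets braces nor does the comma replacement mentioned in the documentation. It only shows the trailing space, before and after the proposed change.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of "%*" expansion, not taken from gcc.c.
//   spec    - a spec fragment that may contain "%*"
//   matched - the variable part of the matched option
//   patched - false: always append ' ' after the substitution (old behaviour)
//             true:  append ' ' only if the next character is '}' (proposal)
std::string expand_star(const std::string& spec,
                        const std::string& matched,
                        bool patched)
{
    std::string out;
    for (std::size_t i = 0; i < spec.size(); ++i) {
        if (spec[i] == '%' && i + 1 < spec.size() && spec[i + 1] == '*') {
            out += matched;
            const bool next_is_brace =
                i + 2 < spec.size() && spec[i + 2] == '}';
            if (!patched || next_is_brace)
                out += ' ';
            ++i;  // skip the '*'
        } else {
            out += spec[i];  // everything else is copied through unchanged
        }
    }
    return out;
}
```

In this model, expanding "crt-%*.o" with "boardx" yields "crt-boardx .o" under the old behaviour, which is the broken file name described above, and "crt-boardx.o" with the proposed change. A fragment where '}' follows %* gets the space in both cases.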


Re: x32 psABI draft version 0.2

2011-02-17 Thread Jan Beulich
>>> On 16.02.11 at 21:04, "H. Peter Anvin"  wrote:
> On 02/16/2011 11:22 AM, H.J. Lu wrote:
>> Hi,
>> 
>> I updated  x32 psABI draft to version 0.2 to change x32 library path
>> from lib32 to libx32 since lib32 is used for ia32 libraries on Debian,
>> Ubuntu and other derivative distributions. The new x32 psABI is
>> available from:
>> 
>> https://sites.google.com/site/x32abi/home 
>> 
> 
> I'm wondering if we should define a section header flag (sh_flags)
> and/or an ELF header flag (e_flags) for x32 for the people unhappy about
> keying it to the ELF class...

Thanks for supporting this!

Besides that I also wonder why all the 64-bit relocations get
marked as LP64-only. It is clear that some of them can be useful
in ILP32 as well, and there's no reason to preclude future uses
even if currently no-one can imagine any.

Furthermore, it seems questionable to continue to require rela
relocations when for all normal ones (leaving aside the 8- and 16-
bit ones) the addend can fit in the relocated field.
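
As a sketch of what "the addend can fit in the relocated field" means in practice, the two helpers below apply a simple symbol-plus-addend relocation to a 32-bit field, once in the rela style (addend passed in explicitly, as it would come from the relocation entry) and once in the rel style (addend read from the field itself). The function names are hypothetical, and the relocation modelled is just a generic S + A word32 case rather than any specific entry of the psABI tables.

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value from a possibly unaligned field.
inline std::uint32_t read32(const unsigned char* field)
{
    std::uint32_t v;
    std::memcpy(&v, field, sizeof v);
    return v;
}

// rela style: the addend is supplied separately; the previous contents of
// the field are ignored.
inline void apply_rela32(unsigned char* field,
                         std::uint32_t symbol, std::int32_t addend)
{
    const std::uint32_t value = symbol + static_cast<std::uint32_t>(addend);
    std::memcpy(field, &value, sizeof value);
}

// rel style: the addend is whatever the field holds before relocation.
// Applying it overwrites that addend, so the step cannot simply be repeated.
inline void apply_rel32(unsigned char* field, std::uint32_t symbol)
{
    const std::uint32_t value = symbol + read32(field);
    std::memcpy(field, &value, sizeof value);
}
```

Both forms produce the same result for a field as wide as the addend; the difference is where the addend is stored and whether it survives the relocation.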

Finally, shouldn't R_X86_64_GLOB_DAT and R_X86_64_JUMP_SLOT
also have a field specifier of wordclass rather than word64 (though
'wordclass' by itself would probably be wrong if the tying of the ABI
to the ELF class was eliminated)? And how about R_X86_64_*TP*64
and R_X86_64_TLSDESC?

Jan