Re: on define_peephole2

2005-07-21 Thread Liu Haibin
On 7/21/05, Liu Haibin <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> I have a problem with define_peephole2. In nios2.md, there's this
> define_insn:
> 
> (define_insn "addsi3"
>   [(set (match_operand:SI 0 "register_operand"  "=r,r")
> (plus:SI (match_operand:SI 1 "register_operand" "%r,r")
>  (match_operand:SI 2 "arith_operand" "r,I")))]
>   ""
>   "add%i2\\t%0, %1, %z2"
>   [(set_attr "type" "alu")])
> 
> I defined a peephole2 to replace this instruction.
> 
> (define_peephole2
>   [(set (match_operand:SI 0 "register_operand" "=r")
> (plus:SI (match_operand:SI 1 "register_operand" "%r")
> ;(match_operand:SI 2 "arith_operand" "r")))]
>  (match_operand:SI 2 "register_operand" "r")))]
>   ""
>   [(set (match_operand:SI 0 "register_operand" "=r")
> (unspec_volatile:SI [(match_operand:SI 4 "custom_insn_opcode" "N")

My mistake: this should be match_operand:SI 3 here. With that change the error is gone.

> (match_operand:SI 1 "register_operand" "r")
> (match_operand:SI 2 "register_operand" "r")] 
> CUSTOM_INII))]
>   "
> {
> operands[4] = const0_rtx;
> }")
> 
> Because the operand 2 in the replacing instruction must be a register,
> I changed the "arith_operand" to "register_operand", hoping that it
> only replaces something like, add r1, r2, r3 instead of addi r1, r2, 9
> 
> I did a test with a file, which contains
> 
> (insn/f 106 73 107 0 0x0 (set:SI (reg/f:SI 27 sp)
> (plus:SI (reg/f:SI 27 sp)
> (const_int -16 [0xfff0]))) -1 (nil)
> (nil))
> 
> and it seems that it did try to replace it with the new instruction. And
> I got the following error:
> 
> isqrt.c:65: error: unrecognizable insn:
> (insn 123 73 107 0 0x0 (set (reg/f:SI 27 sp)
> (unspec_volatile:SI [
> (const_int 0 [0x0])
> (reg/f:SI 27 sp)
> (const_int -16 [0xfff0])
> ] 117)) -1 (nil)
> (nil))
> isqrt.c:65: internal compiler error: in extract_insn, at recog.c:2175
> 
> Any ideas why it still tries to replace it even when it's obviously
> not a register (const_int -16)? Thanks.
> 
> 
> Regards,
> Timothy
>


Re: Function Inlining for FORTRAN

2005-07-21 Thread Paul Brook
> > The biggest problem is type consistency and aliasing. Consider the
> > following
>
> I have several FORTRAN 77 programs. After inlining the small functions in
> them by hand, they showed a great performance improvement. So I need a trial
> implementation of function inlining to verify its effectiveness.
>
> Now, my question is: if we just take the FORTRAN 77 syntax into account (no
> derived types, no complex aliasing), might it be simpler to implement
> function inlining for FORTRAN 77?

Maybe, but gfortran is a Fortran 95 compiler, so this is not an acceptable
solution.

Paul


Re: Function Inlining for FORTRAN

2005-07-21 Thread Canqun Yang
Paul Brook <[EMAIL PROTECTED]>:

> On Wednesday 20 July 2005 15:35, Canqun Yang wrote:
> > Hi, all
> >
> > Function inlining for FORTRAN programs always fails. 
> 
> Not entirely true. Inlining of contained procedures works fine (or it did
> last time I checked). This should include inlining of siblings within a module.
> 
> > If no one engages in it, I will give a try. Would you please give me
> > some clues? 
> 
> The problem is that each top level program unit (PU)[1] is compiled
> separately. Each PU has its own "external" decls for all function calls,
> even if the function happens to be in the same file. Thus each PU is an
> isolated self-contained tree structure, and the callgraph doesn't know the
> definition and declaration are actually the same thing.
>
> Basically what you need to do is parse the whole file, then start generating
> code.
>
> Unfortunately this isn't simple (or it would have been fixed already!).
> Unlike C, Fortran doesn't have file-level scope. It makes absolutely no
> difference whether two procedures are in the same file, or in different
> files.  You get all the problems that multifile IPA in C experiences within
> a single Fortran file.
>
> The biggest problem is type consistency and aliasing. Consider the following

I have several FORTRAN 77 programs. After inlining the small functions in them
by hand, they showed a great performance improvement. So I need a trial
implementation of function inlining to verify its effectiveness.

Now, my question is: if we just take the FORTRAN 77 syntax into account (no
derived types, no complex aliasing), might it be simpler to implement function
inlining for FORTRAN 77?

> 
> Paul
> 


Canqun Yang
Creative Compiler Research Group.
National University of Defense Technology, China.


Re: No download link from gcc.gnu.org

2005-07-21 Thread Russ Allbery
bhiksha <[EMAIL PROTECTED]> writes:

> I simply cannot find any direct link to a downloadable source/binary
> bundle for gcc4 from gcc.gnu.org.

I went to gcc.gnu.org, clicked on "Mirror sites" under Download, chose an
appropriate mirror for my region, clicked on "releases", and found gcc 4.0
and 4.0.1.

> The list of releases on the releases page ends at 3.4.4. Every other
> link I've chased down stops at 3.4.4.

Could you say exactly what pages you looked at?  It's hard to fix the
problem from the amount of information that you've given.

-- 
Russ Allbery ([EMAIL PROTECTED]) 


[C++ RFC] Debug info for anonymous aggregates

2005-07-21 Thread Devang Patel
C++ does not generate debug info for anonymous aggregates in cases like:

class A
{
public:
  typedef struct
  {
    int d;
  } mystruct;
  mystruct data;
};

This is because the FE sets the DECL_IGNORED_P bit. This causes the debug info
generator to skip debug info when invoked through rest_of_type_compilation().
The fix I am testing overnight is to reset the DECL_IGNORED_P bit when the real
name is assigned to anonymous aggregates and invoke debug_hooks again.

Is this the right approach? If yes, then based on gcc and gdb dejagnu results
I'll prepare an actual patch.

Thanks,
-
Devang

Index: decl.c
===
RCS file: /cvs/gcc/gcc/gcc/cp/decl.c,v
retrieving revision 1.1364.2.6
diff -Idpatel.pbxuser -c -3 -p -r1.1364.2.6 decl.c
*** decl.c  9 Jul 2005 22:07:00 -   1.1364.2.6
--- decl.c  22 Jul 2005 00:51:45 -
*** grokdeclarator (const cp_declarator *dec
*** 7706,7712 
  /* Replace the anonymous name with the real name everywhere.  */
  for (t = TYPE_MAIN_VARIANT (type); t; t = TYPE_NEXT_VARIANT (t))
if (TYPE_NAME (t) == oldname)
! TYPE_NAME (t) = decl;

  if (TYPE_LANG_SPECIFIC (type))
TYPE_WAS_ANONYMOUS (type) = 1;
--- 7706,7722 
  /* Replace the anonymous name with the real name everywhere.  */
  for (t = TYPE_MAIN_VARIANT (type); t; t = TYPE_NEXT_VARIANT (t))
if (TYPE_NAME (t) == oldname)
! {
!   TYPE_NAME (t) = decl;
!
!   /* Debug info was not generated earlier for anonymous aggregates.
!  Now is the time generate debug info for such types.  */
!   if (ANON_AGGRNAME_P (DECL_NAME(oldname)))
! {
!   DECL_IGNORED_P (TYPE_STUB_DECL (t)) = 0;
!   debug_hooks->type_decl (TYPE_STUB_DECL (t), LOCAL_CLASS_P (t));
! }
! }

  if (TYPE_LANG_SPECIFIC (type))
TYPE_WAS_ANONYMOUS (type) = 1;
Index: name-lookup.c
===
RCS file: /cvs/gcc/gcc/gcc/cp/name-lookup.c,v
retrieving revision 1.109.2.2
diff -Idpatel.pbxuser -c -3 -p -r1.109.2.2 name-lookup.c
*** name-lookup.c   9 Jul 2005 22:07:03 -   1.109.2.2
--- name-lookup.c   22 Jul 2005 00:51:45 -
*** pushtag (tree name, tree type, tag_scope
*** 4675,4681 
  else
d = pushdecl_with_scope (d, b);

! /* FIXME what if it gets a name from typedef?  */
  if (ANON_AGGRNAME_P (name))
DECL_IGNORED_P (d) = 1;

--- 4675,4682 
  else
d = pushdecl_with_scope (d, b);

! /* If it gets a name from typedef, reset DECL_IGNORED_P flag
!and invoke debug_hooks again.  */
  if (ANON_AGGRNAME_P (name))
DECL_IGNORED_P (d) = 1;




No download link from gcc.gnu.org

2005-07-21 Thread bhiksha

Hi,

I simply cannot find any direct link to a downloadable source/binary
bundle for gcc4 from gcc.gnu.org.

The list of releases on the releases page ends at 3.4.4. Every other
link I've chased down stops at 3.4.4.

4.0 can eventually be found, sans any documentation about what specific
files one must download, if one digs several links deep, from various
sites around the world, but the gcc maintainers themselves have not
linked it to the main web page.

This is sad and absurd. There must be a link to a download site.
Otherwise, there must be some explicit info on where to obtain it from.

I suspect that the GCC team simply did not consider it important enough
to put up the release in an easily accessible place.

Thanks.



RE: splitting load immediates using high and lo_sum

2005-07-21 Thread Tabony, Charles
> From: Dale Johannesen [mailto:[EMAIL PROTECTED]
> 
> On Jul 21, 2005, at 5:04 PM, Tabony, Charles wrote:
> 
> >> From: Dale Johannesen [mailto:[EMAIL PROTECTED]
> >>
> >> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
> >>
> >>> Hi,
> >>>
> >>> I am working on a port for a processor that has 32 bit registers
> >>> but can only load 16 bit immediates.
> >>>   ""
> >>>   "%0.h = #HI(%1)")
> >>
> >> What are the semantics of this?  Low bits zeroed, or untouched?
> >> If the former, your semantics are identical to Sparc; look at that.
> >
> > The low bits are untouched.  However, I would expect the compiler to
> > always follow setting the high bits with setting the low bits.
> 
> OK, if you're willing to accept that limitation (your architecture
> could handle putting the LO first, which Sparc can't) then Sparc is
> still a good model to look at.  What it does should work for you.

Aha!  I looked at the SPARC code and distilled it down to what I needed
and the difference is that it sets the mode of the high and lo_sum
expressions to the mode of operand 0, while I was setting it to the mode
of operand 1.  Now mine works great.

Thank you,
Charles J. Tabony



Re: RFA: Darwin x86 alignment

2005-07-21 Thread Richard Henderson
On Thu, Jul 21, 2005 at 05:21:58PM -0700, Dale Johannesen wrote:
> >Nah, you just remove it from target_flags, and control the two
> >new variables from ix86_handle_option.
> 
> OK.  Think that's the better approach?

*shrug*  It's not horrible, I guess.  It preserves existing
semantics when people use the switch; not that I'm a large
fan of switches like this that bork the abi.

My preferred solution is that you don't allow non-compiler
people to invent an ABI.  ;-)


r~


Re: RFA: Darwin x86 alignment

2005-07-21 Thread Dale Johannesen


On Jul 21, 2005, at 5:00 PM, Richard Henderson wrote:


On Thu, Jul 21, 2005 at 04:56:01PM -0700, Dale Johannesen wrote:

- Have flags work as now:  -malign-double makes both 8, -mno-align-double
  makes both 4.  Problem with that is the default is neither of these, and
  this doesn't fit neatly into gcc's model of two-valued flags; it's also
  a bit tricky to implement for the same reason.


Nah, you just remove it from target_flags, and control the two
new variables from ix86_handle_option.


OK.  Think that's the better approach?


Why do you want to make these sort of arbitrary changes to your
ABI?  I can't see what you win...


The compiler people are not driving this.

Of course, 4-byte alignment subjects you to a penalty for misaligned
loads and stores, and 8-byte alignment subjects you to a size penalty
for extra holes.   People have been making measurements about the
issue and this is what they've come up with; I don't know details.
What I wrote isn't necessarily the final change, either.



Re: splitting load immediates using high and lo_sum

2005-07-21 Thread Dale Johannesen


On Jul 21, 2005, at 5:04 PM, Tabony, Charles wrote:


From: Dale Johannesen [mailto:[EMAIL PROTECTED]

On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:


Hi,

I am working on a port for a processor that has 32 bit registers but can
only load 16 bit immediates.
  ""
  "%0.h = #HI(%1)")


What are the semantics of this?  Low bits zeroed, or untouched?
If the former, your semantics are identical to Sparc; look at that.


The low bits are untouched.  However, I would expect the compiler to
always follow setting the high bits with setting the low bits.


OK, if you're willing to accept that limitation (your architecture could
handle putting the LO first, which Sparc can't) then Sparc is still a
good model to look at.  What it does should work for you.



RE: splitting load immediates using high and lo_sum

2005-07-21 Thread Tabony, Charles
> From: Dale Johannesen [mailto:[EMAIL PROTECTED]
> 
> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
> 
> > Hi,
> >
> > I am working on a port for a processor that has 32 bit registers but
> > can only load 16 bit immediates.
> >   ""
> >   "%0.h = #HI(%1)")
> 
> What are the semantics of this?  Low bits zeroed, or untouched?
> If the former, your semantics are identical to Sparc; look at that.

The low bits are untouched.  However, I would expect the compiler to
always follow setting the high bits with setting the low bits.  The
point of splitting them is that I want the insn setting the high bits to
be scheduled in a vliw packet with the preceding insns and the insn
setting the low bits to be scheduled with the following insns whenever
possible.  The processor cannot execute both instructions in the same
cycle.
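
For readers following the thread, the register semantics described above can be modelled in a few lines of C++. This is only a sketch under stated assumptions: the names set_high_half, set_low_half and load_immediate32 are hypothetical and do not come from any GCC port, and it assumes that "%0.h = #HI(%1)" writes bits 31..16 while leaving bits 15..0 alone, with "%0.l = #LO(%1)" doing the reverse.

```cpp
#include <cassert>
#include <cstdint>

// Model of "%0.h = #HI(imm)": replace the upper 16 bits of reg and
// leave the lower 16 bits untouched (assumed semantics).
std::uint32_t set_high_half(std::uint32_t reg, std::uint32_t imm)
{
  return (reg & 0x0000FFFFu) | (imm & 0xFFFF0000u);
}

// Model of "%0.l = #LO(imm)": replace the lower 16 bits of reg and
// leave the upper 16 bits untouched (assumed semantics).
std::uint32_t set_low_half(std::uint32_t reg, std::uint32_t imm)
{
  return (reg & 0xFFFF0000u) | (imm & 0x0000FFFFu);
}

// Two-step load of a full 32-bit immediate: high half first, then low.
std::uint32_t load_immediate32(std::uint32_t reg, std::uint32_t imm)
{
  std::uint32_t result = set_low_half(set_high_half(reg, imm), imm);
  // Sanity check: after both halves are written, nothing of the old
  // register contents remains.
  assert(result == imm);
  return result;
}
```

Because each step in this model preserves the half it does not write, the two steps commute, which matches the remark elsewhere in the thread that this architecture could also set the low half first.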

-Charles



Re: RFA: Darwin x86 alignment

2005-07-21 Thread Richard Henderson
On Thu, Jul 21, 2005 at 04:56:01PM -0700, Dale Johannesen wrote:
> - Have flags work as now:  -malign-double makes both 8, -mno-align-double
>   makes both 4.  Problem with that is the default is neither of these, and
>   this doesn't fit neatly into gcc's model of two-valued flags; it's also
>   a bit tricky to implement for the same reason.

Nah, you just remove it from target_flags, and control the two
new variables from ix86_handle_option.

Why do you want to make these sort of arbitrary changes to your
ABI?  I can't see what you win...


r~


RFA: Darwin x86 alignment

2005-07-21 Thread Dale Johannesen

On x86 currently the alignments of double and long long are linked:
they are either 4 or 8 depending on whether -malign-double is set.
This follows the documentation of -malign-double.  But it's wrong for
what we want the Darwin ABI to be:  the default should be that double
is 4 bytes and long long is 8 bytes.

So I can do that, but what should -malign-double do?
- Control double but not long long; add -malign-long-long (at least if
   somebody asks for it; probably it wouldn't be used)
- Have flags work as now:  -malign-double makes both 8, -mno-align-double
  makes both 4.  Problem with that is the default is neither of these, and
  this doesn't fit neatly into gcc's model of two-valued flags; it's also
  a bit tricky to implement for the same reason.
- something else?

thanks.



Re: splitting load immediates using high and lo_sum

2005-07-21 Thread Dale Johannesen


On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:


Hi,

I am working on a port for a processor that has 32 bit registers but can
only load 16 bit immediates.
  ""
  "%0.h = #HI(%1)")


What are the semantics of this?  Low bits zeroed, or untouched?
If the former, your semantics are identical to Sparc; look at that.



gcc-4.0-20050721 is now available

2005-07-21 Thread gccadmin
Snapshot gcc-4.0-20050721 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.0-20050721/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.0 CVS branch
with the following options: -rgcc-ss-4_0-20050721 

You'll find:

gcc-4.0-20050721.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.0-20050721.tar.bz2       C front end and core compiler
gcc-ada-4.0-20050721.tar.bz2        Ada front end and runtime
gcc-fortran-4.0-20050721.tar.bz2    Fortran front end and runtime
gcc-g++-4.0-20050721.tar.bz2        C++ front end and runtime
gcc-java-4.0-20050721.tar.bz2       Java front end and runtime
gcc-objc-4.0-20050721.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.0-20050721.tar.bz2  The GCC testsuite

Diffs from 4.0-20050714 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.0
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


splitting load immediates using high and lo_sum

2005-07-21 Thread Tabony, Charles
Hi,

I am working on a port for a processor that has 32 bit registers but can
only load 16 bit immediates.  I have tried several ways to split moves
with larger immediates into two RTL insns.  One is using a
define_expand:

-code---
(define_expand "movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "")
        (match_operand:SI 1 "general_operand" ""))]
  ""
  {
    if (GET_CODE(operands[0]) != REG) {
      operands[1] = force_reg(SImode, operands[1]);
    }
    else if(s17to32_const_int_operand(operands[1],
                                      GET_MODE(operands[1]))){
      emit_move_insn(operands[0],
                     gen_rtx_HIGH(GET_MODE(operands[1]), operands[1]));
      emit_move_insn(operands[0],
                     gen_rtx_LO_SUM(GET_MODE(operands[1]),
                                    operands[0], operands[1]));
      DONE;
    }
  })
/code---

With the corresponding define_insns:

-code---
(define_insn "movsi_high"
  [(set (match_operand:SI 0 "register_operand" "=r")
(high:SI (match_operand:SI 1 "immediate_operand" "i")))]
  ""
  "%0.h = #HI(%1)")

(define_insn "movsi_lo_sum"
  [(set (match_operand:SI 0 "register_operand" "+r")
(lo_sum:SI (match_dup 0)
   (match_operand:SI 1 "immediate_operand" "i")))]
  ""
  "%0.l = #LO(%1)")
/code---

but using this method, I get the following error:

-error---
./libgcc2.c:470: error: unrecognizable insn:
(insn 100 99 86 0 ./libgcc2.c:464 (set (reg:SI 10 r10)
(lo_sum (reg:SI 10 r10)
(const_int 65536 [0x1]))) -1 (nil)
(nil))
./libgcc2.c:470: internal compiler error: in extract_insn, at
recog.c:2083
/error---

Why would that RTL not match my movsi_lo_sum define_insn?

I also tried using a define_split:

-code---
(define_split
  [(set (match_operand:SI 0 "register_operand" "")
(match_operand:SI 1 "s17to32_const_int_operand" ""))]
  "reload_completed"
  [(set (match_dup 0)
(high:SI (match_dup 1)))
   (set (match_dup 0)
(lo_sum:SI (match_dup 0)
(match_dup 1)))]
  "")
/code---

along with the same define_insns, but then I get the following error:

-error---
./crtstuff.c:288: error: insn does not satisfy its constraints:
(insn 103 12 11 0 (set (reg:SI 0 r0)
(symbol_ref/u:SI ("*.LC0") [flags 0x2])) 18 {movsi_real} (nil)
(nil))
./crtstuff.c:288: internal compiler error: in
reload_cse_simplify_operands, at postreload.c:378
/error---

In other words, that RTL never matches my define_split, even though I
placed it before the more general movsi define_insn and
s17to32_const_int_operand should return 1 for a symbol_ref.

Do you have any idea why neither of these attempts works?  Which
method do you think is better?  In case you were wondering, here is the
code for s17to32_const_int_operand.  I modified the function
int_2word_operand from the frv port.

-code---
int s17to32_const_int_operand(rtx op, enum machine_mode mode ATTRIBUTE_UNUSED)
{
  HOST_WIDE_INT value;
  REAL_VALUE_TYPE rv;
  long l;

  switch (GET_CODE (op))
    {
    default:
      break;

    case LABEL_REF:
    case SYMBOL_REF:
    case CONST:
      return 1;

    case CONST_INT:
      return ! IN_RANGE_P (INTVAL (op), -0x8000, 0x7FFF);

    case CONST_DOUBLE:
      if (GET_MODE (op) == SFmode)
        {
          REAL_VALUE_FROM_CONST_DOUBLE (rv, op);
          REAL_VALUE_TO_TARGET_SINGLE (rv, l);
          value = l;
          return ! IN_RANGE_P (value, -0x8000, 0x7FFF);
        }
      else if (GET_MODE (op) == VOIDmode)
        {
          value = CONST_DOUBLE_LOW (op);
          return ! IN_RANGE_P (value, -0x8000, 0x7FFF);
        }
      break;
    }

  return 0;
}
/code---
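
The integer part of the predicate above boils down to a range test, which can be tried outside the compiler. The following is a minimal stand-alone sketch, not code from any port: in_range and needs_two_insn_load are hypothetical names, and in_range is assumed to behave like IN_RANGE_P, i.e. an inclusive check of low <= value <= high.

```cpp
#include <cassert>

// Inclusive range check, assumed to mirror what IN_RANGE_P does.
bool in_range(long long value, long long low, long long high)
{
  assert(low <= high);  // guard against swapped bounds
  return value >= low && value <= high;
}

// True when an integer constant does not fit a signed 16-bit immediate
// and would therefore have to be loaded as a high/lo_sum pair.
bool needs_two_insn_load(long long value)
{
  return !in_range(value, -0x8000, 0x7FFF);
}
```

With this model, needs_two_insn_load(65536) is true (the constant from the unrecognizable-insn dump), while the boundary values -0x8000 and 0x7FFF are still treated as single-instruction loads.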

Thank you,
Charles J. Tabony



Re: PR22336 (was: Problem with tree-ssa-loop-ivopts.c:get_computation-cost)

2005-07-21 Thread Richard Henderson
On Tue, Jul 12, 2005 at 12:02:46PM +0200, Laurent GUERBY wrote:
>   PR tree-optimization/22336
>   * function.c (record_block_change): Check for 
>   cfun->ib_boundaries_block.

Ok.  I don't see that we're going to get anything cleaner for 4.1.


r~


Problem compiling libstdc++ is current 4.0.2 cvs (volatile strikes again)

2005-07-21 Thread Kean Johnston

Hi all,

I hope someone can help me. I am C++ impaired, and I am getting
the following error when trying to bootstrap the current 4.0.2
CVS. The error is coming from include/ext/bitmap_allocator.h
line 111. The relevant code snippet is:

class _Mutex {
  __gthread_mutex_t _M_mut;

  // Prevent Copying and assignment.
  _Mutex(_Mutex const&);
  _Mutex& operator=(_Mutex const&);

 public:
  _Mutex()
  {
    if (__threads_enabled)
      {
#if !defined __GTHREAD_MUTEX_INIT
        _GTHREAD_MUTEX_INIT_FUNCTION(&_M_mut);
#else
        __gthread_mutex_t __mtemp = __GTHREAD_MUTEX_INIT;
        _M_mut = __mtemp; // THIS CAUSES THE ERROR
#endif
      }
  }

I get the following error message from the compiler:
error: no match for 'operator=' in '((__gnu_cxx::_Mutex)this)->__gnu_cxx::_Mutex::_M_Mut = __mtemp'
*/gcc/include/sys/types.h:678: note: candidates are: __pthread_mutex& __pthread_mutex::operator=(const __pthread_mutex&)


The contents of sys/types.h at that location are:
typedef volatile struct __pthread_mutex {
   mutex_t              __pt_mutex_mutex;
   pid_t                __pt_mutex_pid;
   thread_t             __pt_mutex_owner;
   int                  __pt_mutex_depth;
   pthread_mutexattr_t  __pt_mutex_attr;
} pthread_mutex_t;

If I remove the 'volatile' keyword, then everything just works.
So, do I adjust fixincludes to remove the 'volatile' keyword,
or is this some weird side effect of the recent discussions on
volatile (which I didn't read)?

Any help / advice appreciated.

Oh PS ... if I change that from a simple assignment to:
  __builtin_memcpy((void *)&_M_mut, (const void *)&__mtemp, 
sizeof(__gthread_mutex_t));

Then it also just works. I could of course adjust the header file
to do that for the platform.
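
The failure can be reproduced without any system headers. In C++ the implicitly declared copy assignment operator has the form T& operator=(const T&) and is not volatile-qualified, so it can neither be called on a volatile object nor take a volatile object as its argument. The sketch below uses a hypothetical stand-in type, demo_mutex_t, which is not the real pthread_mutex_t, and shows a member-wise copy as one possible alternative to the memcpy workaround; it is not a claim about how libstdc++ should be fixed.

```cpp
#include <cassert>

// Hypothetical stand-in for a header that bakes 'volatile' into the typedef.
typedef volatile struct demo_mutex {
  int owner;
  int depth;
} demo_mutex_t;

// With the typedef above, this does not compile in C++:
//   demo_mutex_t a, b;
//   a = b;   // error: no match for 'operator='

// Member-wise copy: each scalar member is assigned individually, which is
// well-formed for volatile-qualified scalars.
void copy_demo_mutex(demo_mutex_t &dst, const demo_mutex_t &src)
{
  assert(&dst != &src);  // copying an object onto itself is pointless here
  dst.owner = src.owner;
  dst.depth = src.depth;
}
```

A call such as copy_demo_mutex(a, b) then takes the place of the rejected a = b.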

Kean


Re: -fprofile-generate and -fprofile-use

2005-07-21 Thread girish vaitheeswaran
I am using -O3. This is the only flag apart from the
profile flag -fprofile-use.

I had independently tried -march=pentium4 and that did
not buy any performance for this app.

-girish

--- Kelley Cook <[EMAIL PROTECTED]> wrote:

> > I started with a clean slate in my build environment
> > and did not have any residual files hanging around.
> > Are the steps I have indicated in my earlier email
> > correct. Is there a way I can break down the problem
> > into a smaller sub-set of flags and eliminate the flag
> > causing the performance problem. What I mean is since
> > -fprofile-generate and -fprofile-use enable a bunch of
> > flags, would it make sense to avoid profiling and try
> > out some of the individual flags on a trial and error
> > basis. If so what would be the flags to start the
> > trials with.
> > 
> > -girish
> 
> Before we go any farther, are you sure that you are
> also turning on optimization with -fprofile-generate
> and -fprofile-use?
> 
> In other words, you aren't just using "gcc
> -fprofile-generate xxx.c" to create your object
> files are you?
> 
> You need to use something like "gcc -O2
> -march=pentium4 -fprofile-generate" as unoptimized
> profiles are pretty pointless.
> 
> Instead of general terms, specific examples would
> help a lot.  Like a link to your code that is having
> problems.
> 
> Kelley Cook
> 
> 



Re: Add clog10 to builtins.def, round 2

2005-07-21 Thread Richard Henderson
On Tue, Jul 19, 2005 at 10:46:34AM +0200, FX Coudert wrote:
>   * builtins.def: Add DEF_EXT_C99RES_BUILTIN to define builtins
>   that C99 reserve for future use. Use it to define clog10, clog10f
>   and clog10l.

Ok.


r~


Re: -malign-double vs __alignof__(double)

2005-07-21 Thread Richard Henderson
On Wed, Jul 20, 2005 at 06:26:03PM -0700, Dale Johannesen wrote:
> alignof doc has so many qualifications I'm not sure exactly what it's 
> supposed to do.

__alignof__(double) == 8 on x86, regardless of command line
switches, because 8 is the *preferred* alignment of the type.

That's the weasel word in the docs.  The implementation is ok.
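
The distinction between the preferred alignment and the alignment actually used inside a struct can be observed directly. This is a small sketch, assuming a compiler that supports the GNU __alignof__ extension; double_probe and the two helper functions are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// A char followed by a double: the offset of 'value' equals the alignment
// the ABI actually gives a double field.
struct double_probe {
  char pad;
  double value;
};

// Preferred alignment of the type, as reported by __alignof__.
std::size_t preferred_double_alignment()
{
  return __alignof__(double);
}

// Alignment used for a double member inside a struct.
std::size_t double_field_alignment()
{
  std::size_t offset = offsetof(double_probe, value);
  assert(offset > 0);  // 'pad' always occupies at least one byte
  return offset;
}
```

On 32-bit x86 without -malign-double one would expect the first function to report 8 and the second 4, in line with the explanation above; on targets where double is always 8-byte aligned both should agree.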



r~


Re: PING [4.1 regression, patch] build i686-pc-mingw32

2005-07-21 Thread Ross Ridge
Ross Ridge wrote:
> I don't see how the existence of configure changes the fact the GCC
> compiler driver exists,
 
DJ Delorie wrote:
> At the time you're running configure, the gcc driver does *not* exist,
> but you *do* need to run as and ld to test what features they support,
> information which is needed in order to *build* gcc.

I don't see the relevance to the problem at hand.  The Makefile that contains
the current hack that's causing the problem doesn't exist at configure
time either.  If your proposed solution of creating new programs to
execute as and ld were implemented then these new programs would also
not exist at configure time.  It's not a problem in configure that's
causing the bootstrap failure, it's a bug in the Makefile.

Ross Ridge



Re: -fprofile-generate and -fprofile-use

2005-07-21 Thread Kelley Cook
I started with a clean slate in my build environment 
and did not have any residual files hanging around. 
Are the steps I have indicated in my earlier email 
correct. Is there a way I can break down the problem 
into a smaller sub-set of flags and eliminate the flag 
causing the performance problem. What I mean is since 
-fprofile-generate and -fprofile-use enable a bunch of 
flags, would it make sense to avoid profiling and try 
out some of the individual flags on a trial and error 
basis. If so what would be the flags to start the 
trials with.


-girish


Before we go any farther, are you sure that you are also turning on 
optimization with -fprofile-generate and -fprofile-use?

In other words, you aren't just using "gcc -fprofile-generate xxx.c" to create 
your object files are you?

You need to use something like "gcc -O2 -march=pentium4 -fprofile-generate" as 
unoptimized profiles are pretty pointless.

Instead of general terms, specific examples would help a lot.  Like a link to 
your code that is having problems.

Kelley Cook



Re: PING [4.1 regression, patch] build i686-pc-mingw32

2005-07-21 Thread DJ Delorie

> I don't see how the existence of configure changes the fact the GCC
> compiler driver exists,

At the time you're running configure, the gcc driver does *not* exist,
but you *do* need to run as and ld to test what features they support,
information which is needed in order to *build* gcc.

It's a chicken and egg type problem.


Re: PING [4.1 regression, patch] build i686-pc-mingw32

2005-07-21 Thread Ross Ridge
Ross Ridge wrote:
> You already have a not-so-small C program that's supposed to know
> where as and ld are.

DJ Delorie wrote:
> You're forgetting about configure.

I don't see how the existence of configure changes the fact the GCC
compiler driver exists, is capable of running as and ld, and is
supposed to know where they are.  It even does PATH-like searches.
Why not just fix it so it runs the correct version of as and ld directly
during the bootstrap process?  Add a "--use-bootstrap-binutils" flag
or a "--with-ld=" flag to the compiler driver, that way the newly
built compiler driver can directly execute the version of ld and as the
current makefile hack is trying to force it into running.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  [EMAIL PROTECTED]
-()-/()/  http://www.csclub.uwaterloo.ca/u/rridge/ 
 db  //   


Re: Building mips cross compiler on mingw

2005-07-21 Thread Dave Murphy

Dave Korn wrote:

> > I've been having some trouble building gcc 4.0.1 for mips target on a
> > mingw host
>
>  No you aren't.  You're using a modified version of the gcc-4.0.1 sources
> and you're targeting PSP.  That may be a MIPS backend, but it's a different
> _target_.

:) fair enough, the patches are currently minimal for gcc though.

>  Hmm.  Perhaps we have HOST_WIDE_INT problems for mingw host here?  If
> cfun->machine->frame.var_size was a long long, and HOST_WIDE_INT_PRINT_DEC
> for mingw is just "%d" or "%ld" rather than "%lld", that would push an extra
> NULL word onto the stack that would be taken as the parameter for %s because
> the "%d" wouldn't be advancing the varargs pointer past the whole of the
> second format arg...

And that's exactly the problem.

In gcc/config/i386/xm-mingw32.h we have

/* MSVCRT does not support the "ll" format specifier for printing
  "long long" values.  Instead, we use "I64".  */
#define HOST_LONG_LONG_FORMAT "I64"

then in gcc/hwint.h we have

/* The string that should be inserted into a printf style format to
  indicate a "long long" operand.  */
#ifndef HOST_LONG_LONG_FORMAT
#define HOST_LONG_LONG_FORMAT "ll"
#endif


and later in the same file

#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG
# define HOST_WIDE_INT_PRINT "l"
# define HOST_WIDE_INT_PRINT_C "L"
 /* 'long' might be 32 or 64 bits, and the number of leading zeroes
must be tweaked accordingly.  */
# if HOST_BITS_PER_WIDE_INT == 64
#  define HOST_WIDE_INT_PRINT_DOUBLE_HEX "0x%lx%016lx"
# else
#  define HOST_WIDE_INT_PRINT_DOUBLE_HEX "0x%lx%08lx"
# endif
#else
# define HOST_WIDE_INT_PRINT "ll"
# define HOST_WIDE_INT_PRINT_C "LL"
 /* We can assume that 'long long' is at least 64 bits.  */
# define HOST_WIDE_INT_PRINT_DOUBLE_HEX \
   "0x%" HOST_LONG_LONG_FORMAT "x%016" HOST_LONG_LONG_FORMAT "x"
#endif /* HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_LONG */

So HOST_WIDE_INT_PRINT ignores the mingw override.

I've currently patched it as follows:

diff -Naurb gcc-4.0.1/gcc/hwint.h gcc-4.0.1-new/gcc/hwint.h
--- gcc-4.0.1/gcc/hwint.h	Wed Nov 24 04:31:57 2004
+++ gcc-4.0.1-new/gcc/hwint.h	Thu Jul 21 14:37:06 2005
@@ -80,7 +80,7 @@
#  define HOST_WIDE_INT_PRINT_DOUBLE_HEX "0x%lx%08lx"
# endif
#else
-# define HOST_WIDE_INT_PRINT "ll"
+# define HOST_WIDE_INT_PRINT HOST_LONG_LONG_FORMAT
# define HOST_WIDE_INT_PRINT_C "LL"
  /* We can assume that 'long long' is at least 64 bits.  */
# define HOST_WIDE_INT_PRINT_DOUBLE_HEX \

This works for me and allows the build to complete. I'm not currently 
sure if HOST_WIDE_INT_PRINT_C needs similar treatment.
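
The mechanism behind the patch is ordinary string-literal concatenation: the length modifier is kept in one overridable macro and every format string is built from it. The sketch below illustrates this with hypothetical DEMO_* macros that only imitate the hwint.h names; it defaults to "ll", and a host whose C runtime needs a different modifier (such as the "I64" mentioned above) would predefine DEMO_LONG_LONG_FORMAT instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Host override point: a host header may define this before we get here.
#ifndef DEMO_LONG_LONG_FORMAT
#define DEMO_LONG_LONG_FORMAT "ll"
#endif

// Derive the wide-int modifier from the override instead of hard-coding "ll".
#define DEMO_WIDE_INT_PRINT DEMO_LONG_LONG_FORMAT
#define DEMO_WIDE_INT_PRINT_DEC "%" DEMO_WIDE_INT_PRINT "d"

// Format a long long using the macro-built format string.
std::string format_wide_int(long long value)
{
  char buf[32];
  int n = std::snprintf(buf, sizeof buf, DEMO_WIDE_INT_PRINT_DEC, value);
  assert(n > 0 && n < static_cast<int>(sizeof buf));  // no truncation
  return std::string(buf, static_cast<std::size_t>(n));
}
```

If the modifier in the format string did not match the width of the argument, the varargs pointer would advance by the wrong amount, which is the kind of failure described earlier in this message.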


I've created PR22594 - http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22594

Many thanks.

Dave


The Linux binutils 2.16.91.0.2 is released

2005-07-21 Thread H. J. Lu
This is the beta release of binutils 2.16.91.0.2 for Linux, which is
based on binutils 2005 0720 in CVS on sources.redhat.com plus various
changes. It is purely for Linux.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

movl (%eax),%ds
movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

mov (%eax),%ds
mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

movw (%eax),%ds
movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler is now defaulted to tune for Itanium 2 processors.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.16.91.0.2 to [EMAIL PROTECTED]

and

http://www.sourceware.org/bugzilla/

If you don't use

# rpmbuild -ta binutils-xx.xx.xx.xx.xx.tar.bz2

to compile the Linux binutils, please read patches/README in source
tree to apply Linux patches if there are any.

Changes from binutils 2.16.91.0.1:

1. Update from binutils 2005 0720.
2. Add Intel VMX support.
3. Add AMD SVME support.
4. Add x86-64 new relocations for medium model.
5. Fix a PIE regression (PR 975).
6. Fix an x86_64 signed 32bit displacement regression.
7. Fix PPC PLT (PR 1004). 
8. Improve empty section removal.

Changes from binutils 2.16.90.0.3:

1. Update from binutils 2005 0622.
2. Fix a linker versioning bug exposed by gcc 4 (PR 1022/1023/1025).
3. Optimize ia64 br->brl relaxation (PR 834).
4. Improve linker empty section removal.
5. Fix DWARF 2 line number reporting (PR 990).
6. Fix DWARF 2 line number reporting regression on assembly file (PR
1000).

Changes from binutils 2.16.90.0.2:

1. Update from binutils 2005 0510.
2. Update ia64 assembler to support comdat group section generated by
gcc 4 (PR 940).
3. Fix a linker crash on bad input (PR 939).
4. Fix a sh64 assembler regression (PR 936).
5. Support linker script on executable (PR 882).
6. Fix the linker -pie regression (PR 878).
7. Fix an x86_64 disassembler bug (PR 843).
8. Fix a PPC linker regression.
9. Misc speed up.

Changes from binutils 2.16.90.0.1:

1. Update from binutils 2005 0429.
2. Fix an ELF linker regression (PR 815).
3. Fix an empty section removal related bug.
4. Fix an ia64 linker regression (PR 855).
5. Don't allow local symbol to be equated common/undefined symbols (PR
857).
6. Fix the ia64 linker to handle local dynamic symbol error reporting.
7. Make non-debugging reference to discarded section an error (PR 858).
8. Support Sparc/TLS.
9. Support rpm build with newer rpm.
10. Fix an alpha linker regression.
11. Fix the non-gcc build regression.

Changes from binutils 2.15.94.0.2.2:

1. Update from binutils 2005 0408.
2. The i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location.
3. The x86_64 assembler now allows movq between a segment register and
a 64bit general purpose register.
4. 20x Speed up linker for input files with >64K sections.
5. Properly report ia64 linker relaxation failures.
6. Support tuning ia64 assembler for Itanium 2 processors.
7. Linker will remove empty unused output sections.
8. Add -N to readelf to display full section names.
9. Fix the ia64 linker to support linkonce text sections without unwind
sections.
10. More unwind directive checkings in the ia64 assembler.
11. Speed up linker with wildcard handling.
12. Fix readelf to properly dump .debug_ranges and .debug_loc sections.

Changes from binutils 2.15.94.0.2:

1. Fix greater than 64K section support in linker.
2. Properly handle i386 and x86_64 protected symbols in linker.
3. Fix readelf for LEB128 on 64bit hosts.
4. Speed up readelf for section group process.
5. Include ia64 texinfo pages.
6. Change ia64 assembler to check hint.b for Montecito.
7. Improve relaxation failure report in ia64 linker.
8. Fix ia64 linker to allow relax backward branch in the same section.

Changes from binutils 2.15.94.0.1:

1. Update from binutils 2004 1220.
2. Fix strip for TLS symbol references.

Changes from binutils 2.15.92.0.2:

1. Update from binutils 2004 1121.
2. Put ia64 .ctors/.dtors sections next to small data section for
Intel ia64 compiler.
3. Fix -Bdynamic/-Bstatic handling for linker script.
4. Provide more information on relocation overflow.
5. Add --sort-section to linker.
6. Support icc 8.1 unwind info in readelf.
7. Fix the infinite loop bug on bad input in the ia64 assembler.
8.

Re: PING [4.1 regression, patch] build i686-pc-mingw32

2005-07-21 Thread DJ Delorie

> You already have a not-so-small C program that's supposed to know
> where as and ld are.

You're forgetting about configure.


Re: Merged CVS repository of gcc and old-gcc

2005-07-21 Thread Joseph S. Myers
On Thu, 21 Jul 2005, Daniel Berlin wrote:

> > What will happen to the (revision number based) hyperlinks to patches
> > in Bugzilla and the gcc-cvs mailing list archive like the following:
> > 
> > http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/reg-stack.c.diff?cvsroot=gcc&r1=1.188&r2=1.189
> 
> > Will they still point to something useful?
> 
> We will keep a read-only version of the cvs repository around along with
> cvsweb so that the links still work.

And so people can continue to run "cvs diff" on existing modified CVS 
trees.  I.e. the changes which affect revision numbers will be made on a 
separate copy of the repository which is only used to be fed into the 
conversion; the copy available through cvsweb and existing CVS trees keeps 
its present structure with separate old-gcc.

-- 
Joseph S. Myers   http://www.srcf.ucam.org/~jsm28/gcc/
[EMAIL PROTECTED] (personal mail)
[EMAIL PROTECTED] (CodeSourcery mail)
[EMAIL PROTECTED] (Bugzilla assignments and CCs)


Re: -fprofile-generate and -fprofile-use

2005-07-21 Thread Jan Hubicka
> I started with a clean slate in my build environment
> and did not have any residual files hanging around.
> Are the steps I have indicated in my earlier email
> correct. Is there a way I can break down the problem
> into a smaller sub-set of flags and eliminate the flag
> causing the performance problem. What I mean is since
> -fprofile-generate and -fprofile-use enable a bunch of
> flags, would it make sense to avoid profiling and try
> out some of the individual flags on a trial and error
> basis. If so what would be the flags to start the
It would probably be better to keep -fprofile-use and just turn off the
individual optimizations it implies, one at a time (for optimizations
that are implied by this flag there should be no need to re-profile
each time). If you can find the particular optimization that gets out
of control, it would be a lot easier to fix it...

Honza
> trials with.
> 
> -girish 
> 
> --- Jan Hubicka <[EMAIL PROTECTED]> wrote:
> 
> > > On Wed, Jul 20, 2005 at 10:45:01AM -0700, girish
> > vaitheeswaran wrote:
> > > > > --- Steven Bosscher <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > On Wednesday 20 July 2005 18:53, girish
> > vaitheeswaran wrote:
> > > > > > > I am seeing a 20% slowdown with feedback
> > optimization.
> > > > > > > Does anyone have any thoughts on this.
> > > > > > 
> > > > > > My first thought is that you should probably
> > first
> > > > > > tell what compiler
> > > > > > you are using.
> > > >
> > > > I am using gcc 3.4.3
> > > > -girish
> > > 
> > > Which platform?  I've seen slower code for
> > profile-directed optimizations
> > > on powerpc64-linux with GCC 4.0 and mainline. 
> > It's a bug, but I haven't
> > > looked into it enough to provide a small test case
> > for a problem report.
> > 
> > Actually I would be very interested in seeing
> > testcases such as those.
> > (and Girish's slowdown too if possible).  In
> > general some slowdowns
> > in corner cases are probably unavoidable but both
> > 3.4.3 and 4.0 seem to
> > have pretty consistent improvements with profiling
> > at least for SPEC and
> > i386 I am testing pretty regularly.
> > Such slowdowns usually indicate problems like
> > incorrectly updated profile
> > or incorrectly read-in profile because of
> > mismatch in CFGs in between
> > profile and feedback run that are rather difficult to
> > notice and hunt
> > down...
> > 
> > Honza
> > > 
> > > Janis
> > 


Re: Merged CVS repository of gcc and old-gcc

2005-07-21 Thread Daniel Berlin
On Thu, 2005-07-21 at 14:51 +0200, Volker Reichelt wrote:
> Ian Lance Taylor wrote in http://gcc.gnu.org/ml/gcc/2005-07/msg00625.html:
> 
> > In preparation for the future transition to subversion, I've written
> > some code to merge the old-gcc repository into current mainline.  I
> > would like to see this merged repository used as the basis for the
> > conversion to subversion.  The advantage is that it provides revision
> > history back to 1992, when the gcc sources were first put into a
> > source code control system.  (At the time, it was RCS.  Before 1992
> > the source code control system was emacs numbered backup files.)
> > 
> > Since I just wrote this code, I'd like any feedback that people care
> > to give on the correctness and usability of the generated repository.
> > People with SSH access to sourceware should be able to access the
> > temporary merged repository by doing
> > cvs -d :ext:gcc.gnu.org:/pool/ian/repo co gcc
> 
> [snip]
> 
> > By the way, in case anybody asks, I will not be doing this merge
> > before the subversion conversion, because it changes all the CVS
> > revision numbers and thus breaks all existing working directories.
> 
> What will happen to the (revision number based) hyperlinks to patches
> in Bugzilla and the gcc-cvs mailing list archive like the following:
> 
> http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/reg-stack.c.diff?cvsroot=gcc&r1=1.188&r2=1.189

> Will they still point to something useful?

We will keep a read-only version of the cvs repository around along with
cvsweb so that the links still work.

At least, that is the current plan.




Re: Merged CVS repository of gcc and old-gcc

2005-07-21 Thread Volker Reichelt
Ian Lance Taylor wrote in http://gcc.gnu.org/ml/gcc/2005-07/msg00625.html:

> In preparation for the future transition to subversion, I've written
> some code to merge the old-gcc repository into current mainline.  I
> would like to see this merged repository used as the basis for the
> conversion to subversion.  The advantage is that it provides revision
> history back to 1992, when the gcc sources were first put into a
> source code control system.  (At the time, it was RCS.  Before 1992
> the source code control system was emacs numbered backup files.)
> 
> Since I just wrote this code, I'd like any feedback that people care
> to give on the correctness and usability of the generated repository.
> People with SSH access to sourceware should be able to access the
> temporary merged repository by doing
> cvs -d :ext:gcc.gnu.org:/pool/ian/repo co gcc

[snip]

> By the way, in case anybody asks, I will not be doing this merge
> before the subversion conversion, because it changes all the CVS
> revision numbers and thus breaks all existing working directories.

What will happen to the (revision number based) hyperlinks to patches
in Bugzilla and the gcc-cvs mailing list archive like the following:

http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/reg-stack.c.diff?cvsroot=gcc&r1=1.188&r2=1.189

Will they still point to something useful?
If not, that would render the whole gcc-cvs archive list useless. :-(

Well, this is a question for the cvs to svn conversion in general,
but gets more complicated with your merge of the cvs repositories.

Regards,
Volker




Re: Needs advises on rotating register allocation for IA64 in GCC

2005-07-21 Thread Andrey Belevantsev

Steven Bosscher wrote:

> Hmm, I've never seen any discussions about this on [EMAIL PROTECTED]  Could you
> give some links to messages in the mailing list archives that you
> may have found?


I've seen only the thread mentioning the work of Ritu Sabharwal 
(http://gcc.gnu.org/ml/gcc/2002-12/msg00508.html), and then questions of 
Canqun Yang and Feng Wang 
(http://gcc.gnu.org/ml/gcc/2003-09/msg00924.html and 
http://gcc.gnu.org/ml/gcc/2004-10/msg01193.html, respectively). Maybe 
I've missed something.


Andrey




on define_peephole2

2005-07-21 Thread Liu Haibin
Hi, 

I have a problem with define_peephole2. In nios2.md, there is this
define_insn:

(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand"  "=r,r")
(plus:SI (match_operand:SI 1 "register_operand" "%r,r")
 (match_operand:SI 2 "arith_operand" "r,I")))]
  ""
  "add%i2\\t%0, %1, %z2"
  [(set_attr "type" "alu")])

I defined a peephole2 to replace this instruction.

(define_peephole2
  [(set (match_operand:SI 0 "register_operand" "=r")
(plus:SI (match_operand:SI 1 "register_operand" "%r")
;(match_operand:SI 2 "arith_operand" "r")))]
 (match_operand:SI 2 "register_operand" "r")))]
  ""
  [(set (match_operand:SI 0 "register_operand" "=r")
(unspec_volatile:SI [(match_operand:SI 4 "custom_insn_opcode" "N")
(match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")] 
CUSTOM_INII))]
  "
{
operands[4] = const0_rtx;
}")

Because operand 2 in the replacement instruction must be a register,
I changed "arith_operand" to "register_operand", hoping that it would
only replace something like add r1, r2, r3 and not addi r1, r2, 9.

I tested with a file that contains

(insn/f 106 73 107 0 0x0 (set:SI (reg/f:SI 27 sp)
(plus:SI (reg/f:SI 27 sp)
(const_int -16 [0xfff0]))) -1 (nil)
(nil))

and it seems that it did try to replace it with the new instruction,
and I got the following error:

isqrt.c:65: error: unrecognizable insn:
(insn 123 73 107 0 0x0 (set (reg/f:SI 27 sp)
(unspec_volatile:SI [
(const_int 0 [0x0])
(reg/f:SI 27 sp)
(const_int -16 [0xfff0])
] 117)) -1 (nil)
(nil))
isqrt.c:65: internal compiler error: in extract_insn, at recog.c:2175

Any ideas why it still tries to replace it even when it's obviously
not a register (const_int -16)? Thanks.


Regards,
Timothy


Re: Needs advises on rotating register allocation for IA64 in GCC

2005-07-21 Thread Steven Bosscher
On Thursday 21 July 2005 11:02, 李春江 wrote:
> Hi, all
> Nowadays, I plan to add rotating register allocation for IA64 to the SMS
> pass of GCC. From the maillist of gcc@gcc.gnu.org, I found some discussions
> about this topic. But, it was not very clearly discussed about details of
> howto.

Hmm, I've never seen any discussions about this on [EMAIL PROTECTED]  Could you
give some links to messages in the mailing list archives that you
may have found?

Gr.
Steven




Needs advises on rotating register allocation for IA64 in GCC

2005-07-21 Thread 李春江
Hi, all
I plan to add rotating register allocation for IA64 to the SMS pass of GCC.
On the gcc@gcc.gnu.org mailing list I found some discussions about this
topic, but they did not cover the details of how to do it.
Suggestions are welcome.

Best regards.

Chunjiang Li







GCC Summit 2005 Proceedings

2005-07-21 Thread Ranjit Mathew
Hi,

  Can someone please upload the individual papers from this
year's GCC Summit to:

  ftp://gcc.gnu.org/pub/gcc/summit/

The big PDF in:

  http://www.gccsummit.org/2005/2005-GCC-Summit-Proceedings.pdf

is a bit unwieldy to read if one is interested in only
reading a few of the papers.

Thanks,
Ranjit.

-- 
Ranjit Mathew  Email: rmathew AT gmail DOT com

Bangalore, INDIA.Web: http://ranjitmathew.hostingzero.com/



Linking order

2005-07-21 Thread Sampath Kumar Herga
Hi,

I have a question regarding global symbols and linking order. Our project
has a lot of global symbols that depend on each other and are defined in
different source files, so if they are initialized in the wrong order,
the process fails to come up.

One option is to move all the global initializations to one source file,
so that the order of initialization is guaranteed. The second option is
to tell the linker to pick up the files in a specific order (either by
passing them in the appropriate order to ld or by using a linker
script).

Which is the better option, and how safe is it to depend on the
linker order?

Thanks,
Sampath.
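
A minimal sketch of the first option above, assuming the project is C++
(the message does not say) and that the globals have constructors that
depend on one another. Config, Logger, g_config, g_logger and init_log
are hypothetical names chosen for illustration. Within a single
translation unit, C++ initializes such objects in the order of their
definitions, so g_config is ready before g_logger uses it:

```cpp
// Illustrative only: all names here are hypothetical, not from the project.
#include <cassert>
#include <string>
#include <vector>

// Records the order in which the globals below are constructed.
// A function-local static is constructed on first call, so it is
// always ready before any global constructor appends to it.
static std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

struct Config {
    int verbosity;
    Config() : verbosity(2) { init_log().push_back("Config"); }
};

struct Logger {
    int level;
    // Reads from the Config object, so Config must already be constructed.
    explicit Logger(const Config& cfg) : level(cfg.verbosity) {
        init_log().push_back("Logger");
    }
};

// Both globals are defined in the same translation unit, so they are
// initialized in definition order: g_config first, then g_logger.
Config g_config;
Logger g_logger(g_config);
```

Across translation units the C++ standard leaves the relative order of
dynamic initialization unspecified, so relying on the order of object
files on the link line depends on toolchain behaviour rather than on a
language guarantee.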