A question about macro replacement

2007-02-07 Thread zhao_bingfeng
Hi, all,

With the following code:
[CODE]
struct B {
int c;
int d;
};

#define X(a, b, c) \
do\
{\
if (a)\
printf("%d, %d\n", b.c, c);\
else\
printf("%d\n", c);\
}while(0);
[/CODE]

Why does
int d = 24;
X(1, b, d);
compile successfully, while
X(1, b, 24);
does not?

I cannot find any description of this behavior in the C standard.






Re: After GIMPLE...

2007-02-07 Thread Paulo J. Matos

On 2/6/07, Diego Novillo <[EMAIL PROTECTED]> wrote:

Paulo J. Matos wrote on 02/06/07 14:19:

> Why before pass_build_ssa? (version 4.1.1)
>
It depends on the properties your pass requires.  If you ask for
PROP_cfg and PROP_gimple_any then you should schedule it after the CFG
has been built, but if you need PROP_ssa, then you must be after
pass_build_ssa which implies that your pass only gets enabled at -O1+.
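
The scheduling rule above can be modelled as a simple bitmask check. This is only a sketch: the bit values and the helper names (pass_can_run, props_after_cfg, props_after_ssa) are made up for illustration, and GCC assigns its own values to the PROP_* constants.

```c
#include <stdbool.h>

/* Hypothetical bit values chosen for this sketch; GCC assigns its own. */
enum {
    PROP_gimple_any = 1 << 0,
    PROP_cfg        = 1 << 1,
    PROP_ssa        = 1 << 2
};

/* Returns true when every property a pass requires is already available. */
bool pass_can_run(unsigned required, unsigned available)
{
    return (required & ~available) == 0;
}

/* Properties available once the CFG has been built, before SSA form. */
unsigned props_after_cfg(void)
{
    return PROP_gimple_any | PROP_cfg;
}

/* Properties available once the SSA-building pass has run. */
unsigned props_after_ssa(void)
{
    return props_after_cfg() | PROP_ssa;
}
```

In this model a pass that asks only for PROP_cfg and PROP_gimple_any can run at either point, while a pass that asks for PROP_ssa can only run at the later one.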



Ok, thank you very much.

--
Paulo Jorge Matos - pocm at soton.ac.uk
http://www.personal.soton.ac.uk/pocm
PhD Student @ ECS
University of Southampton, UK


Re: A question about macro replacement

2007-02-07 Thread Brooks Moses

[EMAIL PROTECTED] wrote:

With the following code:
[CODE]
struct B {
int c;
int d;
};

#define X(a, b, c) \
do\
{\
if (a)\
printf("%d, %d\n", b.c, c);\
else\
printf("%d\n", c);\
}while(0);
[/CODE]

Why does
int d = 24;
X(1, b, d);
compile successfully, while
X(1, b, 24);
does not?

I cannot find any description of this behavior in the C standard.


Well, in the X(1, b, 24) case, the macro parameter c is replaced
everywhere it appears in the macro body, so the b.c in the first printf
line becomes b.24, which is obviously a syntax error.
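
One way to avoid the clash is to give the macro parameters names that cannot match a member name. The following is a minimal sketch, not the original poster's code: the parameter names cond_, obj_ and val_ and the helper format_literal are made up, and the macro writes into a caller-supplied buffer with snprintf instead of calling printf, purely so that the result can be checked.

```c
#include <stdio.h>

struct B {
    int c;
    int d;
};

/* The parameter names end in '_' so none of them matches the member
   names c or d.  The preprocessor replaces every parameter token in the
   body, including one that follows '.', which is how b.c became b.24. */
#define X(out_, size_, cond_, obj_, val_)                                \
    do {                                                                 \
        if (cond_)                                                       \
            snprintf((out_), (size_), "%d, %d\n", (obj_).c, (val_));     \
        else                                                             \
            snprintf((out_), (size_), "%d\n", (val_));                   \
    } while (0)

/* Formats with a literal third argument, the case that failed before. */
void format_literal(char *out, size_t size)
{
    struct B b = { 7, 9 };
    X(out, size, 1, b, 24);
}
```

Note that this sketch also drops the semicolon after while (0), so that the caller's own semicolon terminates the statement.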


This sort of thing would be a fair bit easier to track down if you 
quoted the error message rather than just saying that it cannot be 
compiled successfully.


- Brooks



Re: False 'noreturn' function does return warnings

2007-02-07 Thread Sergei Organov
Jan Hubicka <[EMAIL PROTECTED]> writes:
[...]
>> static inline void __attribute__((noreturn)) BUG(void)
>> {
>>  __asm__ __volatile__("trap");
>>  __builtin_unreached();
>
> This is a bit difficult to do in general since it introduces a new kind
> of control flow construct.  It would be better to express such functions
> explicitly to GCC.

How about

static inline void __attribute__((noreturn)) BUG(void)
{
__asm__ __volatile__ __noreturn__("trap");
}

then ;)

-- Sergei.



which opt. flags go where? - references

2007-02-07 Thread Kenneth Hoste

Hello,

I'm planning to do some research on the optimization flags available
for GCC (currently using 4.1.1). In particular, we want to see how we
can come up with a set of flag combinations that allows a tradeoff
between compilation time, execution time and code size (as with -O1,
-O2, -O3, -Os). Of course, we don't want to do an exhaustive search of
all possible combinations of flags, because that would be totally
infeasible (using the 56 flags enabled in -O3 for gcc 4.1.1 yields
~72*10^15 (= 2^56-1) possible candidates).
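
As a quick check of that figure, the number of non-empty on/off combinations of n independent flags is 2^n - 1. A small sketch, with a made-up function name:

```c
#include <stdint.h>

/* Number of non-empty subsets of n independent on/off flags: 2^n - 1.
   Only valid for 0 < n < 64 in this sketch. */
uint64_t flag_combinations(unsigned n)
{
    return (UINT64_C(1) << n) - 1;
}
```

For n = 56 this gives 72,057,594,037,927,935, which is roughly 72*10^15.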


It seems there has already been some work done on this subject, or
at least that's what richi on #gcc (OFTC) told me. He wasn't able to
refer me to work in that area though. I have found some references
myself (partially listed below), but I'm hoping people more familiar
with the GCC community can help expand this list.


[1] Almagor et al., Finding effective compilation sequences (LCTES'04)
[2] Cooper et al., Optimizing for Reduced Code Space using Genetic
Algorithms (LCTES'99)
[3] Almagor et al., Compilation Order Matters: Exploring the
Structure of the Space of Compilation Sequences Using Randomized
Search Algorithms (Tech. Report)
[4] Acovea: Using Natural Selection to Investigate Software
Complexities (http://www.coyotegulch.com/products/acovea/)


Some other questions:

* I'm planning to do this work on an x86 platform (i.e. Pentium4),  
but richi told me that's probably not a good idea, because of the low  
number of registers available on x86. Comments?


* Since we have done quite some analysis on the SPEC2k benchmarks,  
we'll also be using them for this work. Other suggestions are highly  
appreciated.


* Since there has been some previous work on this, I wonder why none  
of it has made it into GCC development. Were the methods proposed  
unfeasible for some reason? What would be needed to make an approach  
to automatically find suitable flags for -Ox interesting enough to  
incorporate it into GCC? Any references to this previous work?


greetings,

Kenneth Hoste
Paris, ELIS, Ghent University (Belgium)

--

Statistics are like a bikini. What they reveal is suggestive, but  
what they conceal is vital (Aaron Levenstein)


Kenneth Hoste
ELIS - Ghent University
[EMAIL PROTECTED]
http://www.elis.ugent.be/~kehoste




Re: which opt. flags go where? - references

2007-02-07 Thread Diego Novillo

Kenneth Hoste wrote on 02/07/07 08:56:


[1] Almagor et al., Finding effective compilation sequences (LCTES'04)
[2] Cooper et al., Optimizing for Reduced Code Space using Genetic
Algorithms (LCTES'99)
[3] Almagor et al., Compilation Order Matters: Exploring the
Structure of the Space of Compilation Sequences Using Randomized
Search Algorithms (Tech. Report)
[4] Acovea: Using Natural Selection to Investigate Software
Complexities (http://www.coyotegulch.com/products/acovea/)


You should also contact Ben Elliston (CC'd) and Grigori Fursin (sorry, 
no email).


Ben worked on dynamic reordering of passes, his thesis will have more 
information about it.


Grigori is working on an API for iterative and adaptive optimization,
implemented in GCC.  He presented at the last HiPEAC 2007 GCC workshop.
Their presentation should be available at http://www.hipeac.net/node/746



Some other questions:

* I'm planning to do this work on an x86 platform (i.e. Pentium4),  
but richi told me that's probably not a good idea, because of the low  
number of registers available on x86. Comments?


When deriving ideal flag combinations for -Ox, we will probably want 
common sets for the more popular architectures, so I would definitely 
include x86.


* Since we have done quite some analysis on the SPEC2k benchmarks,  
we'll also be using them for this work. Other suggestions are highly  
appreciated.


We have a collection of tests from several user communities that we use 
as performance benchmarks (DLV, TRAMP3D, MICO).  There should be links 
to the testers somewhere in http://gcc.gnu.org/


* Since there has been some previous work on this, I wonder why none  
of it has made it into GCC development. Were the methods proposed  
unfeasible for some reason? What would be needed to make an approach  
to automatically find suitable flags for -Ox interesting enough to  
incorporate it into GCC? Any references to this previous work?


It's one of the things I would like to see implemented in GCC in the 
near future.  I've been chatting with Ben and Grigori about their work 
and it would be a great idea if we could discuss this at the next GCC 
Summit.  I'm hoping someone will propose a BoF about it.


Re: 27% regression of gcc 4.3 performance on cpu2k6/calculix

2007-02-07 Thread Vladimir Sysoev

Hi!
I created a test to reproduce the issue with cpu2006/454.calculix;
see the attachment. The file e_c3d.f contains a cut-down subroutine
from calculix, and tr535.f is the main entry point of the test. You can
use the go-script as a reference for how I got these results. The
find_stall.pl script finds the problematic instruction combinations.

The problem is that the new compiler generates a read instruction right
after a write. See some dumps below.

This is the inner loop near line #42 generated by the rev. 119759 compiler:
.L13:
.LBB22:
.loc 1 42 0
movapd  %xmm2, %xmm0
leaq    (%rdx,%rbx), %rax
.loc 1 38 0
addl    $1, %edi
addq    $24, %rdx
.loc 1 42 0
mulsd   72(%rcx), %xmm0
.loc 1 38 0
addq    $72, %rcx
cmpl    $4, %edi
.loc 1 42 0
mulsd   %xmm3, %xmm0
mulsd   -8(%rax,%r9,8), %xmm0
mulsd   %xmm4, %xmm0
addsd   %xmm0, %xmm1
.loc 1 38 0
jne     .L13

This is the same loop (line 42) generated by the rev. 119760 compiler:
.L13:
.LBB23:
.loc 1 42 0
movsd   72(%rdx), %xmm0
movq    80(%rsp), %rax
addq    $72, %rdx
mulsd   -8(%r9,%r15,8), %xmm0
addq    %rdi, %rax
addq    $24, %rdi
.loc 1 38 0
cmpq    $72, %rdi
.loc 1 42 0
mulsd   -8(%r11,%r14,8), %xmm0
mulsd   -8(%rax,%r13,8), %xmm0
movq    440(%rsp), %rax
mulsd   (%rax), %xmm0
addsd   (%rsi,%r10,8), %xmm0    <-|
movsd   %xmm0, (%rsi,%r10,8)    <-+- problems
.loc 1 38 0
jne     .L13
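
In C terms, the difference between the two loops is roughly the following. This is only an illustrative sketch with made-up function names, not the calculix source (which is Fortran). The first form keeps the running sum in a local variable; the second reads and writes the destination element in memory on every iteration, which is the pattern marked above. The volatile qualifier is used here only to make a C compiler keep the load and store inside the loop.

```c
#include <stddef.h>

/* Accumulates in a local variable; the sum can live in a register and
   the destination is written once after the loop. */
void accumulate_local(double *dst, const double *a, const double *b,
                      size_t n)
{
    double sum = *dst;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    *dst = sum;
}

/* Accumulates directly in memory; each iteration loads *dst, adds to
   it, and stores it back, so every load follows the previous store. */
void accumulate_in_memory(volatile double *dst, const double *a,
                          const double *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        *dst += a[i] * b[i];
}
```

Both functions compute the same result; only the memory traffic inside the loop differs.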



My output is:
real    0m3.781s
user    0m3.776s
sys     0m0.004s

real    0m5.956s
user    0m5.948s
sys     0m0.004s
hey... we are going
hey... we are going
Line 31
   addsd   (%rsi,%r10,8), %xmm0
   movsd   %xmm0, (%rsi,%r10,8)

Line 42
   addsd   (%rsi,%r10,8), %xmm0
   movsd   %xmm0, (%rsi,%r10,8)

Feel free to ask if you have any problems reproducing this.

-Vladimir


--
   * From: Grigory Zagorodnev 
   * To: gcc at gcc dot gnu dot org, dnovillo at redhat dot com
   * Cc: "H. J. Lu" 
   * Date: Mon, 15 Jan 2007 17:59:31 +0300
   * Subject: 27% regression of gcc 4.3 performance on cpu2k6/calculix

Hi!
There is a huge regression of gcc 4.3 performance detected on
cpu2006/454.calculix benchmark at -O2 optimization level on
x86_64-redhat-linux.

Regression is caused by mem-ssa merge 12/12/2006 (revision 119760).
http://gcc.gnu.org/viewcvs?view=rev&revision=119760


PS: I'm trying to get a small reproducer
- Grigory


test_calculix.tar.bz2
Description: BZip2 compressed data


Re: which opt. flags go where? - references

2007-02-07 Thread Kenneth Hoste

Hi,

On 07 Feb 2007, at 15:22, Diego Novillo wrote:


Kenneth Hoste wrote on 02/07/07 08:56:


[1] Almagor et al., Finding effective compilation sequences (LCTES'04)
[2] Cooper et al., Optimizing for Reduced Code Space using
Genetic Algorithms (LCTES'99)
[3] Almagor et al., Compilation Order Matters: Exploring the
Structure of the Space of Compilation Sequences Using Randomized
Search Algorithms (Tech. Report)
[4] Acovea: Using Natural Selection to Investigate Software
Complexities (http://www.coyotegulch.com/products/acovea/)

You should also contact Ben Elliston (CC'd) and Grigori Fursin
(sorry, no email).


Ben worked on dynamic reordering of passes, his thesis will have  
more information about it.


Grigori is working on an API for iterative and adaptive
optimization, implemented in GCC.  He presented at the last HiPEAC
2007 GCC workshop.  Their presentation should be available at
http://www.hipeac.net/node/746



I actually talked to Grigori about the -Ox flags, I was at the HiPEAC  
conference too ;-) I didn't include references to his work, because  
my aim wouldn't be at reordering of passes, but just selecting them.  
I understand that reordering is of great importance while optimizing,  
but I think this project is big enough as is.



Some other questions:
* I'm planning to do this work on an x86 platform (i.e.  
Pentium4),  but richi told me that's probably not a good idea,  
because of the low  number of registers available on x86. Comments?
When deriving ideal flag combinations for -Ox, we will probably  
want common sets for the more popular architectures, so I would  
definitely include x86.


OK. I think richi's point about x86 was that evaluating the
technique we are thinking about there might produce results which are
hard to 'port' to a different architecture. But then again, we won't be
claiming we have found _the_ best set of flags for a given goal...
Thank you for your comment.




* Since we have done quite some analysis on the SPEC2k  
benchmarks,  we'll also be using them for this work. Other  
suggestions are highly  appreciated.
We have a collection of tests from several user communities that we  
use as performance benchmarks (DLV, TRAMP3D, MICO).  There should  
be links to the testers somewhere in http://gcc.gnu.org/


OK, sounds interesting, I'll look into it. In which way are these  
benchmarks used? Just to test the general performance of GCC? Have  
they been compared to say, SPEC CPU, or other 'research/industrial'  
benchmark suites (such as MiBench, MediaBench, EEMBC, ...) ?




* Since there has been some previous work on this, I wonder why  
none  of it has made it into GCC development. Were the methods  
proposed  unfeasible for some reason? What would be needed to make  
an approach  to automatically find suitable flags for -Ox  
interesting enough to  incorporate it into GCC? Any references to  
this previous work?
It's one of the things I would like to see implemented in GCC in  
the near future.  I've been chatting with Ben and Grigori about  
their work and it would be a great idea if we could discuss this at  
the next GCC Summit.  I'm hoping someone will propose a BoF about it.


I'm hoping my ideas will lead to significant results, because I think  
this is an important issue.


greetings,

Kenneth

--

Statistics are like a bikini. What they reveal is suggestive, but  
what they conceal is vital (Aaron Levenstein)


Kenneth Hoste
ELIS - Ghent University
[EMAIL PROTECTED]
http://www.elis.ugent.be/~kehoste




Bug in value-prof.c:visit_hist

2007-02-07 Thread Robert Kidd
There appears to be a bug in value-prof.c:visit_hist rev 121554.
This function always returns 0, which causes htab_traverse to exit
early.  This means that only the first histogram that appears in
cfun->value_histograms->entries is ever checked, so verify_histograms
will only indicate an error if the first histogram is unreachable.
The attached patch changes the return value to ensure that all
histograms are checked.  This patch bootstraps and passes make check
on x86_64.


Robert Kidd
[EMAIL PROTECTED]

Index: gcc/value-prof.c
===
--- gcc/value-prof.c	(revision 121671)
+++ gcc/value-prof.c	(working copy)
@@ -353,8 +353,9 @@ visit_hist (void **slot, void *data)
   dump_histogram_value (stderr, hist);
   debug_generic_stmt (hist->hvalue.stmt);
   error_found = true;
+  return 0;
 }
-  return 0;
+  return 1;
 }
 
 /* Verify sanity of the histograms.  */
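
The effect of the callback's return value can be seen in a small self-contained model. This sketch does not use libiberty: traverse and the two visitors are hypothetical stand-ins that only copy the convention described above, where returning 0 from the callback stops the traversal. Negative values play the role of unreachable histograms.

```c
#include <stddef.h>

/* Callback type: return nonzero to keep going, 0 to stop the traversal. */
typedef int (*visit_fn)(int value, void *data);

/* Toy traversal over an array, modelled on the convention above. */
void traverse(const int *items, size_t n, visit_fn visit, void *data)
{
    for (size_t i = 0; i < n; i++)
        if (!visit(items[i], data))
            break;
}

/* Buggy visitor: always returns 0, so only the first item is examined. */
int visit_always_stop(int value, void *data)
{
    if (value < 0)
        *(int *)data = 1;   /* record that an error was found */
    return 0;
}

/* Patched visitor: stops at the first error, otherwise continues. */
int visit_until_error(int value, void *data)
{
    if (value < 0) {
        *(int *)data = 1;   /* record that an error was found */
        return 0;
    }
    return 1;
}
```

With the items {1, 2, -3}, the first visitor never reaches the bad entry, while the second one finds it.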


Re: Bug in value-prof.c:visit_hist

2007-02-07 Thread Eric Botcazou
> This patch bootstraps and passes make check on x86_64.

Please do not cross-post.  Patches should go to gcc-patches@ only.

-- 
Eric Botcazou


Regarding tree traversal

2007-02-07 Thread Prabhanjan Kambadur

I am new to this list, so please excuse any obvious mistakes. I am
trying to check, within the compiler, whether two types are equal or
one is derived from the other. One of the types is a struct that is
defined in the std namespace. How do I search for a "node" that is
a TYPE_DECL of the structure that I want? I would like to search for
the TYPE_DECL of "struct foo" in the tree std_node.

Regards,

Anju

--
This too shall pass


Re: ICE in gcc/libgcc2.c:566 (gcc trunk)

2007-02-07 Thread Hanno Meyer-Thurow
Hi Ian,
sorry to bother again. I reduced the code (attached) that segfaults here
on Core 2 Duo [1]. If I add -fno-split-wide-types the code does not segfault.
That flag comes from your patchset [2].

execute: 
# ./cc1 -quiet -m64 -O1 test.c -o test.o

Any ideas?


Regards,
Hanno

[1] http://gcc.gnu.org/ml/gcc/2007-02/msg00095.html
[2] http://gcc.gnu.org/ml/gcc-patches/2007-02/msg2.html

typedef int TItype __attribute__ ((mode (TI)));
typedef int DItype __attribute__ ((mode (DI)));
typedef unsigned int UDItype __attribute__ ((mode (DI)));

struct DWstruct {DItype low, high;};
typedef union
{
  struct DWstruct s;
  TItype ll;
} DWunion;

TItype
__multi3 (TItype u, TItype v)
{
  const DWunion uu = {.ll = u};
  const DWunion vv = {.ll = v};
  DWunion w = {
    .ll = ({
      DWunion __w;
      do {
        UDItype __x0, __x1, __x2, __x3;
        UDItype __ul, __vl, __uh, __vh;
        __ul = ((UDItype) (uu.s.low) & (((UDItype) 1 << ((8 * 8) / 2)) - 1));
        __uh = ((UDItype) (uu.s.low) >> ((8 * 8) / 2));
        __vl = ((UDItype) (vv.s.low) & (((UDItype) 1 << ((8 * 8) / 2)) - 1));
        __vh = ((UDItype) (vv.s.low) >> ((8 * 8) / 2));
        __x0 = (UDItype) __ul * __vl;
        __x1 = (UDItype) __ul * __vh;
        __x2 = (UDItype) __uh * __vl;
        __x3 = (UDItype) __uh * __vh;
        __x1 += ((UDItype) (__x0) >> ((8 * 8) / 2));
        __x1 += __x2;
        if (__x1 < __x2) __x3 += ((UDItype) 1 << ((8 * 8) / 2));
        (__w.s.high) = __x3 + ((UDItype) (__x1) >> ((8 * 8) / 2));
        (__w.s.low) = ((UDItype) (__x1) & (((UDItype) 1 << ((8 * 8) / 2)) - 1))
                      * ((UDItype) 1 << ((8 * 8) / 2))
                      + ((UDItype) (__x0) & (((UDItype) 1 << ((8 * 8) / 2)) - 1));
      } while (0);
      __w.ll;
    })
  };

  w.s.high += ((UDItype) uu.s.low * (UDItype) vv.s.high
               + (UDItype) uu.s.high * (UDItype) vv.s.low);

  return w.ll;
}


Fw: Scheduling an early complete loop unrolling pass?

2007-02-07 Thread Ayal Zaks

...

>Ah, right... I wonder if we can keep the loop structure in place, even
>after completely unrolling the loop  - I mean the 'struct loop' in
>'current_loops' (not the actual CFG), so that the "SLP in loops" would
>have a chance to at least consider vectorizing this "loop".

Having a "loop" structure for a piece of CFG that is not a loop, was used
in some other compiler we worked with - the notion of 'region' was such
that it corresponded to loops, and in addition the entire function belonged
to a "universal" region (Peter - please correct if I'm wrong). But I think
you were looking for some marking of a basic block saying "this used to be
a loop but got completely unrolled". I wonder how much such a "dummy" loop
structure can really help the vectorizer, except for (convenience of)
keeping intact the driver that traverses all such structures or the hanging
of additional data off of them.

Ayal.



Re: ICE in gcc/libgcc2.c:566 (gcc trunk)

2007-02-07 Thread Ian Lance Taylor
Hanno Meyer-Thurow <[EMAIL PROTECTED]> writes:

> Hi Ian,
> sorry to bother again. I reduced the code (attached) that segfaults here
> on Core 2 Duo [1]. If I add -fno-split-wide-types the code does not segfault.
> That flag comes from your patchset [2].
> 
> execute: 
> # ./cc1 -quiet -m64 -O1 test.c -o test.o
> 
> Any ideas?

The test case works for me.  Note that I've committed several cleanup
patches for lower-subreg.c over the last several days.  In particular,
do you have this change in your sources?

2007-02-01  Ian Lance Taylor  <[EMAIL PROTECTED]>

* lower-subreg.c (resolve_clobber): Handle a subreg of a concatn.

?

Let me know if you still see the problem with up to date sources.

Ian


Re: ICE in gcc/libgcc2.c:566 (gcc trunk)

2007-02-07 Thread Hanno Meyer-Thurow
On 07 Feb 2007 13:46:43 -0800
Ian Lance Taylor <[EMAIL PROTECTED]> wrote:

>   * lower-subreg.c (resolve_clobber): Handle a subreg of a concatn.

Yes, that is there. I have revision 121690.

Hanno


gcc-4.2-20070207 is now available

2007-02-07 Thread gccadmin
Snapshot gcc-4.2-20070207 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20070207/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch 
revision 121698

You'll find:

gcc-4.2-20070207.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.2-20070207.tar.bz2       C front end and core compiler

gcc-ada-4.2-20070207.tar.bz2        Ada front end and runtime

gcc-fortran-4.2-20070207.tar.bz2    Fortran front end and runtime

gcc-g++-4.2-20070207.tar.bz2        C++ front end and runtime

gcc-java-4.2-20070207.tar.bz2       Java front end and runtime

gcc-objc-4.2-20070207.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.2-20070207.tar.bz2  The GCC testsuite

Diffs from 4.2-20070131 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Regarding tree traversal

2007-02-07 Thread Mike Stump

On Feb 7, 2007, at 1:05 PM, Prabhanjan Kambadur wrote:

I am trying to check if two types are equal


equal, what's that?  :-)  (That's a joke for the rest of the folks  
here.  See the CANONICAL types work that Doug did recently for some  
of the more recent email threads.)


One of the types is a struct that is defined in the std
namespace. How do I search for a "node" that is a TYPE_DECL of the
structure that I want? I would like to search for the TYPE_DECL of
"struct foo" in the tree std_node.


This process is called lookup.  Glance around at routines like  
lookup_qualified_name,


Don't be afraid to fire up gdb under emacs, set a breakpoint in the  
parser for the construct you're interested in, run the sample  
testcase through it and watch what the compiler does.  It'd take you  
right there.


Re: Regarding tree traversal

2007-02-07 Thread Mike Stump

On Feb 7, 2007, at 1:05 PM, Prabhanjan Kambadur wrote:

I am new to this list, so please excuse any obvious mistakes. I am
trying to check, within the compiler, whether two types are equal or
one is derived from the other. One of the types is a struct that is
defined in the std namespace. How do I search for a "node" that is
a TYPE_DECL of the structure that I want? I would like to search for
the TYPE_DECL of "struct foo" in the tree std_node.


Just to be clear, given "std" and given "foo" you want to find  
std::foo, right?  That's the question I previously answered.


Or, is your question, given the shape expressed by tree bar (a  
TYPE_DECL), find a "foo" with the same shape?


The second question is answered by looping over all the members of  
std, and checking each one for the right shape.


Re: Regarding tree traversal

2007-02-07 Thread Prabhanjan Kambadur

Yup, what you answered is indeed what I want 

Thanks,

Anju


Re: ICE in gcc/libgcc2.c:566 (gcc trunk)

2007-02-07 Thread Ian Lance Taylor
Hanno Meyer-Thurow <[EMAIL PROTECTED]> writes:

> sorry to bother again. I reduced the code (attached) that segfaults here
> on Core 2 Duo [1]. If I add -fno-split-wide-types the code does not segfault.
> That flag comes from your patchset [2].
> 
> execute: 
> # ./cc1 -quiet -m64 -O1 test.c -o test.o
> 
> Any ideas?

I don't know what is causing this.  I just checked again, and it does
not happen for me.

Looking at your backtrace from
http://gcc.gnu.org/ml/gcc/2007-02/msg00095.html
count_pseudo is being called with register 71.  Register 71 no longer
exists; it was split.  That is why you are getting the SIGSEGV.  But
when I run my copy of the compiler, count_pseudo is never called with
register 71.

count_pseudo is being called from this code in order_regs_for_reload:
  EXECUTE_IF_SET_IN_REG_SET
(&chain->live_throughout, FIRST_PSEUDO_REGISTER, i, rsi)
{
  count_pseudo (i);
}
Since register 71 no longer exists, it should not be in
chain->live_throughout.  So why is it set?

I'm not sure what else to say, since I can't recreate the problem
myself.

Can anybody else out there recreate this on their x86_64 system?

Ian


"error: unable to generate reloads for...", any hints?

2007-02-07 Thread 吴曦

Hi,
I am working with gcc 4.1.1 on the Itanium architecture. I want to
modify the machine description in ia64.md to add some checks before
each ld instruction. The following is the original define_insn:

(define_insn "*movqi_internal"
 [(set (match_operand:QI 0 "destination_operand" "=r,r,r, m, r,*f,*f")
 (match_operand:QI 1 "move_operand""rO,J,m,rO,*f,rO,*f"))]
 "ia64_move_ok (operands[0], operands[1])"
 "@
  mov %0 = %r1
  addl %0 = %1, r0
  ld1%O1 %0 = %1%P1
  st1%Q0 %0 = %r1%P0
  getf.sig %0 = %1
  setf.sig %0 = %r1
  mov %0 = %1"
  [(set_attr "itanium_class" "ialu,ialu,ld,st,frfr,tofr,fmisc")])

I noticed that there is an ld instruction in the 3rd alternative, so I
added a new define_insn before it in the hope that it would be matched
first.

(define_insn "*ld_movqi_internal"
 [(set (match_operand:QI 0 "destination_operand" "=r")
 (match_operand:QI 1 "move_operand" "m"))]
 "ia64_move_ok (operands[0], operands[1])
  && flag_check_ld"
  {
printf("define_insn ld_movqi_internal\n");
return "ld1%O1 %0 = %1%P1";
  }
  [(set_attr "itanium_class" "ld")])

I kept everything the same as the 3rd alternative in the original
define_insn, except that I use a C statement to return the desired
output template. However, when I use the newly built gcc to compile the
following program, it crashes.

#include <stdio.h>

char characters[8192]={'a',};

int main()
{
char c = characters[0];
printf("Hello World! c:%c\n", c);
}

the error reported is:
hi.c:9: error: unable to generate reloads for:
(insn 10 9 12 1 (set (mem/c/i:QI (reg/f:DI 111 loc79) [0 c+0 S1 A128])
   (reg:QI 14 r14 [orig:342 characters ] [342])) 3
{*gift_movqi_internal_ld} (nil)
   (expr_list:REG_DEAD (reg:QI 14 r14 [orig:342 characters ] [342])
   (nil)))
hi.c:9: internal compiler error: in find_reloads, at reload.c:3738

On IA64, the first pseudo register number is 334, so register 111
and register 14 are both hardware registers.

I looked at find_reloads in reload.c and found the following code
fragment and comment:

 /* The operands don't meet the constraints.
goal_alternative describes the alternative
that we could reach by reloading the fewest operands.
Reload so as to fit it.  */

 if (best == MAX_RECOG_OPERANDS * 2 + 600)
   {
 /* No alternative works with reloads??  */
 if (insn_code_number >= 0)
   fatal_insn ("unable to generate reloads for:", insn);
 ...

So, what is going on here? In particular, what is find_reloads trying
to accomplish, and why does it fail here?

I would appreciate any help on this question, thx!

Best Regards

--andy.wu