GCC-4.3.0 fails to compile SPECint-2006 with control speculation on itanium processor

2008-04-11 Thread
Hi:

I am working with gcc-4.3.0 on Red Hat ES 4. When I use the compiler to
build the SPECint-2006 benchmarks,
none of them gets through make with the compiler option -msched-control-spec
(enable control speculation on IA-64).

Here is part of the error log:

# Error 400.perlbench: Error with make! #
# Error 401.bzip2: Error with make! #
# Error 403.gcc: Error with make! #
# Error 429.mcf: Error with make! #
# Error 445.gobmk: Error with make! #
# Error 456.hmmer: Error with make! #
# Error 458.sjeng: Error with make! #
# Error 462.libquantum: Error with make! #
# Error 464.h264ref: Error with make! #
# Error 471.omnetpp: Error with make! #
# Error 473.astar: Error with make! #
# Error 483.xalancbmk: Error with make! #

Any help would be appreciated. Thanks


Fwd: GCC-4.3.0 fails to compile SPECint-2006 with control speculation on itanium processor

2008-04-11 Thread
-- Forwarded message --
From: 吴曦 [EMAIL PROTECTED]
Date: 2008/4/11
Subject: Re: GCC-4.3.0 fails to compile SPECint-2006 with control
speculation on itanium processor
To: Eljay Love-Jensen [EMAIL PROTECTED]


2008/4/11 Eljay Love-Jensen [EMAIL PROTECTED]:
 Hi 吴曦,

 What version of GNU Make are you using?

 make --version

 Is it at least GNU Make 3.80?

 --Eljay

[EMAIL PROTECTED] benchspec]$ make --version
GNU Make 3.80
Copyright (C) 2002  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.


Fwd: GCC-4.3.0 fails to compile SPECint-2006 with control speculation on itanium processor

2008-04-11 Thread
-- Forwarded message --
From: 吴曦 [EMAIL PROTECTED]
Date: 2008/4/11
Subject: Re: GCC-4.3.0 fails to compile SPECint-2006 with control
speculation on itanium processor
To: Eljay Love-Jensen [EMAIL PROTECTED]


I turned on the verbose mode of SPEC, and the compiler really does fail to
compile the code.

The errors are things like internal compiler errors.

It seems that this support is rather immature.




Re: Scheduling problem - A more detailed explain

2007-10-10 Thread
2007/10/11, Jim Wilson [EMAIL PROTECTED]:
Thanks for your helpful hints! And I am sorry for such a late reply.
I figured out this problem yesterday :-).

 Do we know for sure that the scheduler is failing here?  Have you looked
 at -da RTL dumps to verify which pass is performing the incorrect
 optimization?


I used the method you mentioned above to find the problem. The
scheduling code that GCC uses is correct, but there were some errors
in the order of my instrumentation insn list. So...

 Currently, gcc only emits these pr reg group save/restores in the
 prologue and epilogue, and we have scheduling barriers after/before the
 prologue/epilogue, so it is possible that there is a latent problem here
 which has gone unnoticed simply because it is impossible to reproduce
 with unmodified FSF gcc sources.

Previously, I also suspected that it was a latent problem. However, I read
the code you mentioned and found that it handles the pr reg group
save/restore correctly, and finally found that the problem was due to my
carelessness in the instrumentation code. So, sorry for that :-).


Re: Scheduling problem - A more detailed explain

2007-10-08 Thread
rws_access_reg should be handling this correctly. It uses
HARD_REGNO_NREGS to get the number of regs referred to by a reg rtl.
So it should return 64 in this case, and then it will iterate over all
64 1-bit PR regs when checking for a dependency.

I have found HARD_REGNO_NREGS in ia64.h

#define HARD_REGNO_NREGS(REGNO, MODE)   \
  ((REGNO) == PR_REG (0) && (MODE) == DImode ? 64   \
  

 As you stated above, it returns 64 for DImode pr0.

We already have support for these move instructions. See the
movdi_internal pattern. Since there are 64 1-bit PR registers, we use
a DImode reference to pr0 to represent the entire set of PR registers.
Is this the RTL that you are using? Or do you have your own
representation? If different, what RTL are you using?

I generate this mov instruction like this:
gen_movdi(gen_rtx_REG(DImode, PR_REG(0)), X);
Here X is a general register. Further, I dumped the RTL list and found
the generated insns; I think they are correct:

(insn 522 551 523 2 (set (reg:DI 8 r8)
        (reg:DI 256 p0)) -1 (nil)
    (nil))
and

(insn 537 377 535 2 (set (reg:DI 256 p0)
        (reg:DI 8 r8)) -1 (nil)
    (nil))
but the scheduling really produces wrong code :-(

Besides, I am using GCC-4.1.1, and I found rws_access_reg, which handles
the dependencies for these mov instructions:

static int
rws_access_reg (rtx reg, struct reg_flags flags, int pred)
{
  int regno = REGNO (reg);
  int n = HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg));

  if (n == 1)
    return rws_access_regno (regno, flags, pred);
  else
    {
      int need_barrier = 0;
      while (--n >= 0)
        need_barrier |= rws_access_regno (regno + n, flags, pred);
      return need_barrier;
    }
}
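
To make the loop above concrete, here is a minimal standalone sketch of the same idea. It is a toy model, not GCC code: toy_written, toy_access_regno, toy_access_reg, toy_stop_bit and the register-file size are all hypothetical names and values, and only writes are tracked (the real rws_access_regno also handles reads, predication and special registers). It shows why a DImode reference to p0, which covers 64 registers, has to be checked against every single predicate register.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_REGS 320   /* hypothetical register file size */

/* Registers written since the last stop bit (toy stand-in for the
   write-set bookkeeping; not the real GCC data structure).  */
static bool toy_written[TOY_NUM_REGS];

/* Record a write to one register and report whether the same register
   was already written in the current instruction group (WAW).  */
int
toy_access_regno (int regno)
{
  assert (regno >= 0 && regno < TOY_NUM_REGS);
  int need_barrier = toy_written[regno];
  toy_written[regno] = true;
  return need_barrier;
}

/* Same shape as rws_access_reg: a reference that spans NREGS registers
   is checked one register at a time and the results are OR-ed.  */
int
toy_access_reg (int regno, int nregs)
{
  int need_barrier = 0;
  while (--nregs >= 0)
    need_barrier |= toy_access_regno (regno + nregs);
  return need_barrier;
}

/* A stop bit starts a new instruction group: forget all writes.  */
void
toy_stop_bit (void)
{
  for (int i = 0; i < TOY_NUM_REGS; i++)
    toy_written[i] = false;
}
```

With the register numbers from the dumps in this archive (256 for p0, 263 for p7), writing p7 and then calling toy_access_reg (256, 64) reports a conflict, while the same call right after toy_stop_bit () does not.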

Well... Is there anything I missed or forgot to do?

Thanks


Question on GGC

2007-09-27 Thread
Hi.
I have several global variables of type rtx. They are used
in flow.c, ia64.c and final.c. As described in the internals documentation
on types, I added a GTY(()) marker after the keyword 'extern', for example:
 extern GTY(()) rtx a;
These 'extern' declarations are in regs.h, which is included in flow.c,
ia64.c and final.c.

However, I initialize 'a' in ia64_compute_frame, which is defined in ia64.c,
and found that 'a' is incorrectly collected by ggc_collect. (I watched the
memory location allocated for 'a' and saw that it is collected by
GGC.)

Is there anything I forgot to do?

Any help is truly appreciated

Thanks :-)


Re: Question on GGC

2007-09-27 Thread
2007/9/27, Zdenek Dvorak [EMAIL PROTECTED]:
 Hello,

  I have several global variables which are of type rtx. They are used
  in flow.c ia64.c and final.c. As stated in the internal doc with
  types. I add GTY(()) marker after the keyword 'extern'. for example:
   extern GTY(()) rtx a;
  these 'extern's are added in regs.h which is included in flow.c ia64.c
  and final.c
 
  However, I init 'a' at ia64_compute_frame which is defined in ia64.c
  but found 'a' incorrectly collected by ggc_collect. (I watch the
  memory location which is allocated for a, and found it is collected by
  GGC.
 
  Is there any thing I forget to do ?

 you need to add regs.h to GTFILES in Makefile.in.

 Zdenek

Thanks. I found that GCC generates the file gtype-desc.c to handle these
definitions. But how does GCC find the corresponding definition for
'extern GTY(()) rtx a'? When I define the variable in a new
file or declare it extern in a new header file, the build fails with the
error 'the initializer is not constant' (I think the error arises because
GCC cannot correlate the 'extern' declaration with the definition). Would you
give more details on this problem, especially for the case where I want to
define and declare this variable in new files?


Re: Question on GGC

2007-09-27 Thread
Sorry, I found it in gccint, thanks :-)




Re: support single predicate set instructions in GCC-4.1.1

2007-09-26 Thread
2007/9/26, Jim Wilson [EMAIL PROTECTED]:
 On Tue, 2007-09-25 at 15:13 +0800, 吴曦 wrote:
  propagate_one_insn), I don't understand why GCC fails the computation
  of liveness if there is no optimization flag :-(.

 There is probably something else happening with -O that is recomputing
 some liveness or CFG info.  For instance, the flow2 pass will call
 split_all_insns and cleanup_cfg, but only with -O.  You could try
 selectively disabling other optimization passes to determine which one
 is necessary in order for your code to work.  Actually, looking closer,
 I see several of them call update_life_info.  regrename for instance has
 two update_life_info calls.

 Another possibility here is to try calling recompute_reg_usage instead
 of doing it yourself.  Or maybe calling just update_life_info directly,
 if you need different flags set.

 FYI This stuff is all different on mainline since the dataflow merge.
 I'm assuming you are using gcc-4.2.x.
 --
 Jim Wilson, GNU Tools Support, http://www.specifix.com



Thanks, the problem turned out to be in pass_stack_adjustments.


Re: support single predicate set instructions in GCC-4.1.1

2007-09-25 Thread
2007/9/25, Jim Wilson [EMAIL PROTECTED]:
 吴曦 wrote:
  (define_insn "*shift_predicate_cmp"
    [(set (const_int 0)
          (and:BI (and:BI (match_operand:BI 1 "register_operand" "c")
                          (and:BI (match_operand:DI 2 "gr_reg_or_8bit_adjusted_operand" "rL")
                                  (match_operand:DI 3 "gr_register_operand" "r")))
                  (match_operand:BI 0 "register_operand" "c")))]
    ""
    "(%0) cmp.ne %1, p0 = %2, %3"
    [(set_attr "itanium_class" "icmp")])
  it warns WAW and there should be stop ;; between these two
  instructions.

 It is the assembler that is giving the warning.  The assembler knows
 that the %1 operand is modified by the instruction, but the compiler
 does not, because the %1 operand is not a SET_DEST operand.  Your
 SET_DEST is (const_int 0) which is useless info and incorrect.  You need
 to make sure that the RTL is an accurate description of what the
 instruction does.

 Besides the problem with the missing SET_DEST, there is also the problem
 that you are using AND operands for a compare, which won't work.  AND
 and NE are not interchangeable operations.  Consider what happens if you
 compare 0x1 with 0x1.  cmp.ne returns false.  However, AND returns 0x1,
 which when truncated from DImode to BImode is still 0x1, i.e. true.  So
 the RTL does not perform the same operation as the instruction you
 emitted.  This could confuse the optimizer.

 GCC internals assume that predicate registers are always allocated in
 pairs, and that the second one is always the inverse of the first one.
 Defining a special pattern that only modifies one predicate register
 probably isn't gaining you much.  If you are doing this before register
 allocation, then you are still using 2 predicate registers, as the
 register allocator will always give you 2 even if you only use one.
 Worst case, if this pattern is exposed to the optimizer, then the
 optimizer may make changes that break your assumptions.  It might
 simplify a following instruction by using the second predicate reg for
 instance, which then fails at run-time because you didn't actually set
 the second predicate reg.  If you are only using this in sequences that
 the optimizer can't rewrite, then you should be OK.
 --
 Jim Wilson, GNU Tools Support, http://www.specifix.com
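
The point above about AND and NE not being interchangeable can be checked with a few lines of plain C. This is only an illustration: and_as_bimode and cmp_ne are hypothetical helper names, and modelling the DImode-to-BImode truncation as "keep the low bit" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* What the RTL pattern describes: AND the operands, then keep a single
   bit (assumed here to model truncation to BImode).  */
int
and_as_bimode (uint64_t a, uint64_t b)
{
  return (int) ((a & b) & 1);
}

/* What the emitted cmp.ne computes: true when the operands differ.  */
int
cmp_ne (uint64_t a, uint64_t b)
{
  return a != b;
}

/* The example from the mail: comparing 0x1 with 0x1.  */
void
show_mismatch (void)
{
  assert (and_as_bimode (0x1, 0x1) == 1);  /* the RTL says "true" */
  assert (cmp_ne (0x1, 0x1) == 0);         /* the instruction says "false" */
}
```

Calling show_mismatch () passes both assertions, which is exactly the disagreement between the RTL and the instruction that could confuse the optimizer.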


Thanks so much for your helpful hints. I think I need to explain in more
detail why I need this kind of instruction. But before that,
there is another problem, with the liveness calculation (it
occurs when I use my new GCC to compile source code without any
optimization flag).

Roughly speaking, my work is to instrument sensitive instructions to
do information flow tracking. I do this after register
allocation and just before the second scheduling pass (I need to
intercept all memory accesses, so I chose to do this work after
register allocation). Instead of reserving a certain number of registers
in the backend for instrumentation, I chose to allocate registers for
it. Since register allocation is already done, I need to compute liveness
information manually for each sensitive insn to get a set of registers
that I can use without any save or restore. To do this, I borrowed code
from the function
propagate_block,
which is defined in flow.c. More specifically, the code is:
  struct propagate_block_info *pbi;

  int changed, flags;
  rtx insn, prev;

  bitmap insn_live_in, insn_live_out;
  bitmap bb_live_in, bb_live_out;
  basic_block cur_bb;


  flags = PROP_DEATH_NOTES;
  flags &= ~(PROP_SCAN_DEAD_CODE | PROP_SCAN_DEAD_STORES
             | PROP_KILL_DEAD_CODE);

  insn_live_in = BITMAP_ALLOC (NULL);
  insn_live_out = BITMAP_ALLOC (NULL);

  FOR_EACH_BB (cur_bb)
    {
      bb_live_in = cur_bb->il.rtl->global_live_at_start;
      bb_live_out = cur_bb->il.rtl->global_live_at_end;

      bitmap_copy (insn_live_out, bb_live_out);
      bitmap_copy (insn_live_in, insn_live_out);

      pbi = init_propagate_block_info (cur_bb, insn_live_in, NULL,
                                       NULL, flags);

      if (flags & PROP_REG_INFO)
        {
          unsigned i;
          reg_set_iterator rsi;

          /* Process the regs live at the end of the block.
             Mark them as not local to any one basic block.  */
          EXECUTE_IF_SET_IN_REG_SET (insn_live_in, 0, i, rsi)
            REG_BASIC_BLOCK (i) = REG_BLOCK_GLOBAL;
        }

      changed = 0;

      for (insn = BB_END (cur_bb); ; insn = prev)
        {
          bitmap_clear (shift_usable);

          /* If this is a call to `setjmp' et al, warn if any
             non-volatile datum is live.  */
          if ((flags & PROP_REG_INFO)
              && CALL_P (insn)
              && find_reg_note (insn,

support tnat instruction on IA-64. error occurs in bundling. help

2007-09-25 Thread
Hi
I am working on IA-64 and GCC-4.1.1.
I modified ia64.md to support the tnat instruction. More specifically, I added
the following define_insn:

(define_insn "shift_tnat"
  [(set (match_operand:BI 0 "register_operand" "=c")
        (unspec:BI [(match_operand:DI 1 "gr_register_operand" "r")]
                   UNSPEC_TNAT))]
  ""
  "tnat.nz %0, %I0 = %1"
  [(set_attr "itanium_class" "tnat")])

I added one line to
define_attr "type" "unknown,A,I,M,F,B,L,X,S"
thus:
;; chk_s has an I and an M form; use type A for convenience.
(define_attr "type" "unknown,A,I,M,F,B,L,X,S"
  (cond [(eq_attr "itanium_class" "ld,st,fld,fldp,stf,sem,nop_m")
           (const_string "M")
         (eq_attr "itanium_class" "rse_m,syst_m,syst_m0") (const_string "M")
         (eq_attr "itanium_class" "frar_m,toar_m,frfr,tofr") (const_string "M")
         (eq_attr "itanium_class" "lfetch") (const_string "M")
         (eq_attr "itanium_class" "chk_s,ialu,icmp,ilog,mmalua") (const_string "A")
         (eq_attr "itanium_class" "fmisc,fmac,fcmp,xmpy") (const_string "F")
         (eq_attr "itanium_class" "fcvtfx,nop_f") (const_string "F")
         ;; Added line: the tnat instruction is issued on an integer (I) unit.
         (eq_attr "itanium_class" "tnat") (const_string "I")
         (eq_attr "itanium_class" "frar_i,toar_i,frbr,tobr") (const_string "I")
         (eq_attr "itanium_class" "frpr,topr,ishf,xtd,tbit") (const_string "I")
         (eq_attr "itanium_class" "mmmul,mmshf,mmshfi,nop_i") (const_string "I")
         (eq_attr "itanium_class" "br,scall,nop_b") (const_string "B")
         (eq_attr "itanium_class" "stop_bit") (const_string "S")
         (eq_attr "itanium_class" "nop_x") (const_string "X")
         (eq_attr "itanium_class" "long_i") (const_string "L")]
        (const_string "unknown")))

and one value to the attribute itanium_class,
thus:
(define_attr "itanium_class"
  "unknown,ignore,stop_bit,br,fcmp,fcvtfx,fld,
   fldp,fmac,fmisc,frar_i,frar_m,frbr,frfr,frpr,ialu,icmp,ilog,ishf,
   ld,chk_s,tnat,long_i,mmalua,mmmul,mmshf,mmshfi,rse_m,scall,sem,stf,
   st,syst_m0,syst_m,tbit,toar_i,toar_m,tobr,tofr,topr,xmpy,xtd,nop,
   nop_b,nop_f,nop_i,nop_m,nop_x,lfetch,pre_cycle"
  (const_string "unknown"))

I also modified ia64.c to support UNSPEC_TNAT, in the function rtx_needs_barrier:
        case UNSPEC_FR_SPILL:
        case UNSPEC_FR_RESTORE:
        case UNSPEC_GETF_EXP:
        case UNSPEC_SETF_EXP:
        case UNSPEC_ADDP4:
        case UNSPEC_FR_SQRT_RECIP_APPROX:
        case UNSPEC_TNAT:   /* support tnat instruction */

          if (XINT (x, 1) == UNSPEC_TNAT)
            {
              print_rtl_single (stderr, x);
              fflush (stderr);
            }

          need_barrier = rtx_needs_barrier (XVECEXP (x, 0, 0), flags, pred);
          break;
However, when I use the new GCC to compile the following function

#include <stdio.h>

long  ga[20] = {0, };
int   gb[20] = {0, };
char  gc[20] = {0, };
short gd[20] = {0, };

void test_leaf_function()
{
  fprintf(stderr, "in function test_leaf_function\n");

  if (ga[0] != 0)
  {
    ga[0] = 20;
    ga[0] = gd[1];
  }

  ga[0] = 100;
  ga[0] = 0;

  if (gb[0] != 5)
    ga[0] = gb[0];
  else
    ga[0] = gb[1];

  ga[3] = ga[2]+gc[1];

  gc[0] = 0;
  gd[0] = 0;
}


it reports the error:
error insn:
(insn 163 185 312 0 giftlib_test.c:90 (set (reg:BI 263 p7)
(unspec:BI [
(reg/f:DI 14 r14 [376])
] 32)) 300 {shift_tnat} (insn_list:REG_DEP_ANTI 187
(insn_list:REG_DEP_ANTI 178 (insn_list:REG_DEP_ANTI 179
(insn_list:REG_DEP_ANTI 181 (insn_list:REG_DEP_ANTI 186
(insn_list:REG_DEP_OUTPUT 176 (insn_list:REG_DEP_TRUE 77 (nil
(nil))
giftlib_test.c: In function  foo
giftlib_test.c:99: internal compiler error: in bundling, at
config/ia64/ia64.c:7457
Please submit a full bug report,
with preprocessed source if appropriate.
See URL:http://gcc.gnu.org/bugs.html for instructions.

I traced the error to ia64.c:7457, where an assertion fails:
          /* Move the position backward in the window.  Group barrier has
             no slot.  Asm insn takes all bundle.  */
          if (INSN_CODE (insn) != CODE_FOR_insn_group_barrier
              && GET_CODE (PATTERN (insn)) != ASM_INPUT
              && asm_noperands (PATTERN (insn)) < 0)
            pos--;
          /* Long insn takes 2 slots.  */
          if 

Re: support tnat instruction on IA-64. error occurs in bundling. help

2007-09-25 Thread
2007/9/26, Jim Wilson [EMAIL PROTECTED]:
 吴曦 wrote:
  [(set_attr "itanium_class" "tnat")])

 The itanium_class names are based on info from the Itanium Processor
 Microprocessor Reference by the way.

 I believe the problem is that you didn't add info to the DFA scheduler
 descriptions in the itanium1.md and itanium2.md files for this new
 instruction class.  Normally the DFA scheduler info is optional.
 However, for itanium, we also use the scheduler for bundling, and hence
 proper DFA scheduler info for each instruction class is required.

 It appears that the tnat instruction schedules and bundles the same as
 the tbit instruction, so just use the existing tbit class instead of
 trying to add a new one.  The docs are a bit unclear here though, since
 some places mention tbit and tnat, and other places just mention tbit.
 For your purposes, this isn't important.

 Modifying the DFA scheduler descriptions is complicated.  It is best to
 avoid that if you can.

 Specifying that tnat is an I type instruction isn't enough for bundling
 purposes, since a lot of instructions have further restrictions.  In
 this case, for instance, tnat can only go into an I0 slot, not an I1
 slot.  This detail is handled in the DFA scheduler descriptions.
 --
 Jim Wilson, GNU Tools Support, http://www.specifix.com



Many thanks. I discovered this problem after I sent the first
mail, and I found that itanium1.md and itanium2.md describe the pipeline
hazards, but they are really complex... :-(. Is there any guide or
documentation on this? Thanks.

Anyway, I have changed the tnat class to tbit, and it seems to be working now.

Thanks again


support single predicate set instructions in GCC-4.1.1

2007-09-22 Thread
Hi.

I am working on the Itanium architecture with GCC-4.1.1.

I modified the machine description file ia64.md to support a single
predicate set instruction such as:

 (%0) cmp.ne %1, p0 = %2, %3

Here %0 and %1 are predicates, %2 is a register or immediate, and %3 is a
register operand.

More specifically, I added the following define_insn:

(define_insn "*shift_predicate_cmp"
  [(set (const_int 0)
        (and:BI (and:BI (match_operand:BI 1 "register_operand" "c")
                        (and:BI (match_operand:DI 2 "gr_reg_or_8bit_adjusted_operand" "rL")
                                (match_operand:DI 3 "gr_register_operand" "r")))
                (match_operand:BI 0 "register_operand" "c")))]
  ""
  "(%0) cmp.ne %1, p0 = %2, %3"
  [(set_attr "itanium_class" "icmp")])

I made this define_insn anonymous because I only need this type of
instruction occasionally, and I can generate the pattern manually. The
generation function is:

rtx gen_shift_predicate_cmp (rtx op0, rtx op1, rtx op2, rtx op3)
{
  return gen_rtx_SET (BImode, CONST0_RTX (BImode),
                      gen_rtx_AND (BImode,
                                   gen_rtx_AND (BImode, op1,
                                                gen_rtx_AND (BImode, op2, op3)),
                                   op0));
}

After adding this support, I recompiled gcc and inserted some
instructions of this kind into the insn list. Generation and matching
work fine, BUT the generation of insn group barriers (';;' on the Itanium
architecture) does not, and gcc generates code like:

.loc 1 118 0
(p0) cmp.ne p15, p0 = 0, r30
.loc 1 121 0
(p0) cmp.ne p15, p0 = 0, r30

The assembler warns about a WAW dependency, and there should be a stop ;;
between these two instructions. Intuitively, I think I should modify some
part of GCC to generate the correct ;; since there is a new type of
define_insn. But I don't know where exactly to make this modification
:-(

Any help is truly appreciated!

Thanks very much


About allocating registers for instrumentation

2007-09-03 Thread
Hi, I am working on gcc-4.1.1 and the Itanium architecture. So far I
have finished instrumenting ld and st instructions before the second
scheduling pass by reserving two global registers in the backend. However,
to improve performance (e.g. to give the scheduler more freedom),
I now want to allocate two registers for each instrumentation instead of
using the reserved ones. To identify which registers I can use for
each ld and st instruction, I follow this idea:

For each insn, I compute its live-in and live-out sets, starting from the
basic block:
since we can get the live-in set of the basic block, for INSN(N) in the
basic block,
  (1) live-in[ INSN(N) ] = live-out[ INSN(N-1) ]
  (2) live-out[ INSN(N) ] = (live-in[ INSN(N) ] U SET)
                            - (REG_DEAD U REG_UNUSED)

where SET is the set of registers set by the insn, and REG_DEAD and
REG_UNUSED can be obtained from the insn notes.

Then R - ( live-in[INSN(N)] U live-out[INSN(N)] ) is the set of
registers I can use to instrument INSN(N). (Here R is a set of
registers I specify, for example, all the caller-save global general
registers.)
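
The two equations and the final set difference can be tried out in a small standalone sketch. This is a toy model under stated assumptions, not GCC code: registers are bits of a 64-bit mask, and regset, struct toy_insn and compute_free_regs are hypothetical names. It only restates the equations as posted and does not settle whether they are sufficient for every insn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy register set: one bit per register, at most 64 registers.  */
typedef uint64_t regset;

struct toy_insn
{
  regset set;    /* registers written by the insn (SET) */
  regset dead;   /* registers named in REG_DEAD / REG_UNUSED notes */
};

/* Walk one basic block forward, applying
     (1) live-in[N]  = live-out[N-1]
     (2) live-out[N] = (live-in[N] | SET) & ~(REG_DEAD | REG_UNUSED)
   and store R & ~(live-in[N] | live-out[N]) in free_regs[N].  */
void
compute_free_regs (const struct toy_insn *insns, size_t n,
                   regset bb_live_in, regset candidates, regset *free_regs)
{
  assert (insns != NULL && free_regs != NULL);

  regset live_in = bb_live_in;
  for (size_t i = 0; i < n; i++)
    {
      regset live_out = (live_in | insns[i].set) & ~insns[i].dead;
      free_regs[i] = candidates & ~(live_in | live_out);
      live_in = live_out;   /* equation (1) for the next insn */
    }
}
```

For example, with block live-in {r0, r1}, candidates {r0..r3}, a first insn that sets r2 while r1 dies, and a second insn in which r0 dies, the free sets come out as {r3} and then {r1, r3}.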

Am I right? If there is anything I have misunderstood, please
point it out, thanks!

Further, how do I identify SET in (2)? I have found that many of the insns
just before the second scheduling pass contain only one set. If this
holds for all insns, I think I can use single_set to get SET. Are
there any exceptions to that? Thanks again

Wu


Re: How to make use of instruction scheduling to improve performance?

2007-07-29 Thread
2007/7/29, 吴曦 [EMAIL PROTECTED]:
 28 Jul 2007 12:16:51 -0700, Ian Lance Taylor [EMAIL PROTECTED]:
  吴曦 [EMAIL PROTECTED] writes:
 
   28 Jul 2007 09:04:01 -0700, Ian Lance Taylor [EMAIL PROTECTED]:
吴曦 [EMAIL PROTECTED] writes:
   
 there are some questions after I read the source code today.
 1st. if I add the instrumentation before 2nd scheduling; will gcc emit
 an insn which will be output as a ld instruction later? If this could
 happen, some ld instruction may not be instrumented...
   
No, gcc won't introduce any new memory load or store instructions
after the prologue and epilogue instructions are threaded.  It may
   ~~~
   when are prologue and epilogue instructions threaded? (after register
   allocation? besides, what is the exact meaning of prologue and
   epilogue instructions are threaded? Would you mind explaining in more
   detail? thx :-))
 
  If you look in gcc/passes.c you will see the list of passes.  The
  prologue and epilogue instructions are threaded in
  pass_thread_prologue_and_epilogue.  This happens after register
 ~
 Sorry, I didn't find that pass in gcc 4.1.1. This pass is added in the
 newest gcc?
 thx.

  allocation.  It means that the prologue and epilogue instructions are
 ~~
 As you have indicated, this pass happens after register allocation, I
 want to allocate register rather than dedicating register to do the
 instrumentation calculation, are there any hints to do this?

  added to the RTL, so that the second scheduling pass can see them.
 
still move them around or eliminate them, though.
   ~~
   emmm, I need to move/remove my instrumentation if necessary...
 
  Yes.  This is true by definition, since you want to instrument before
  the second scheduling pass.  The scheduler can and will move load and
  store instructions.  You need to set up the dependencies so that your
  instrumentation will still occur at the right time.
 
 2nd. to identify ld/st instruction (memory access op), I want to
 modify gen_rtx_SET, the method is that, if I find SRC or DST is an
 memory operand in gen_rtx_SET, then add instrumentation code before
 and after the insn to emit. Will this method work? Besides, if some
 false positives occur, how to correct them (I don't have some very
 clear idea.)
   
Modifying gen_rtx_SET is probably not the right way to go.  That is
   ~
   Then, what about modifying machine description file? Add define_expand
   for the define_insn which will output ld/st instruction (this
   define_expand can insert instrumentation insns. Of course, I need to
   identify the operands to the define_expand contains a memory operand
   and a reg operand.)
 
  That will work in some sense, but if a load or store instruction is
  eliminated you are quite likely to still have the instrumentation
  instructions lying around.
 
  Ian
 
 Thanks for your hints.

In gcc 4.1.1, rest_of_handle_flow2 calls thread_prologue_and_epilogue_insns;
maybe I need to move to a newer version of gcc.


Re: How to make use of instruction scheduling to improve performance?

2007-07-28 Thread
2007/7/28, Ramana Radhakrishnan [EMAIL PROTECTED]:
 Hi,


 On 7/28/07, 吴曦 [EMAIL PROTECTED] wrote:
  I am working on gcc 4.1.1 and itanium2 architecture. I instrumented
  each ld and st instruction in final_scan_insn() by looking at the 
  insn
  template (These instrumentations are used to do some security 
  checks).
  These instrumentations incur high performance overhead when running
  specint benchmarks. However, these instrumentations contain high
  dependencies between instructions so that I want to use instruction
  scheduling to improve the performance.
  In the current implementation, the instrumentations are emitted 
  as
  assembly instructions (not insns). What should I do to make use of 
  the
  instruction scheduler?

 If I understand your description, you are adding instrumentation code,
 and you want to expose that code to the scheduler.  What you need to
 do in that case is to add the code as RTL instructions before the
 scheduling pass runs.  You will need to figure out the RTL which will
 do what you want.  Then you will need to insert it around the
   
 instructions which you want to instrument.  You will probably want to
~
Before the second scheduling pass, how to identify that one insn will
be output as a load instruction (or store instruction)? In the final,
i use get_insn_template() to do this matching. Can I use the same
method before the second scheduling pass? If not, would you mind
giving some hints? thx
  
   Please send followups to the mailing list, not just to me.  Thanks.
  
   You should just match on the RTL.  I don't know enough about the
   Itanium to tell you precisely what to look for.  But, for example, you
   might look for
  s = single_set (insn);
  if (s != NULL && (MEM_P (SET_SRC (s)) || MEM_P (SET_DEST (s))))
    ...
  
   Ian
  
 
  Thanks. I observe that the 2nd instruction scheduling happens after
  the local and global allocation. However, in my instrumentation, I
  need several registers to do computation, can I allocate registers to
  do computation in the instrumentation code just before the 2nd
  instruction scheduling? If so, would you mind giving some hints on the
  interfaces that I could make use of.

 Generally you should be able to create new temporaries for such
 calculations before register allocation / reload . Otherwise you might
 have to resort to reserving a couple of registers in your ABI for such
 computations if you wanted these generated after reload (you could
 have a split that did that after reload but where in the function do
 you want to insert the instrumentation code ?)

 From what you are indicating - there isn't enough detail about where
~
 in the function body you are inserting such instrumentation code  -

Thanks. As I have indicated, I want to add instrumentation for each
ld and st instruction in a function on Itanium. (In my current
implementation, I also instrument cmp and mv instructions.)
For example, for a ld instruction in the original program:
 ld rX=[rY]
I want to instrument it as
 instrumentation prologue
 ld rX=[rY]
 instrumentation epilogue
Currently, to identify such a ld instruction, I put my instrumentation
in final and use get_insn_template() to see what instruction the
insn will be output as.

To summarize, since I want to expose my instrumentation to instruction
scheduling, the following work should be done:
   1. identify that an insn will be output as a ld instruction
   2. allocate registers for the instrumentation calculation (in my
      current implementation, I use dedicated registers for this)
   3. emit the prepared instrumentation insns
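
Steps 1 and 3 of this summary can be sketched on a toy insn list. This is not GCC's RTL API: struct toy_insn, toy_is_mem_access, toy_new_insn and toy_instrument are hypothetical names, each toy insn stands for a single set with a source and a destination, and register allocation (step 2) is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_kind { TOY_REG, TOY_MEM };
enum toy_role { TOY_ORIGINAL, TOY_PROLOGUE, TOY_EPILOGUE };

/* A toy insn: one set with a source and a destination operand kind,
   linked into a doubly linked list like an insn chain.  */
struct toy_insn
{
  enum toy_kind src, dest;
  enum toy_role role;
  struct toy_insn *prev, *next;
};

/* Step 1: a load reads memory, a store writes memory.  */
int
toy_is_mem_access (const struct toy_insn *insn)
{
  return insn->src == TOY_MEM || insn->dest == TOY_MEM;
}

/* Allocate a register-only instrumentation insn with the given role.  */
struct toy_insn *
toy_new_insn (enum toy_role role)
{
  struct toy_insn *insn = calloc (1, sizeof *insn);
  assert (insn != NULL);
  insn->src = insn->dest = TOY_REG;
  insn->role = role;
  return insn;
}

/* Step 3: wrap every original memory access in a prologue insn and an
   epilogue insn.  Returns the number of accesses instrumented.  */
int
toy_instrument (struct toy_insn **head)
{
  int count = 0;
  for (struct toy_insn *insn = *head; insn != NULL; insn = insn->next)
    {
      if (insn->role != TOY_ORIGINAL || !toy_is_mem_access (insn))
        continue;

      struct toy_insn *pro = toy_new_insn (TOY_PROLOGUE);
      struct toy_insn *epi = toy_new_insn (TOY_EPILOGUE);

      /* Link the prologue in front of the access.  */
      pro->prev = insn->prev;
      pro->next = insn;
      if (insn->prev != NULL)
        insn->prev->next = pro;
      else
        *head = pro;
      insn->prev = pro;

      /* Link the epilogue behind the access.  */
      epi->prev = insn;
      epi->next = insn->next;
      if (insn->next != NULL)
        insn->next->prev = epi;
      insn->next = epi;

      count++;
    }
  return count;
}
```

Because the instrumentation insns carry their own role, the loop skips them when it reaches the freshly inserted epilogue, so each original access is wrapped exactly once.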

 If you are doing such instrumentation in the prologue or epilogue of a
 function, you could choose to use gen_reg_rtx to obtain a temporary
 register.

 So typically obtain a temporary register in the following manner
  rtx tmp_reg = gen_reg_rtx (machinemode);

 Use the tmp_reg in whatever instruction you want to generate using the
 corresponding register as one of the operands .  For these you might
 want to use the corresponding gen_*** named functions .

 cheers
 Ramana








 Besides,  what happens if I move the insertion of instrumentation
  before register allocation,  or even before the 1st scheduling pass,
  can I identify load/store instructions that early?
 


 --
 Ramana Radhakrishnan


Thanks for your hints.


Re: How to make use of instruction scheduling to improve performance?

2007-07-28 Thread
2007/7/28, 吴曦 [EMAIL PROTECTED]:
 2007/7/28, Ramana Radhakrishnan [EMAIL PROTECTED]:
  Hi,
 
 
  On 7/28/07, 吴曦 [EMAIL PROTECTED] wrote:
   I am working on gcc 4.1.1 and itanium2 architecture. I 
   instrumented
   each ld and st instruction in final_scan_insn() by looking at the 
   insn
   template (These instrumentations are used to do some security 
   checks).
   These instrumentations incur high performance overhead when 
   running
   specint benchmarks. However, these instrumentations contain high
   dependencies between instructions so that I want to use 
   instruction
   scheduling to improve the performance.
   In the current implementation, the instrumentations are 
   emitted as
   assembly instructions (not insns). What should I do to make use 
   of the
   instruction scheduler?
 
  If I understand your description, you are adding instrumentation 
  code,
  and you want to expose that code to the scheduler.  What you need to
  do in that case is to add the code as RTL instructions before the
  scheduling pass runs.  You will need to figure out the RTL which 
  will
  do what you want.  Then you will need to insert it around the

  instructions which you want to instrument.  You will probably want 
  to
 ~
 Before the second scheduling pass, how to identify that one insn will
 be output as a load instruction (or store instruction)? In the final,
 i use get_insn_template() to do this matching. Can I use the same
 method before the second scheduling pass? If not, would you mind
 giving some hints? thx
   
Please send followups to the mailing list, not just to me.  Thanks.
   
You should just match on the RTL.  I don't know enough about the
Itanium to tell you precisely what to look for.  But, for example, you
might look for
   s = single_set (insn);
   if (s != NULL && (MEM_P (SET_SRC (s)) || MEM_P (SET_DEST (s))))
     ...
   
Ian
   
  
   Thanks. I observe that the 2nd instruction scheduling happens after
   the local and global allocation. However, in my instrumentation, I
   need several registers to do computation, can I allocate registers to
   do computation in the instrumentation code just before the 2nd
   instruction scheduling? If so, would you mind giving some hints on the
   interfaces that I could make use of.
 
  Generally you should be able to create new temporaries for such
  calculations before register allocation / reload . Otherwise you might
  have to resort to reserving a couple of registers in your ABI for such
  computations if you wanted these generated after reload (you could
  have a split that did that after reload but where in the function do
  you want to insert the instrumentation code ?)
 
  From what you are indicating - there isn't enough detail about where
  in the function body you are inserting such instrumentation code  -
 
 thx. As I have indicated, I want to add instrumentation for each
 ld and st instruction in one function on itanium. (In my current
 implementation, I also instrument cmp and mv instructions on itanium).
 for example, for a ld instruction in the original program:
 ld rX=[rY]
 I want to instrument it as
 instrumentation prologue
 ld rX=[rY]
 instrumentation epilogue
 currently, to identify such ld instruction, I put my instrumentation
 in final, and use get_insn_template() to see what instruction this
 insn will be output as.

 To summarize, as I want to expose my instrumentation to instruction
 scheduling, following work should be done:
   1. identify that one insn will be output as a
 ld instruction
   2. allocate register to do the instrumentation
 calculation (in my current implementation, I use dedicated register to
 do this.)
   3. emit the prepared instrumentation insn
 
  If you are doing such instrumentation in the prologue or epilogue of a
  function, you could choose to use gen_reg_rtx to obtain a temporary
  register.
 
  So typically obtain a temporary register in the following manner
   rtx tmp_reg = gen_reg_rtx (machinemode);
 
  Use the tmp_reg in whatever instruction you want to generate using the
  corresponding register as one of the operands .  For these you might
  want to use the corresponding gen_*** named functions .
 
  cheers
  Ramana
 
 
 
 
 
 
 
 
  Besides,  what happens if I move the insertion of instrumentation
   before register allocation,  or even before the 1st scheduling pass,
   can I identify load/store instructions that early?
  
 
 
  --
  Ramana

Re: How to make use of instruction scheduling to improve performance?

2007-07-28 Thread
28 Jul 2007 09:04:01 -0700, Ian Lance Taylor [EMAIL PROTECTED]:
 吴曦 [EMAIL PROTECTED] writes:

  there are some questions after I read the source code today.
  1st. if I add the instrumentation before 2nd scheduling; will gcc emit
  an insn which will be output as a ld instruction later? If this could
  happen, some ld instruction may not be instrumented...

 No, gcc won't introduce any new memory load or store instructions
 after the prologue and epilogue instructions are threaded.  It may
When are the prologue and epilogue instructions threaded? After register
allocation? Also, what exactly does it mean for the prologue and
epilogue instructions to be threaded? Would you mind explaining in more
detail? thx :-)

 still move them around or eliminate them, though.
emmm, I need to move/remove my instrumentation if necessary...


  2nd. to identify ld/st instruction (memory access op), I want to
  modify gen_rtx_SET, the method is that, if I find SRC or DST is a
  memory operand in gen_rtx_SET, then add instrumentation code before
  and after the insn to emit. Will this method work? Besides, if some
  false positives occur, how to correct them (I don't have a very
  clear idea.)

 Modifying gen_rtx_SET is probably not the right way to go.  That is
Then what about modifying the machine description file? I could add a
define_expand for each define_insn that outputs a ld/st instruction, and
this define_expand could insert the instrumentation insns. (Of course, I
would need to check that the operands of the define_expand include a
memory operand and a reg operand.)

 used in many places throughout the RTL passes.  Not all of those
 places are going to be able to cope with the new instructions you want
 to add.

 Ian


Thanks for your hints again :-)


How to make use of instruction scheduling to improve performance?

2007-07-27 Thread
I am working on gcc 4.1.1 and itanium2 architecture. I instrumented
each ld and st instruction in final_scan_insn() by looking at the insn
template (These instrumentations are used to do some security checks).
These instrumentations incur high performance overhead when running
specint benchmarks. However, these instrumentations contain high
dependencies between instructions so that I want to use instruction
scheduling to improve the performance.
In the current implementation, the instrumentations are emitted as
assembly instructions (not insns). What should I do to make use of the
instruction scheduler?

Any help is truly appreciated!
Thanks


Re: How to make use of instruction scheduling to improve performance?

2007-07-27 Thread
I am working on gcc 4.1.1 and itanium2 architecture. I instrumented
each ld and st instruction in final_scan_insn() by looking at the insn
template (These instrumentations are used to do some security checks).
These instrumentations incur high performance overhead when running
specint benchmarks. However, these instrumentations contain high
dependencies between instructions so that I want to use instruction
scheduling to improve the performance.
In the current implementation, the instrumentations are emitted as
assembly instructions (not insns). What should I do to make use of the
instruction scheduler?
  
   If I understand your description, you are adding instrumentation code,
   and you want to expose that code to the scheduler.  What you need to
   do in that case is to add the code as RTL instructions before the
   scheduling pass runs.  You will need to figure out the RTL which will
   do what you want.  Then you will need to insert it around the
 
   instructions which you want to instrument.  You will probably want to
  Before the second scheduling pass, how to identify that one insn will
  be output as a load instruction (or store instruction)? In the final,
  i use get_insn_template() to do this matching. Can I use the same
  method before the second scheduling pass? If not, would you mind
  giving some hints? thx

 Please send followups to the mailing list, not just to me.  Thanks.

 You should just match on the RTL.  I don't know enough about the
 Itanium to tell you precisely what to look for.  But, for example, you
 might look for
    s = single_set (insn);
    if (s != NULL && (MEM_P (SET_SRC (s)) || MEM_P (SET_DEST (s))))
      ...

 Ian


Thanks. I observe that the 2nd instruction scheduling happens after
the local and global allocation. However, in my instrumentation, I
need several registers to do computation, can I allocate registers to
do computation in the instrumentation code just before the 2nd
instruction scheduling? If so, would you mind giving some hints on the
interfaces that I could make use of.
   Besides,  what happens if I move the insertion of instrumentation
before register allocation,  or even before the 1st scheduling pass,
can I identify load/store instructions that early?
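
For illustration, the RTL match suggested earlier in this thread (a single SET whose source or destination is a MEM) can be modeled outside GCC. The sketch below uses hypothetical toy types (toy_rtx, toy_classify and friends are made up for this note and are not GCC's rtx or rtl.h); it only shows the shape of the test, under the assumption that a load is "SET with MEM source" and a store is "SET with MEM destination".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; NOT the real definitions from rtl.h. */
enum toy_code { TOY_REG, TOY_MEM, TOY_SET, TOY_PARALLEL };

struct toy_rtx {
  enum toy_code code;
  const struct toy_rtx *op0;  /* SET: destination; MEM: address */
  const struct toy_rtx *op1;  /* SET: source; unused otherwise */
};

/* Mirrors the idea of single_set: hand back the pattern if it is a
   plain SET, otherwise NULL.  */
const struct toy_rtx *toy_single_set(const struct toy_rtx *pattern) {
  assert(pattern != NULL);
  return pattern->code == TOY_SET ? pattern : NULL;
}

enum toy_access { TOY_NONE, TOY_LOAD, TOY_STORE };

/* Classify a pattern as load, store, or neither.  A SET is assumed to
   have both operands present.  A memory-to-memory SET would be reported
   as a load here; the toy model does not distinguish that case.  */
enum toy_access toy_classify(const struct toy_rtx *pattern) {
  const struct toy_rtx *s = toy_single_set(pattern);
  if (s == NULL)
    return TOY_NONE;
  if (s->op1->code == TOY_MEM)
    return TOY_LOAD;
  if (s->op0->code == TOY_MEM)
    return TOY_STORE;
  return TOY_NONE;
}
```

Inside GCC the corresponding pieces would be single_set, SET_SRC, SET_DEST and MEM_P, as named in the reply quoted above; whether such a SET really ends up as an ld or st on Itanium still depends on the machine description.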


Any hints on this problem? Thanks!

2007-02-09 Thread

Hi,
I am working on gcc-4.1.1 and the Itanium architecture. Today I tried to
add a function call before each ld instruction. The method I use is to
modify final_scan_insn() in final.c: before calling get_insn_template,
I add code to check whether the insn matches a template that will emit
a ld instruction, then I use emit_library_call to emit new insns and
output them by calling final_scan_insn() again. The modified gcc builds
successfully, and when I use it to compile a program, I observe that it
intercepts each ld instruction and adds the desired function call
before it.
But here is the problem: when I run the program compiled by the hacked
gcc, it crashes with a segmentation fault. I used gdb to debug the
program and observed the cause: originally, what I want to do is
ld r14=[r14], and r14 contains the correct address, but my inserted
function call, say FOO, sets r14 to 0, so when the program returns from
FOO and loads from r14 again, it crashes. Here is a concrete example,
just a very simple one to illustrate the situation:

~~~
old code:
main:
...
ld r14=[r14]
...
~~~

~~~
new code:
FOO:
...
mov r14=0
...

main:
...
br.call FOO
ld r14=[r14]    /* CRASH! */
...
~~~
Now my question becomes clear. How can I make my inserted function call
not affect the original state of the program? Furthermore, if I add more
instructions (not only a function call), how can I preserve that
state? Is there a general way to do this?
Any hints on this problem will be *truly* appreciated. Thanks!

Best Regards

Andy.Wu


Re: Any hints on this problem? Thanks!

2007-02-09 Thread


Make sure that the called function restores the original state of the
program before it returns.

Andreas.

Thanks~. I know the goal is to restore the original state before the
inserted function returns. BUT, how? Is there any way to tell gcc:
"Hey, you should restore the original state before that function
returns"? I want hints on how to do this and on the existing interfaces
in gcc for it :-).
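
As a picture of what "restoring the original state" means here, the following is a toy model only, not GCC code: the register file is a plain array, toy_foo stands in for the inserted FOO that zeroes r14, and toy_call_preserving is a hypothetical wrapper that saves every register before the call and puts it back afterwards. All names are invented for this sketch.

```c
#include <assert.h>
#include <string.h>

#define TOY_NREGS 32

/* Toy register file; index 14 plays the role of r14 in the example. */
long toy_regs[TOY_NREGS];

/* Stand-in for the inserted function FOO, which clobbers r14. */
void toy_foo(void) {
  toy_regs[14] = 0;
}

/* Call the hook, then restore every register to its value from before
   the call, so the code after the call sees the original state.  */
void toy_call_preserving(void (*hook)(void)) {
  long saved[TOY_NREGS];
  assert(hook != NULL);
  memcpy(saved, toy_regs, sizeof saved);
  hook();
  memcpy(toy_regs, saved, sizeof saved);
}
```

In a real compiler the save and restore would have to be instructions emitted around the call, or the call would have to be introduced early enough that the register allocator knows about it; the toy only shows the invariant that the following ld relies on.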


Re: Any hints on this problem? Thanks!

2007-02-09 Thread

Another solution is to add the instrumentation earlier, and use expand_call.


Thanks for your hints. Does that mean doing the instrumentation at the
RTL expand level? I have tried the following method: add a
define_expand in ia64.md whose template is the same as the one that
will emit a ld instruction, like this:

~~
;; expand the ld operation with check code if user turns on
;; fld-checking
(define_expand "gift_load_symptr_low"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (lo_sum:DI (match_operand:DI 1 "register_operand" "r")
                   (match_operand 2 "got_symbolic_operand" "s")))]
  ""
{
  if (flag_ld_checking)
    {
      printf ("gift_load_symptr_low emits checking function call\n");
      emit_library_call (gen_rtx_SYMBOL_REF (Pmode, \"gift_check_bitmap\"),
                         0, VOIDmode, 0);
      emit_insn (gen_rtx_SET (VOIDmode,
                              operands[0],
                              gen_rtx_LO_SUM (DImode,
                                              operands[1],
                                              operands[2])));
    }
  DONE;
})
~~
BUT, when I use the newly built compiler to compile my program,
nothing matches this pattern, so no such ld instruction is expanded ...


Re: error: unable to generate reloads for..., any hints?

2007-02-08 Thread

Thanks. But what does the following passage mean?
"Sometimes an insn can match more than one instruction pattern. Then
the pattern that appears first in the machine description is the one
used."
It is in section 14.10 of the gcc internals manual, p. 259.
08 Feb 2007 00:09:21 -0800, Ian Lance Taylor [EMAIL PROTECTED]:

吴曦 [EMAIL PROTECTED] writes:

 I observe that there is a ld instruction in the 3rd alternative, so I
 add a new define_insn before it in the hope that it will be matched
 first.

It doesn't work that way.  Your new instruction will wind up matching
all move instructions.  Reload will crash because the constraints
don't work.

Instead just change the existing movqi_internal insn.  Don't try to
write a new one.  Change the existing insn to use C code which checks
which_alternative instead of the @ list it uses now.

Ian
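
A rough sketch of the suggestion above, written as a standalone C function so it can be compiled outside GCC. The template strings are the ones from *movqi_internal in the original post; the function name movqi_output_template, the counter ld_alternatives_seen, and the use of a plain global for flag_check_ld are assumptions made for this illustration. In the real machine description the body would sit in the define_insn's output statement, where which_alternative is supplied by GCC rather than passed as a parameter.

```c
#include <assert.h>

/* Stand-in for the flag from the original post; here just a global. */
int flag_check_ld = 0;

/* Counts how often the load alternative was chosen while the flag is
   set.  Illustration only: this is where checking logic would hook in.  */
int ld_alternatives_seen = 0;

/* Select the output template by alternative number instead of using
   the "@" list.  Alternative 2 is the ld1 case from *movqi_internal.  */
const char *movqi_output_template(int which_alternative) {
  assert(which_alternative >= 0 && which_alternative <= 6);
  switch (which_alternative) {
    case 0: return "mov %0 = %r1";
    case 1: return "addl %0 = %1, r0";
    case 2:
      if (flag_check_ld)
        ld_alternatives_seen++;
      return "ld1%O1 %0 = %1%P1";
    case 3: return "st1%Q0 %0 = %r1%P0";
    case 4: return "getf.sig %0 = %1";
    case 5: return "setf.sig %0 = %r1";
    default: return "mov %0 = %1";
  }
}
```

Because the operand predicates and constraints of the existing insn are left untouched, reload sees exactly the same alternatives as before; only the text that is printed for alternative 2 can differ.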



error: unable to generate reloads for..., any hints?

2007-02-07 Thread

Hi,
I am working on gcc 4.1.1 and the Itanium architecture. I want to modify
the machine description in ia64.md to add some checks before each ld
instruction. The following is the original define_insn:

(define_insn "*movqi_internal"
  [(set (match_operand:QI 0 "destination_operand" "=r,r,r, m, r,*f,*f")
        (match_operand:QI 1 "move_operand"        "rO,J,m,rO,*f,rO,*f"))]
  "ia64_move_ok (operands[0], operands[1])"
  "@
   mov %0 = %r1
   addl %0 = %1, r0
   ld1%O1 %0 = %1%P1
   st1%Q0 %0 = %r1%P0
   getf.sig %0 = %1
   setf.sig %0 = %r1
   mov %0 = %1"
  [(set_attr "itanium_class" "ialu,ialu,ld,st,frfr,tofr,fmisc")])

I observe that there is a ld instruction in the 3rd alternative, so I
add a new define_insn before it in the hope that it will be matched
first.

(define_insn "*ld_movqi_internal"
  [(set (match_operand:QI 0 "destination_operand" "=r")
        (match_operand:QI 1 "move_operand" "m"))]
  "ia64_move_ok (operands[0], operands[1])
   && flag_check_ld"
  {
    printf("define_insn ld_movqi_internal\n");
    return "ld1%O1 %0 = %1%P1";
  }
  [(set_attr "itanium_class" "ld")])

I keep everything the same as the 3rd alternative in the original
define_insn, except that I use a C statement to return the desired
output template. However, when I use the newly built gcc to compile the
following program, it crashes.

#include <stdio.h>

char characters[8192]={'a',};

int main()
{
    char c = characters[0];
    printf("Hello World! c:%c\n", c);
}

the error reported is:
hi.c:9: error: unable to generate reloads for:
(insn 10 9 12 1 (set (mem/c/i:QI (reg/f:DI 111 loc79) [0 c+0 S1 A128])
   (reg:QI 14 r14 [orig:342 characters ] [342])) 3
{*gift_movqi_internal_ld} (nil)
   (expr_list:REG_DEAD (reg:QI 14 r14 [orig:342 characters ] [342])
   (nil)))
hi.c:9: internal compiler error: in find_reloads, at reload.c:3738

On IA64, the first pseudo register number is 334, thus register 111
and register 14 are both hardware registers.

I looked at find_reloads in reload.c and found the following code
fragment and comment:

      /* The operands don't meet the constraints.
         goal_alternative describes the alternative
         that we could reach by reloading the fewest operands.
         Reload so as to fit it.  */

      if (best == MAX_RECOG_OPERANDS * 2 + 600)
        {
          /* No alternative works with reloads??  */
          if (insn_code_number >= 0)
            fatal_insn ("unable to generate reloads for:", insn);
          ...

So, what is going on here? In particular, what is find_reloads trying
to do, and why does it fail here?

I would appreciate any help on this question, thx!

Best Regards

--andy.wu


Re: Some hints on solving this problem?

2007-02-04 Thread

Thanks for the hints. I have already noticed that the insn list is
matched against the RTL templates to emit assembly code. However, I found
3 md files (ia64.md, itanium.md, itanium2.md), and each file is very big.
Would you mind giving some hints on the differences between them?
I am working on the itanium2 architecture in particular. (I have found
many insn templates in ia64.md, but what about itanium.md and itanium2.md?)

On 07-2-4, Paul Yuan [EMAIL PROTECTED] wrote:

1) Modify the final() in final.c to emit some code before ld and st
before outputting the assembly.
2) Modify the MD file. Find the template which generate ld or st, and
add some code before ld  and st.

On 2/3/07, 吴曦 [EMAIL PROTECTED] wrote:
 Hi,
 I am working on gcc 4.1.1 and Itanium2 architecture. I want to use gcc
 to emit some code before each ld and st instruction (I know that using
 dynamic binary translator like PIN may be more suitable for this task,
 but I am on the way of studying gcc and want to use it to achieve this
 goal). But after several days of study, I find that the back-end of
 gcc is too complex... :-(

 So, what is the best level in back-end to accomplish this task?

 I would appreciate any help I can get on this problem!

 thx!



--
Paul Yuan
www.yingbo.com



Some hints on solving this problem?

2007-02-03 Thread

Hi,
I am working on gcc 4.1.1 and Itanium2 architecture. I want to use gcc
to emit some code before each ld and st instruction (I know that using
dynamic binary translator like PIN may be more suitable for this task,
but I am on the way of studying gcc and want to use it to achieve this
goal). But after several days of study, I find that the back-end of
gcc is too complex... :-(

So, what is the best level in back-end to accomplish this task?

I would appreciate any help I can get on this problem!

thx!


Level to do such a modification...

2007-01-23 Thread

Hi,
I am working on gcc 4.0.0. I want to use gcc to intercept each call to
read, and taint the data that is read in. For example, transform
    read(fd, buf, size)
to
    read(fd, buf, size)
    if(is_socket(fd))
        taint(buf, size)
So, what is the most suitable level to do this modification in gcc? My
own thought is in finish_function, before calling c_genericize, as I
discovered that in the C front end there's no GENERIC tree...
c_genericize directly calls gimplify_function_tree.
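
To make the intended result concrete, here is a source-level sketch of what the transformed call would behave like. It is not a GCC pass. is_socket is implemented with fstat and S_ISSOCK as one plausible choice; taint is a hypothetical stub that only counts bytes; tainting_read is a made-up wrapper name. One deliberate difference from the pseudocode above: the sketch taints only the bytes that read actually returned, not the full size.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical taint bookkeeping: this stub just counts tainted bytes. */
size_t tainted_bytes = 0;

void taint(const void *buf, size_t size) {
  (void)buf;
  tainted_bytes += size;
}

/* Returns nonzero if fd refers to a socket, zero otherwise (including
   when fd is invalid).  */
int is_socket(int fd) {
  struct stat st;
  return fstat(fd, &st) == 0 && S_ISSOCK(st.st_mode);
}

/* read() followed by the check that the transformation would insert. */
ssize_t tainting_read(int fd, void *buf, size_t size) {
  ssize_t n;
  assert(buf != NULL);
  n = read(fd, buf, size);
  if (n > 0 && is_socket(fd))
    taint(buf, (size_t)n);
  return n;
}
```

A compiler-based version would have to produce the same effect at every call site of read, which is why the question of the right level (front end, GIMPLE, or later) matters.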


Re: Level to do such a modification...

2007-01-23 Thread

I know valgrind, it is an emulator ,but we are restricted not to use
an emulator. :-(

2007/1/24, Nicholas Nethercote [EMAIL PROTECTED]:

On Wed, 24 Jan 2007, 吴曦 wrote:

 I am working on gcc 4.0.0. I want to use gcc to intercept each call to
 read, and taint the data readed in. For example:
 transform
   read(fd, buf, size)
 to
   read(fd, buf, size)
   if(is_socket(fd))
   taint(buf, size)
 So, what is the best suitable level to do this modification in gcc? My
 own thought is in finish_function, before calling c_genericize,as I
 discovered that in c front-end, there's no GENERIC tree... In
 c_genericize, it directly calls gimplify_function_tree.

Are you sure you want to do this in GCC?  You might find it easier to use a
dynamic binary instrumentation framework such as Valgrind or Pin to do this
kind of thing.

Nick



Re: Level to do such a modification...

2007-01-23 Thread

Anyway, the program is supervised... Would you mind giving some advice
on the compiler-based approach? Then I could apply this modification
simply by recompiling.

2007/1/24, Nicholas Nethercote [EMAIL PROTECTED]:

On Wed, 24 Jan 2007, 吴曦 wrote:

 I know valgrind, it is an emulator ,but we are restricted not to use
 an emulator. :-(

Well, for some definition of emulator.

Nick



passing arguments in emit_library_call

2007-01-07 Thread

Hi,
I want to use emit_library_call to output a library call to printf.
The question is how to pass a format string argument?

Also, the comment on emit_library_call mentions:

The rtx values should have been passed through protect_from_queue already.

then, what should I do to pass the rtx values through
protect_from_queue? Is there any doc or example to refer to?


Re: passing arguments in emit_library_call

2007-01-07 Thread

sorry for that~, I am using gcc3.4.0.
thanks for the hints on passing format string argument~

On 07 Jan 2007 20:25:29 -0800, Ian Lance Taylor [EMAIL PROTECTED] wrote:

吴曦 [EMAIL PROTECTED] writes:

 I want to use emit_library_call to output a library call to printf.
 The question is how to pass a format string argument?

See, e.g., how STRING_CST is handled in expand_expr_real_1.

 Also, in the comment of emit_library_call  mentions:

 The rtx values should have been passed through protect_from_queue already.

 then, what should I do to pass the rtx values through
 protect_from_queue? Is there any doc or example to refer to?

protect_from_queue is no longer used.  On the other hand, you didn't
mention which version you are using, and I don't see that comment in
the current sources.  So if you are using a version which still uses
protect_from_queue, just grep for it in the source code.

Ian



How to dedicate a register for special purpose in gcc?

2007-01-04 Thread

Hi,
How can I dedicate a register to a special purpose? That is, the
dedicated register should appear only in the code I insert myself and
never be allocated in the rest of the code. I have read some docs
(gccint) about register usage but still have no idea.

I would *really* appreciate any help I can get on this issue!
Xi Wu