Re: question on bitmap_set_subtract function in pre

2012-02-07 Thread Amker.Cheng
On Mon, Feb 6, 2012 at 7:28 PM, Richard Guenther wrote: > It's probably to have the SET in some canonical form - the resulting I am wondering how the canonical form is maintained, since according to the paper: For an antileader set, it does not matter which expression represents a value, as long a

question on bitmap_set_subtract function in pre

2012-02-05 Thread Amker.Cheng
Hi, In PRE, function compute_antic_aux uses bitmap_set_subtract to compute value/expression set subtraction. The comment of bitmap_set_subtract says it subtracts all the values and expressions contained in ORIG from DEST. But the implementation as following ---

Re: question on inconsistent generated codes for builtin calls

2012-01-15 Thread Amker.Cheng
On Fri, Jan 13, 2012 at 10:17 PM, Amker.Cheng wrote: > On Fri, Jan 13, 2012 at 5:33 PM, Richard Guenther > wrote: >> >> No, I think the check is superfluous and should be removed.  I also wonder >> why we exempt BUILT_IN_FREE here ... can you dig in SVN history a bit?

Re: question on inconsistent generated codes for builtin calls

2012-01-13 Thread Amker.Cheng
On Fri, Jan 13, 2012 at 5:33 PM, Richard Guenther wrote: > > No, I think the check is superfluous and should be removed.  I also wonder > why we exempt BUILT_IN_FREE here ... can you dig in SVN history a bit? > For both things? Thanks for clarifying. I will look into it. -- Best Regards.

question on inconsistent generated codes for builtin calls

2012-01-13 Thread Amker.Cheng
Hi, I noticed gcc generates inconsistent code for the same function body when it contains builtin calls. Compile the following program: -- #include <math.h> int a(float x) { return sqrtf(x); } int b(float x) { return sqrtf(x); } With command: arm-none-eabi-gcc -mthumb -mhar

Re: RFC: Handle conditional expression in sccvn/fre/pre

2012-01-03 Thread Amker.Cheng
On Mon, Jan 2, 2012 at 10:54 PM, Richard Guenther wrote: > Yes.  It won't handle > >  if (x > 1) >   ... >  tem = x > 1; > > or > >  if (x > 1) >   ... >  if (x > 1) > > though maybe we could teach PRE to do the insertion by properly > putting x > 1 into EXP_GEN in compute_avail (but not into AVA

Re: RFC: Handle conditional expression in sccvn/fre/pre

2012-01-02 Thread Amker.Cheng
On Mon, Jan 2, 2012 at 9:37 PM, Richard Guenther wrote: > Well, with > > Index: gcc/tree-ssa-pre.c > === > --- gcc/tree-ssa-pre.c  (revision 182784) > +++ gcc/tree-ssa-pre.c  (working copy) > @@ -4335,16 +4335,23 @@ eliminate (void)

Re: RFC: Handle conditional expression in sccvn/fre/pre

2012-01-02 Thread Amker.Cheng
Thanks Richard, On Mon, Jan 2, 2012 at 8:33 PM, Richard Guenther wrote: > > I've previously worked on changing GIMPLE_COND to no longer embed > the comparison but carry a predicate SSA_NAME only (this is effectively > what you do as pre-processing before SCCVN).  It had some non-trivial > fallout

RFC: Handle conditional expression in sccvn/fre/pre

2012-01-02 Thread Amker.Cheng
Hi, Since SCCVN operates on SSA graph instead of the control flow graph for the sake of efficiency, it does not handle or value number the conditional expression of GIMPLE_COND statement. As a result, FRE/PRE does not simplify conditional expression, as reported in bug 30997. Since it would be com

Re: question on behavior of tree-ssa-ccp

2011-12-15 Thread Amker.Cheng
Forgot the command line: arm-none-eabi-gcc -O2 -mthumb -mcpu=cortex-m3 -S test.c -o test.S -fdump-tree-all gcc is configured as arm-none-eabi, but I think it's independent of target. -- Best Regards.

question on behavior of tree-ssa-ccp

2011-12-15 Thread Amker.Cheng
HI, I encountered a case with below codes: int data_0; int motion_test1(int data, int v) { int i; int t, u; int x; if (data) i = data_0 + x; else { v = 2; i = 5; } t = data_0 + x; u = i;

Re: At which pass thing goes wrong for PR43491?

2011-12-06 Thread Amker.Cheng
On Thu, Dec 1, 2011 at 11:45 PM, Richard Guenther wrote: > Well, it's not that easy if you still want to properly do redundant expression > removal on global registers. Yes, it might be complicated to make PRE fully aware of global registers. I also found comments in is_gimple_reg which says gcc d

Re: At which pass thing goes wrong for PR43491?

2011-12-01 Thread Amker.Cheng
On Sat, Nov 26, 2011 at 3:41 PM, Amker.Cheng wrote: > Hi, > I looked into PR43491 a while and found in this case the gimple > generated before pre > is like: > > reg.0_12 = reg > ... > c() > reg.0_1 = reg > D.xxx = MEM[reg.0_1 + 8B] > > The pre pass tr

At which pass thing goes wrong for PR43491?

2011-11-25 Thread Amker.Cheng
Hi, I looked into PR43491 a while and found in this case the gimple generated before pre is like: reg.0_12 = reg ... c() reg.0_1 = reg D.xxx = MEM[reg.0_1 + 8B] The pre pass transforms it into: reg.0_12 = reg ... c() reg.0_1 = reg.0_12 D.xxx = MEM[reg.0_1 + 8B] From now on, following passes(li

Re: missing conditional propagation in cprop.c pass

2011-10-10 Thread Amker.Cheng
Hi Jeff, Steven, I have filed a bug at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50663 Could somebody confirm it? I am studying this piece of codes and have spent some time on it, I'm working on a patch and hoping could help on this issue, Please help me review it later. Thanks. -- Best Regar

Re: missing conditional propagation in cprop.c pass

2011-09-29 Thread Amker.Cheng
>> >> I believe, the optimization you may be referring to is value range >> propagation which does predication of values based on predicates of >> conditions. GCC definitely applies VRP at the tree stage, I am not >> sure if there is an RTL pass to do the same. > There are also RTL optimizers which

Re: missing conditional propagation in cprop.c pass

2011-09-29 Thread Amker.Cheng
> > Nobody mentioned this so I might be way off but cc doesn't get (minus > (reg r684) (const_int 0)). It gets the `condition codes` modification as > a consequence of the subtraction. > Hi Paulo, According to section "comparison operations" in internal: "The comparison operators may be used to co

Re: missing conditional propagation in cprop.c pass

2011-09-29 Thread Amker.Cheng
> Unless there's something arch specific related to arm, insn 882 is a > compare, which won't change r684. Why do you think 0 should > propagated to r291 if r684 is not zero? > Thanks for replying. Sorry if I misunderstood anything below, and please correct me. insn 882 : cc <- compare (

Re: missing conditional propagation in cprop.c pass

2011-09-29 Thread Amker.Cheng
On Tue, Sep 27, 2011 at 4:19 PM, Amker.Cheng wrote: > Hi, > I ran into a case and found conditional (const) propagation is > mishandled in cprop pass. > With following insn sequence after cprop1 pass: > > (note 878

missing conditional propagation in cprop.c pass

2011-09-27 Thread Amker.Cheng
Hi, I ran into a case and found conditional (const) propagation is mishandled in cprop pass. With following insn sequence after cprop1 pass: (note 878 877 880 96 [bb 96] NOTE_INSN_BASIC_BLOCK) (insn 882 881 883 96 (set (reg:CC 24 cc) (co

Re: Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++

2011-09-23 Thread Amker.Cheng
> (Any reason this wasn't sent to the libstdc++ list?) > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43852 proposes a "quiet > mode" which would reduce code size by disabling some of the code in > eh_term_handler.cc and pure.cc - would that do what you want? > > I've not had time to do anything a

Question on _GLIBCXX_HOSTED macro libstdc++ and libsupc++

2011-09-23 Thread Amker.Cheng
Hi, In libstdc++-v3/libsupc++/eh_term_handler.cc, it says by default the demangler things are pulled in, according to whether _GLIBCXX_HOSTED is defined. the demangler exception terminating handler are really big, especially for embedded system. Secondly, _GLIBCXX_HOSTED is now defined if --enabl

CFLAGS used in libgcc makefile?

2011-09-13 Thread Amker.Cheng
Hi guys, Is it CFLAGS used by libgcc/Makefile.in to build libgcc.a? It seems if I configure gcc with CFLAGS="-O0 -g " environment variable, libgcc is also compiled with -O0 option. I'm wondering why do not use CFLAGS_FOR_TARGET here(CFLAGS->INTERNAL_CFLAGS->gcc_compile_bare->gcc_compile). Please h

question on find_if_case_2 in ifcvt.c

2011-09-08 Thread Amker.Cheng
Hi, In ifcvt.c's function find_if_case_2, it uses cheap_bb_rtx_cost_p to judge the conversion. Function cheap_bb_rtx_cost_p checks whether the total insn_rtx_cost on non-jump insns in basic block BB is less than MAX_COST. So the question is why uses cheap_bb_rtx_cost_p, even when we know the ELSE

question on arm soft-fp function __aeabi_d2uiz

2011-05-07 Thread Amker.Cheng
Hi, I found in gcc/config/arm/ieee754-df.S, the function __aeabi_d2uiz converts double into unsigned integer and the function always return 0 if the double value is negative. for example the following codes: ---sample codes-- unsigned long ul; double d = -1.1
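For comparison with the behaviour described above, here is a minimal C sketch of a double-to-unsigned conversion that returns 0 for negative inputs. It is not the libgcc assembly routine: the name `double_to_uint_clamped`, the clamp to `UINT_MAX` for large values, and the treatment of NaN as 0 are all assumptions made for illustration. In ISO C a plain cast of a value such as -1.1 to `unsigned int` has undefined behaviour, which is why the sketch checks the range first.

```c
#include <limits.h>

/* Hypothetical helper: convert a double to unsigned int with explicit
   range checks instead of relying on an out-of-range cast.  */
unsigned int double_to_uint_clamped(double d)
{
    /* Negative values, zero and NaN all fail this test and yield 0.  */
    if (!(d > 0.0))
        return 0u;

    /* Assumed policy: values at or above UINT_MAX + 1 clamp to UINT_MAX.  */
    if (d >= (double)UINT_MAX + 1.0)
        return UINT_MAX;

    /* In range: the cast truncates toward zero.  */
    return (unsigned int)d;
}
```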

Re: question on ssa representation of aggregates

2010-10-22 Thread Amker.Cheng
> The implementation of this stuff changes fairly regularly.  The people > who like this kind of thing are still honing in on the best way to > handle aliasing information.  Richard Guenther is the main guy working > in this area today. thanks very much for clarification. -- Best Regards.

question on ssa representation of aggregates

2010-10-22 Thread Amker.Cheng
Hi : In paper "Memory SSA-A Unified Approach for Sparsely Representing Memory Operations", section 2.2, it says : "Whenever possible, compiler will create symbolic names to represent distinct regions inside aggregates(called structure field tags or SFT). For instance, in Figure 2(b), GCC will c

Re: question on points-to analysis

2010-09-11 Thread Amker.Cheng
> In theory, this is true, but a lot of the optimizations decrease > accuracy at a cost of making the problem solvable in a reasonable > amount of time. > By performing it after building initial points-to sets, the amount of > accuracy loss is incredibly small. > The only type of constraint that wi

question on points-to analysis

2010-09-09 Thread Amker.Cheng
Hi, I am studying gcc's points-to analysis right now and encountered a question. In paper "Off-line Variable Substitution for Scaling Points-to Analysis", section 3.2, it says that we should not substitute a variable with another if its address is taken. But in GCC's implementation, it unites pointer but

A minor mistake in cse_main?

2010-08-17 Thread Amker.Cheng
Hi : In function cse_main, gcc processes ebb path by path. firstly, gcc finds the first bb of path in the reverse post order queue, plus if the bb is still not visited. then gcc finds all paths starting with that first bb. the corresponding code is like: do { bb = BASIC_

why are multiply-accumulate insns not used when -mfp32 on mips

2010-07-20 Thread Amker.Cheng
HI: found mult-acc insns like madd.s/d are only used when -mfp64 is specified, as to codes, there macros defined as: #define ISA_HAS_FP4 ((ISA_MIPS4 \ || (ISA_MIPS32R2 && TARGET_FLOAT64) \ <--only float 64

question about float insns like ceil/floor on mips machine

2010-07-19 Thread Amker.Cheng
Hi: I found although there are standard pattern names such as "ceilm2/floorm2", there is no insn pattern in mips.md for such float insns on mips target. further more, there is no ceil/floor rtl code in rtl.def either. based on these facts, I assuming those float insns are not supported by gcc, b

Re: GCC4.3.4 downside against GCC3.4.4 on mips?

2010-07-11 Thread Amker.Cheng
>>> >>> while GCC3.4.4 treats the long long multiplication just like simple >>> ones, which generates only one >>> mult insn for each statement, like >>> >>> In my understanding, It's not necessary using three mult insn to implement >>> long long mult, since the operands are converted from int type

question on function change_loop in IRA

2010-06-22 Thread Amker.Cheng
Hi: At last of function change_loop, gcc try to change ALLOCNO_REG of local allocno. In the loop, ALLOCNO_SOMEWHERE_RENAMED_P (allocno) is set if allocno is not caps. Don't understand why the flag is set here. Doesn't all local allocnos' flag are set in this loop? seems conflicting with function

Re: subreg against register allocation?

2010-06-14 Thread Amker.Cheng
Thanks for explanation. here are three more questions 1 , If I am talking the right thing, there are two insns like "*mulsi3_1" and "*smulsi3_highpart_insn", which set two parts of DImode pseudo regs of DImode mult. Since both parts of result are used in the original example,

subreg against register allocation?

2010-06-14 Thread Amker.Cheng
Hi : I am studying IRA right now (GCC4.4.1,mips32 target), for following piece of code: long long func(int a, int b) { long long r = (long long)a * (long long)b; return r; } the asm generated on mips is like: mult $5,$4 mfhi $5 mflo $2 j
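As a reading aid for the snippet above, here is a minimal C sketch of what a widening multiply followed by reads of the high and low words computes. The helper names `widening_mul`, `hi_word` and `lo_word` are illustrative only and do not come from GCC or from the thread.

```c
#include <stdint.h>

/* 64-bit product of two 32-bit signed operands, i.e. the value a single
   widening multiply leaves split across a HI/LO register pair.  */
int64_t widening_mul(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Upper 32 bits of the product (the half an mfhi-style read returns).  */
uint32_t hi_word(int64_t product)
{
    return (uint32_t)((uint64_t)product >> 32);
}

/* Lower 32 bits of the product (the half an mflo-style read returns).  */
uint32_t lo_word(int64_t product)
{
    return (uint32_t)(uint64_t)product;
}
```

Because both operands start out as `int`, one widening multiply is enough to form the full `long long` result; no cross terms are needed.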

Re: a typo in ira-emit.c?

2010-06-09 Thread Amker.Cheng
> > Yes, I think it can be NULL in some complicated cases when a loop exit edge > comes not in the parent loop. By that, you mean the case an regno lives on edges which transfer between adjacent loops, and not lives in parent loop? So, the fprintf would access null pointer in this case. Thanks for

a typo in ira-emit.c?

2010-06-09 Thread Amker.Cheng
Hi : I am studying ira right now, there is following code in change_loop if (parent_allocno == NULL || REGNO (ALLOCNO_REG (parent_allocno)) == REGNO (original_reg)) { if (internal_flag_ira_verbose > 3 && ira_dump_file) fprintf (ira_

Re: Puzzle about macro MIPS_PROLOGUE_TEMP_REGNUM

2010-06-06 Thread Amker.Cheng
> > It's not "starting from $3".  It's $3 and nothing else ;-)  It's not > intended to be used as (MIPS_PROLOGUE_TEMP_REGNUM + N). > > $3 was chosen because it's a MIPS16 register, and can therefore > be used for both MIPS16 and normal-mode code.  $2 used to be the > static chain register, which le

Puzzle about macro MIPS_PROLOGUE_TEMP_REGNUM

2010-06-04 Thread Amker.Cheng
Hi : I found the temp register used for saving registers when expanding prologue is defined by macro MIPS_PROLOGUE_TEMP_REGNUM on mips target, like: #define MIPS_PROLOGUE_TEMP_REGNUM \ (cfun->machine->interrupt_handler_p ? K0_REG_NUM : GP_REG_FIRST + 3) I don't understand why using registers

Re: GCC4.3.4 downside against GCC3.4.4 on mips?

2010-05-27 Thread Amker.Cheng
> Posting some random numbers without a test-case and precise command line > parameters for both compilers makes the numbers useless, IMHO. You also > only mention instruction counts. Have you actually benchmarked the > resulting code? CPUs are complicated and what you might perceive as worse > cod

GCC4.3.4 downside against GCC3.4.4 on mips?

2010-05-25 Thread Amker.Cheng
Hi all, I compared assembly files of a function compiled by GCC4.3.4 and GCC3.4.4. The function focuses on array computation and has no branch, or any loop structure, The command line is like "-march=mips32r2 -O3", and here is the instruction statistics: total: 1879 : 1534

mips secondary reload question

2010-05-12 Thread Amker.Cheng
Hi: as to page http://gcc.gnu.org/ml/gcc/2010-05/msg00091.html, If the fpu register can not copied to/from memory directly, I have to use intermediate GPR registers. In fact, I return GP_REGS if copying x to a register in class FP_REGS in any mode(including CCmode), this results in infinite recu

Is it safe to use $t0 when handling call clobbered registers (on MIPS)

2010-05-10 Thread Amker.Cheng
Hi : I'm working on a fpu which cannot work fpload insns right, so I have to use a GPR reg as temp reg to first load mem into GPR then move GPR into fpu register. I have handled most cases but the case gcc handling call clobbered fpu registers. since it is in reload pass, I have no available GPR

Re: a peculiar fpload problem on an inferior processor

2010-05-10 Thread Amker.Cheng
On Sat, May 8, 2010 at 2:52 PM, Amker.Cheng wrote: >>  Ah, I forgot pro/epilogue generation, but I think that's the only other >> thing that happens after reload.  That is a special case: it has to generate >> strict rtl that directly matches the insns it wants.  

Re: a peculiar fpload problem on an inferior processor

2010-05-07 Thread Amker.Cheng
>  Ah, I forgot pro/epilogue generation, but I think that's the only other > thing that happens after reload.  That is a special case: it has to generate > strict rtl that directly matches the insns it wants.  You'll probably have to > arrange for it to save at least one GPR early enough in the pro

Re: a peculiar fpload problem on an inferior processor

2010-05-07 Thread Amker.Cheng
>  It is possible.  Your expander can handle it before reload; to handle it > during and after reload, you need to implement a TARGET_SECONDARY_RELOAD hook. > > http://gcc.gnu.org/onlinedocs/gccint/Register-Classes.html#index-TARGET_005fSECONDARY_005fRELOAD-3974 > Thanks Dave, It works, but I found

a peculiar fpload problem on an inferior processor

2010-05-06 Thread Amker.Cheng
Hi : Our processor has an errata that the direct fpu load cannot work right, so I have to substitute instruction sequence "load_into_gpr ; move_gpr_into_fpr" for direct fpload insn. Currently I thought of two potential methods as following: method 1: step1 : keep a scratch register when e

Re: split lui_movf pattern on mips?

2010-05-03 Thread Amker.Cheng
> It's the encoding of 1.0f (single precision).  The point is that we want > something we can safely compare with 0.0f using floating-point instructions. > "Safe" means "without generating any kind of exception", so a subnormal > representation like 0x0001 isn't acceptable.  1.0f seems as good

split lui_movf pattern on mips?

2010-04-29 Thread Amker.Cheng
HI: There is comment on lui_movf in mips.md like following, ;; because we don't split it. FIXME: we should split instead. I can split it into a move and a condmove(movesi_on_cc) insns , like (define_split [(set (match_operand:CC 0 "d_operand" "") (match_operand:CC 1 "fcc_reload_opera

Re: pattern "s_" not used when generating rtl for float comparison on mips?

2010-04-29 Thread Amker.Cheng
> Indeed, looking at GCC 4.5 there's no cstore expander for floating-point > variables.  Maybe you can make a patch! :-) > yes, it seems gcc always generates set/compare/jump/set sequence, then optimizes it out in if-convert pass. Maybe it was left behind by early mips1, which has no conditional mo

Re: pattern "s_" not used when generating rtl for float comparison on mips?

2010-04-27 Thread Amker.Cheng
> > You can get the RTL for these patterns when expanding stores like > >   a = (b < c); > > In this case, GCC tries to avoid a conditional branch and (I suppose you are > on GCC <4.5) instead of cmp and b you go through cmp and > s.  cmp does nothing but stashing away its operands, while > s expan

pattern "s_" not used when generating rtl for float comparison on mips?

2010-04-27 Thread Amker.Cheng
Hi : There is a pattern "define_insn "s_"" in mips md file, like (define_insn "s_" [(set (match_operand:CC 0 "register_operand" "=z") (swapped_fcond:CC (match_operand:SCALARF 1 "register_operand" "f") (match_operand:SCALARF 2 "register_operand" "f")))] "" "c

Re: why mult generated for unsigned int multiply on mips?

2010-04-08 Thread Amker.Cheng
> It would, however, be nice if you actually posted an answer to your > (now solved) question. That way, any casual reader may learn something > new. > Sorry for the unintentional offense, here comes the method: for 2's complement binary number x31x30...x0, unsigned value U = 2^(31)*x31 + 2^(30)*x3
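The two's-complement argument sketched above can be checked with a small C example: the low 32 bits of a product are the same whether the operands are interpreted as signed or unsigned, so only the high word depends on signedness. The function names are hypothetical, and the conversion of an out-of-range `uint32_t` to `int32_t` is assumed to wrap, which is what GCC documents for its targets.

```c
#include <stdint.h>

/* Low 32 bits of the product with both operands treated as unsigned.  */
uint32_t mul_low_unsigned(uint32_t y, uint32_t z)
{
    return (uint32_t)((uint64_t)y * (uint64_t)z);
}

/* Low 32 bits of the product with the same bit patterns reinterpreted
   as signed two's-complement values.  */
uint32_t mul_low_signed(uint32_t y, uint32_t z)
{
    int64_t product = (int64_t)(int32_t)y * (int64_t)(int32_t)z;
    return (uint32_t)product;   /* reduction modulo 2^32 */
}
```

For any pair of inputs the two functions return the same value, which is consistent with a signed multiply instruction being usable when only the low word of the result is kept.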

Re: why mult generated for unsigned int multiply on mips?

2010-04-07 Thread Amker.Cheng
found the cause, sorry to disturb, please ignore this message. -- Best Regards.

why mult generated for unsigned int multiply on mips?

2010-04-06 Thread Amker.Cheng
Hi : I noticed that on mips, the signed form instruction of multiply is generated for unsigned integer multiply operation. for example, mult is used, rather than multu for following codes: unsigned int x, y, z; x = y * z; Is it reasonable to do so? Thanks. -- Best Regards.

Problem on handling fall-through edge in bb-reorder

2010-04-05 Thread Amker.Cheng
Hi All: I read codes in bb-reorder pass. normally it's fine to take the most probable basic block as the downward bb. unfortunately, the processor I'm working on is a little different. It has no pipeline stall when branches are taken, but does introduce stall when they are not taken. take an exa

Re: Puzzle about CFG on rtl during delay slot schedule

2010-04-03 Thread Amker.Cheng
> Cheng, can you explain what lead you to this "discovery", and what > you're trying to achieve? Thanks for all your enthusiastic explanation. Well, we are now trying to find our processor's critical timing path by running it at higher frequency than it was designed for. One timing prob we found i

Fwd: Puzzle about CFG on rtl during delay slot schedule

2010-04-02 Thread Amker.Cheng
> The CFG is not maintained during delay slot scheduling. This is, in > fact, a very old and well-known problem. Look for any e-mail on this > list that mentions reorg.c :-) > Thanks, further more , It seems cfg are not maintained after delay slot scheduling. also find that problem just before fina

Puzzle about CFG on rtl during delay slot schedule

2010-04-02 Thread Amker.Cheng
Hi : I'm wondering whether cfg is maintained properly during delay slot scheduling, Because when compiling libgcc/_divsc3.o, rtl dump in libgcc2.c.198r.mach has following lines: no bb for insn with uid = 293. deleting insn with uid = 690. deleting insn with uid = 904. .. (note 298 905 303

Re: Question on mips multiply patterns in md file

2010-03-18 Thread Amker.Cheng
> The reasoning here is > that if splitting will result in worse code, then we shouldn't have > accepted it in the first place.  If dropping this alternative results in > register allocator failures for some strange reason, then we accept it > and generate the 3 instruction sequence with a new defi

Re: Question on mips multiply patterns in md file

2010-03-16 Thread Amker.Cheng
> If you don't know anything about register class preferencing or reload as > yet, then this is probably not going to make much sense to you, but it isn't > anything important you need to worry about at this point.  It is a very > minor performance optimization. > It makes sense to me now, though I

Question on mips multiply patterns in md file

2010-03-15 Thread Amker.Cheng
Hi : I am studying multiplication-accumulate patterns for mips and noticed there are some changes when IRA was merged. There are two pattern which confused me, as : 1: In pattern "*mul_acc_si", there's constraint like "*?*?". what does this supposed to do? I could not connect "*?" with docu

Puzzle about mips pipeline description

2010-03-08 Thread Amker.Cheng
Hi All: In gcc internal, section 16.19.8, there is a rule about "define_insn_reservation" like: "`condition` defines what RTL insns are described by this construction. You should remember that you will be in trouble if `condition` for two or more different `define_insn_reservation` constructor

question about replace_in_call_usage in regmove.c

2010-01-01 Thread Amker.Cheng
Hi : In regmove.c there is function "replace_in_call_usage" called in fixup_match_1, It replaces dst register by src in call_insn, I suspect whether it is necessary Since comment of CALL_INSN_FUNCTION_USAGE says that no pseudo register can appear in it and seems src is pseudo register. further m

Re: Question about filling multi delay slots

2009-12-01 Thread Amker.Cheng
On Tue, Dec 1, 2009 at 5:31 AM, Jeff Law wrote: > On 11/25/09 07:34, Amker.Cheng wrote: > > First, it's worth noting very few targets support multiple delay slots and > as a result that code isn't tested nearly as well as handling of single > delay slots. > > I'

Question about filling multi delay slots

2009-11-25 Thread Amker.Cheng
Hi All : It's possible to define multi delay slots for branch insns by using define_delay, and different slot should satisfy its own attribute test "delay-n". Here comes question, in function "fill_simple_delay_slots", seems it only uses slots_filled to record how many slots needs to fill, a

mis-set value for trial in function fill_simple_delay_slots?

2009-11-22 Thread Amker.Cheng
Hi : In function fill_simple_delay_slots, there is following codes: >starts here /* If there are slots left to fill and our search was stopped by an unconditional branch, try the insn at the branch target. We can redirect the bra

Puzzles about implementation of bb-reorder pass

2009-10-28 Thread Amker.Cheng
Hi : The bb-reorder pass is relative simple comparing with others, but still I got following puzzles. 1 : the comment at top of the bb-reorder.c file says that : There are two parameters: Branch Threshold and Exec Threshold. If the edge to a successor of the actual basic block is low

Re: Problem when computing memory dependencies for scheduling pass1

2009-09-28 Thread Amker.Cheng
Thanks Eric Fisher, got the answer, Please ignore this message. -- Best Regards.

Problem when computing memory dependencies for scheduling pass1

2009-09-28 Thread Amker.Cheng
Hi all: I have found something strange when scheduling instructions. considering following piece of code: -c start int func(float x) { int r = 0; r = (*(unsigned int*)&x) >> 23; return r; } -c e
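The cast in the snippet reads a `float` object through an `unsigned int *`, which the C aliasing rules do not permit, so a compiler that relies on those rules may treat the store to `x` and the load through the cast pointer as unrelated. The excerpt does not say what the actual answer in the thread was; the following is only a sketch of the usual portable alternative, assuming a 32-bit `float`, with the hypothetical name `float_bits_shifted`.

```c
#include <stdint.h>
#include <string.h>

/* The sketch assumes float and uint32_t have the same size.  */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Copy the object representation of x into an integer with memcpy,
   then shift, instead of dereferencing a cast pointer.  */
uint32_t float_bits_shifted(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits >> 23;   /* sign bit and exponent field on IEEE 754 single */
}
```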

Re: what does the calling for min_insn_conflict_delay mean

2009-09-23 Thread Amker.Cheng
On Tue, Sep 22, 2009 at 11:50 PM, Vladimir Makarov wrote: > Ian Lance Taylor wrote: >> >> "Amker.Cheng" writes: >> >> >>> >>>   In function new_ready, it calls to min_insn_conflict_delay with >>> "min_insn_conflict_delay (

what does the calling for min_insn_conflict_delay mean

2009-09-20 Thread Amker.Cheng
Hi : In function new_ready, it calls to min_insn_conflict_delay with "min_insn_conflict_delay (curr_state, next, next)". But the function's comments say that it returns minimal delay of issue of the 2nd insn after issuing the 1st in given state. Why the last two parameter for the call are both "

Re: question about speculative scheduling in gcc

2009-09-20 Thread Amker.Cheng
On Sun, Sep 20, 2009 at 3:43 PM, Maxim Kuvyrkov wrote: > Amker.Cheng wrote: >> >> Hi : >> I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4 >> version. >> >> First, I noticed the document describing IBM haifa instruction >> sc

question about speculative scheduling in gcc

2009-09-19 Thread Amker.Cheng
Hi : I'm puzzled when looking into speculative scheduling in gcc, the 4.2.4 version. First, I noticed the document describing IBM haifa instruction scheduler(as PowerPC Reference Compiler Optimization Project). It presents that the instruction motion from bb s(dominated by t) to t is speculative

Re: Is Non-Blocking cache supported in GCC?

2009-09-18 Thread Amker.Cheng
On Sat, Sep 19, 2009 at 1:17 AM, Janis Johnson wrote: > On Thu, 2009-09-17 at 21:48 -0700, Ian Lance Taylor wrote: > > There's also a prefetch built-in function; see > > http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#Other-Builtins > > It's been in GCC since 3.1. > > Janis > > Thank you all
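A minimal sketch of how the prefetch built-in mentioned above can be used in a loop. `__builtin_prefetch` is the documented GCC built-in; the function name `sum_with_prefetch` and the distance of 16 elements are assumptions chosen for illustration, and a useful distance depends on the target's memory latency.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch.  */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that a later element will be read soon.  */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Second argument 0 = prefetch for read; third argument 1 = low
           temporal locality.  Both must be compile-time constants.  */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 1);
        total += data[i];
    }
    return total;
}
```

The built-in is only a hint: on targets without a prefetch instruction GCC emits nothing for it, and the result of the loop is the same either way.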

Is Non-Blocking cache supported in GCC?

2009-09-17 Thread Amker.Cheng
Hi all: Recently I found two relative old papers about non-blocking cache, etc. which are : 1) Reducing memory latency via non-blocking and prefetching caches. BY Tien-Fu Chen and Jean-Loup Baer. 2) Data Prefetching:A Cost/Performance Analysis BY Chris Metcalf It seems the

Question about the difference between two instruction scheduling passes

2009-08-19 Thread Amker.Cheng
Hi all: I'm currently studying implementation of instruction sched in gcc. it is possible to schedule insns directly from queue in case there is nothing better to do and there are still vacant dispatch slots in the current cycle. Gcc only does this work in the second pass, but what's the point

Re: Help: does define_peephole still work in gcc-4.2.4

2009-05-11 Thread Amker.Cheng
It turns out there is a mistake in "out-template" of "define_peephole". So, Sorry for disturbing! -- Best Regards.

Help: does define_peephole still work in gcc-4.2.4

2009-05-11 Thread Amker.Cheng
Hi all: Currently I am studying peephole optimization in gcc. I defined a peephole using "define_peephole", but nothing happened. It seems gcc does do the pattern match work in codes surrounded by "HAVE_peephole", but codes from "out-template" in that "define_peephole" are not compiled into gc

Puzzle:where does gcc_cv_as come from?

2009-03-02 Thread Amker.Cheng
Hi all: Currently I'm building cross gcc for mips32 on winXp+cygwin. I tried both gcc 4.2.4 and 4.2.3 and there is a building problem with 4.2.4 gcc makefile normally issue shell command "echo 'exec $(ORIGINAL_AS_FOR_TARGET) "$$@"' >> as ; \" at around line 1370, but ORIGINAL_AS_FOR_TARGET defi