Can the size of pointers to data and text be different?
I am using GCC 4.3.2. On our microcontroller, the move instruction (mov reg, imm) accepts either a 16-bit or a 32-bit immediate operand. Data memory is smaller than 64 KB, but code memory is larger than 64 KB. The immediate operand may be the address of a variable in a data section or a function pointer. Variable addresses fit in 16 bits, but function pointers may need more than 16 bits.

I'd like to use "mov reg, imm16" for variable addresses and "mov reg, imm32" for function pointers, so that the code is a little smaller. Put another way, pointers to data and pointers to text need to have different sizes.

How can I select the appropriate mov for each? I tried to use LABEL_REF and SYMBOL_REF to distinguish between them, but it didn't help: it seems that function pointers are treated as symbols too.

Are there any other cases where references to functions in text sections are used in data sections?

Thanks.
-Qifei Fan
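To make the selection rule concrete, here is a minimal standalone C model of it. It is not GCC code: struct sym_ref, select_mov_form and imm_bytes are hypothetical names invented for illustration. In a real port the test would be made on the SYMBOL_REF rtx; rtl.h has a SYMBOL_REF_FUNCTION_P flag test that looks like a candidate, but whether it fits this port is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a symbol reference; not a GCC type. */
struct sym_ref {
    const char *name;
    bool is_function;   /* true for symbols that live in the text section */
    uint32_t address;   /* only used to check the 16-bit assumption */
};

enum mov_form { MOV_IMM16, MOV_IMM32 };

/* Data addresses fit in 16 bits on the target described above, so they
   get the short form; function addresses always get the wide form. */
enum mov_form select_mov_form(const struct sym_ref *s)
{
    assert(s != NULL);
    if (s->is_function)
        return MOV_IMM32;
    assert(s->address <= 0xFFFFu);   /* data memory is below 64 KB */
    return MOV_IMM16;
}

/* Size in bytes of the immediate field for each form. */
unsigned imm_bytes(enum mov_form f)
{
    return f == MOV_IMM16 ? 2u : 4u;
}
```

Each data address loaded this way saves two bytes of immediate compared with the wide form.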
Re: Randomization in gcc generates different assembly file
>> Which file or function should I look into? Maybe I can work around in 4.3.2
>
> Look into tree-ssa-alias.c and tree-ssa-structalias.c
>
>> What change in 4.5 fixed it?
>
> A complete rewrite of the above ...
>
> Richard.

So is there an easy way to work around this in 4.3.2 and disable the randomization? I am not familiar with tree SSA.
Hope there is. :(

--
-Qifei Fan
Re: Randomization in gcc generates different assembly file
On Tue, May 10, 2011 at 6:23 PM, Richard Guenther wrote:
> On Tue, May 10, 2011 at 12:08 PM, fanqifei wrote:
>> Hi all,
>>
>> I am porting gcc 4.3.2 for a micro-controller and use it to compile C
>> source code.
>> I found that gcc is very sensitive to small changes in C source code
>> even if the change doesn't affect any function of the source code.
>> For example, a source file foo.c includes a header file foo.h.
>> If one macro definition is added to foo.h and the macro is not used in
>> foo.c. The assembly file foo.s is still changed and a few instructions
>> swapped positions.
>> I checked foo.c.129t.final_cleanup and found difference in it.
>>
>> I am wondering what caused the change in foo.o. Is there any
>> randomization in gcc?
>> How can I make the assembly file foo.s same no matter foo.h is changed or
>> not?
>
> alias analysis behavior depends on DECL_UIDs for partitioning which
> unfortunately shows this behavior in older releases (fixed as of GCC 4.5
> at least).
>
> Richard.

Which file or function should I look into? Maybe I can work around it in 4.3.2.
What change in 4.5 fixed it?

Thanks.
-Qifei Fan
Randomization in gcc generates different assembly file
Hi all,

I am porting GCC 4.3.2 to a micro-controller and using it to compile C source code.
I found that GCC is very sensitive to small changes in the C source, even when the change does not affect any function in the source.
For example, a source file foo.c includes a header file foo.h. If one macro definition is added to foo.h and the macro is not used in foo.c, the assembly file foo.s still changes: a few instructions swap positions.
I checked foo.c.129t.final_cleanup and found differences in it as well.

I am wondering what caused the change in foo.o. Is there any randomization in GCC?
How can I make the assembly file foo.s the same whether or not foo.h is changed?

Thanks
-Qifei Fan
Re: question about lshiftrt:DI when there are no 64bits in the processor
Thank you, Georg and Ian.
I misunderstood section 16.2 of the GCC internals manual and thought that a nameless insn (with *) in the .md file could only be used during rtl-->asm.
The generated code is correct now.
Thanks again!

--
-Qifei Fan

On Wed, Sep 15, 2010 at 9:27 PM, Georg Lay wrote:
> fanqifei wrote:
>> Hi all,
>>
>> I am porting gcc to a microprocessor. There are no 64bits instructions
>> in it. I added a small logical shift right optimization to the md
>> file(see below).
>> For the statement “k>>32” in which k is 64bits integer, the
>> “define_expand” should fail because op2 is 32, not 1.
>> However, I can see the lshiftrt:DI is still generated in rtl dumps
>> even if I commented out the “define_expand”. If both “define_expand”
>> and “define_insn” are commented out, the result is correct.
>>
>> .md file:
>>
>> ;; Special case for long x>>1, which can be expanded
>> ;; using the carry bit shift-in instructions. x<<1 is already
>> ;; expanded by the compiler into x+x, so no rules for long leftshift
>> ;; necessary.
>> ;;
>>
>> (define_expand "lshrdi3"
>>   [(set (match_operand:DI 0 "register_operand" )
>>         (lshiftrt:DI (match_operand:DI 1 "register_operand")
>>                      (match_operand:QI 2 "immediate_operand")))]
>>   ""
>> {
>>   if ( GET_CODE(operands[2]) != CONST_INT ) { FAIL; }
>>   if ( INTVAL(operands[2]) != 1 ) { FAIL; }
>> })
>>
>> (define_insn "*lshrdi3S1"
>>   [(set (match_operand:DI 0 "register_operand" "=r")
>>         (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
>>                      (match_operand:QI 2 "immediate_operand" "i")))]
>>   ""
>>   "lsr.w %H0 %H1 1;\;lsrc.w %M0 %M1 1;"
>>   [(set_attr "cc" "clobber")])
> [...]
>> Why the instructions (47-51) are replaced by lshiftrt:DI when there is
>> no lshrdi3 insn defined in md file?
>
> You actually /have/ defined an insn that allows lshiftrt:DI patterns for any
> constant (even constants that are not known at compile time), namely your
> "*lshrdi3S1" insn. Note that many passes construct new RTL out of RTL already
> generated and try to match these constructs against some insns, most notably
> pass insn-combine, insn splitters. If there is no match nothing happens. If
> there is a match (and costs are lower etc.) the old pattern gets replaced by
> the new one. In your case that means that you have to disallow anything that
> has op2 not equal to 1. So you could rewrite the insn in question to
>
> (define_insn "*lshrdi3S1"
>   [(set (match_operand:DI 0 "register_operand" "=r")
>         (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
>                      (const_int 1)))]
>   ""
>   ...)
>
> or
>
> (define_insn "*lshrdi3S1"
>   [(set (match_operand:DI 0 "register_operand" "=r")
>         (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
>                      (match_operand:QI 2 "const_int_operand" "n")))]
>   "operands[2] == const1_rtx"
>   ...)
>
> or any formulation you prefer.
>
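For readers wondering why the insn must be restricted to a shift count of exactly 1, here is a minimal C model of what the two-instruction template computes. It assumes that lsr.w shifts the high word right and leaves the shifted-out bit in the carry flag, and that lsrc.w shifts the low word right with the carry shifted in at the top; that reading is inferred from the comment in the .md file, not from a datasheet. The names struct halves, lshr64_by_1, split and join are made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, like a DI register pair. */
struct halves { uint32_t hi, lo; };

struct halves split(uint64_t v)
{
    struct halves h = { (uint32_t)(v >> 32), (uint32_t)v };
    return h;
}

uint64_t join(struct halves h)
{
    return ((uint64_t)h.hi << 32) | h.lo;
}

/* Model of "lsr.w hi; lsrc.w lo": shift right by exactly one bit. */
struct halves lshr64_by_1(struct halves x)
{
    uint32_t carry = x.hi & 1u;            /* bit shifted out of the high word */
    struct halves r;
    r.hi = x.hi >> 1;                      /* lsr.w  */
    r.lo = (x.lo >> 1) | (carry << 31);    /* lsrc.w: carry enters at bit 31 */
    return r;
}

/* Compare the model against the host's native 64-bit shift. */
void check_against_native(uint64_t v)
{
    assert(join(lshr64_by_1(split(v))) == (v >> 1));
}
```

The sequence is only correct for a count of 1, which is why an insn that matches any constant silently produces wrong code for k>>32.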
question about lshiftrt:DI when there are no 64bits in the processor
Hi all,

I am porting gcc to a microprocessor. There are no 64-bit instructions in it. I added a small logical shift right optimization to the md file (see below).
For the statement “k>>32”, in which k is a 64-bit integer, the “define_expand” should fail because op2 is 32, not 1.
However, I can see that the lshiftrt:DI is still generated in the rtl dumps even if I comment out the “define_expand”. If both “define_expand” and “define_insn” are commented out, the result is correct.

.md file:

;; Special case for long x>>1, which can be expanded
;; using the carry bit shift-in instructions. x<<1 is already
;; expanded by the compiler into x+x, so no rules for long leftshift
;; necessary.
;;

(define_expand "lshrdi3"
  [(set (match_operand:DI 0 "register_operand" )
        (lshiftrt:DI (match_operand:DI 1 "register_operand")
                     (match_operand:QI 2 "immediate_operand")))]
  ""
{
  if ( GET_CODE(operands[2]) != CONST_INT ) { FAIL; }
  if ( INTVAL(operands[2]) != 1 ) { FAIL; }
})

(define_insn "*lshrdi3S1"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
                     (match_operand:QI 2 "immediate_operand" "i")))]
  ""
  "lsr.w %H0 %H1 1;\;lsrc.w %M0 %M1 1;"
  [(set_attr "cc" "clobber")])

I found that the insn is generated in pass_jump2 (gcc/4.3.2/gcc/cfgcleanup.c).

in p_b.c.137r.into_cfglayout:

;; Start of basic block ( 6) -> 5
;; Pred edge 6 [97.1%]
(code_label 69 44 47 5 57 "" [1 uses])
(note 47 69 50 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 50 47 48 5 p_b.c:491 (clobber (reg:DI 42 [ D.1783 ])) -1 (insn_list:REG_LIBCALL 51 (nil)))
(insn 48 50 49 5 p_b.c:491 (set (subreg:SI (reg:DI 42 [ D.1783 ]) 0)
        (lshiftrt:SI (subreg:SI (reg/v:DI 35 [ k ]) 4)
            (const_int 0 [0x0]))) 54 {lshrsi3} (expr_list:REG_NO_CONFLICT (reg/v:DI 35 [ k ])
        (nil)))
(insn 49 48 51 5 p_b.c:491 (set (subreg:SI (reg:DI 42 [ D.1783 ]) 4)
        (const_int 0 [0x0])) 3 {*mov_mode_insn} (expr_list:REG_NO_CONFLICT (reg/v:DI 35 [ k ])
        (nil)))
(insn 51 49 52 5 p_b.c:491 (set (reg:DI 42 [ D.1783 ])
        (reg:DI 42 [ D.1783 ])) 2 {movdi} (insn_list:REG_RETVAL 50 (expr_list:REG_EQUAL (lshiftrt:DI (reg/v:DI 35 [ k ])
                (const_int 32 [0x20]))
            (nil

but in p_b.c.138r.jump, the insns between 47 and 51 are removed and insn 51 is changed:

;; Start of basic block ( 5) -> 4
;; Pred edge 5 [97.1%]
(code_label 69 44 47 4 57 "" [1 uses])
(note 47 69 51 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 51 47 52 4 p_b.c:491 (set (reg:DI 42 [ D.1783 ])
        (lshiftrt:DI (reg/v:DI 35 [ k ])
            (const_int 32 [0x20]))) 58 {*lshrdi3S1} (nil))

I am wondering what REG_EQUAL is used for. (I have read the gcc internals manual, but still don’t quite understand.)
Why are the instructions (47-51) replaced by lshiftrt:DI when there is no lshrdi3 insn defined in the md file?

Thanks.

--
-Qifei Fan
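The into_cfglayout dump shows what the correct expansion of k>>32 looks like on a 32-bit target: insn 48 copies the word at subreg offset 4 into the word at offset 0 (a shift by 0), and insn 49 clears the word at offset 4. A small C model of that expansion, generalized to counts 32..63, is sketched here. It assumes offset 0 is the low word, which is an assumption about this target's word order; struct words and lshr64_wide are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value as two 32-bit words; lo models subreg offset 0 and
   hi models subreg offset 4 (assumed little-endian word order). */
struct words { uint32_t lo, hi; };

/* Logical right shift by n, 32 <= n <= 63, using only 32-bit operations. */
struct words lshr64_wide(struct words k, unsigned n)
{
    struct words r;
    assert(n >= 32 && n <= 63);
    r.lo = k.hi >> (n - 32);   /* n == 32 is a shift by 0, as in insn 48 */
    r.hi = 0;                  /* as in insn 49 */
    return r;
}
```

After the jump pass, this two-word sequence has been replaced by the single DI-mode shift, which the x>>1 template then implements incorrectly.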
Re: Fwd: constant hoisting out of loops
On Sun, Mar 21, 2010 at 3:43 AM, Jim Wilson wrote:
> On Sun, 2010-03-21 at 03:40 +0800, fanqifei wrote:
>> foor_expand_move is changed and it works now.
>> However, I still don't understand why there was no such error if below
>> condition was used and foor_expand_move was not changed.
>> Both below condition and "(register_operand(operands[0], SImode) ||
>> register_operand(operands[1],SImode)) ..." does not accept mem&&mem.
>
> The define_expand is used for generating RTL. The RTL expander calls
> the define_expand, which checks for MEM&CONST, and then falls through
> generating the mem copy insn.
>
> The define_insn is used for matching RTL. After it has been generated,
> we look at the movsi define_insn, and see that MEM&MEM doesn't match, so
> you get an error for unrecognized RTL.
>
> The define_expand must always match the define_insn(s). They are used
> in different phases, and they aren't checked against each other when gcc
> is built. If there is a mismatch, then you get a run-time error for
> unrecognized rtl.
>
> Jim
>

Many thanks for your explanation.
I will look into the internal reasons why they must match, although that goes beyond my current work and what I know.
Thanks again!

--
-Qifei Fan
http://freshtime.org
Re: Fwd: constant hoisting out of loops
On Sun, Mar 21, 2010 at 2:47 AM, Jim Wilson wrote:
> On Sat, 2010-03-20 at 14:29 +0800, fanqifei wrote:
>> I changed the condition in "*mov_insn_mode" to below:
>> (register_operand(operands[0], SImode) ||
>> register_operand(operands[1],SImode))
>
> I think you need the same change in foor_expand_move. I.e., if neither
> the source or dest is a register, then you force the source into a
> register.
>
> If you still have the mem&const check there, then mem&mem will
> accidentally be accepted and generated.
>
> Jim
>

foor_expand_move is changed and it works now.
However, I still don't understand why there was no such error when the condition below was used and foor_expand_move was not changed.
Neither the condition below nor "(register_operand(operands[0], SImode) || register_operand(operands[1],SImode)) ..." accepts mem&&mem.

"(!(
(memory_operand(operands[0], SImode) && (foor_const_operand_f(operands[1])))
||(memory_operand(operands[0], HImode) && (foor_const_operand_f(operands[1])))
||(memory_operand(operands[0], QImode) && (foor_const_operand_f(operands[1])))
))"

Thanks.

--
-Qifei Fan
http://freshtime.org
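Jim's suggestion can be modelled outside GCC with a few lines of C. This is a toy, not backend code: operands are reduced to three kinds, and expand_old, expand_new and insn_condition are hypothetical stand-ins for the original foor_expand_move, the suggested one, and the new reg||reg insn condition.

```c
#include <assert.h>
#include <stdbool.h>

enum operand_kind { OP_REG, OP_MEM, OP_CONST };

struct move { enum operand_kind dest, src; };

/* Models the insn condition:
   register_operand (operands[0]) || register_operand (operands[1]). */
bool insn_condition(struct move m)
{
    return m.dest == OP_REG || m.src == OP_REG;
}

/* Old expander: only a constant stored to memory is legitimized. */
struct move expand_old(struct move m, int *extra_moves)
{
    *extra_moves = 0;
    if (m.dest == OP_MEM && m.src == OP_CONST) {
        m.src = OP_REG;          /* stands in for force_reg on the source */
        *extra_moves = 1;
    }
    return m;
}

/* Suggested expander: if neither operand is a register, force the
   source into a register, whatever it was. */
struct move expand_new(struct move m, int *extra_moves)
{
    assert(m.dest != OP_CONST);  /* a constant is never a destination */
    *extra_moves = 0;
    if (m.dest != OP_REG && m.src != OP_REG) {
        m.src = OP_REG;
        *extra_moves = 1;
    }
    return m;
}
```

With the old expander a mem-to-mem move passes through unchanged and then fails the insn condition, which corresponds to the "unrecognizable insn" error; with the new one every expanded move satisfies the condition.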
Re: Fwd: constant hoisting out of loops
On Sat, Mar 20, 2010 at 6:24 AM, Jim Wilson wrote:
> On Fri, 2010-03-19 at 22:06 +0800, fanqifei wrote:
>> 1. I add movsi expander in which function foor_expand_move is used to
>> force the operands[1] to reg and emit the move insn.
>> Another change is that in the define of insn "*mov_mode_insn" in which
>> a condition is added to prevent a store const to mem RTL insn from
>> being accepted.
>> Are these changes necessary?
>
> Yes, this looks correct and necessary.
>
>> 2. Is it correct to use emit_move_insn in foor_expand_move?
>> in mips.c, the function mips_emit_move called both emit_move_insn and
>> emit_move_insn_1. But I don’t quite understand the comment above the
>> function.
>
> This looks like the kind of thing you don't need to understand now.
> Just call emit_move_insn, and worry about bizarre details like this
> later. It isn't obvious to me why it is there either.
>
> Before reload, you can create new pseudo-regs at any time if you need to
> load something into a register. After reload you can't.
> emit_move_insn_1 assumes its operands are valid. emit_move_insn checks
> to see if operands are valid and if not tries to fix them. Calling
> emit_move_insn after reload will fail if you have an invalid operand
> that needs to be loaded into a new pseudo. Calling emit_move_insn_1
> with invalid operands will fail a different way. It looks like the mips
> port is trying to do something very tricky and subtle here. If you want
> to understand it, you probably have to find the patch that added it. Or
> find a testcase where it makes a difference.
>
>> 3. My understanding of the internal flow about the issue is:
>> The named insn “movsi” is used to generate RTL insn list from parse
>> tree. The insn pattern “set mem, const” is expanded by function
>> foor_expand_move(). For other forms of “set” insns, the template given
>> in the pattern is inserted. Then the insn "*mov_mode_insn" is used to
>> generate assembler code. In the generation, the condition of
>> mov_mode_insn is checked.
>> I am not fully confident the understanding is correct.
>
> That seems correct. movsi is used for generating RTL. mov_mode_insn is
> used for matching RTL.
>
>> (define_insn "*mov_mode_insn"
>>   [(set
>>     (match_operand:BWD 0 "nonimmediate_operand" "=r,m,r,r,r,r,r,r,x,r")
>>     (match_operand:BWD 1 "foor_move_source_operand"
>>      "Z,r,L,I,Q,P,ni,x,r,r"))]
>>   "(!(
>>   (memory_operand(operands[0], SImode) &&
>>   (foor_const_operand_f(operands[1])))
>>   ||(memory_operand(operands[0], HImode) &&
>>   (foor_const_operand_f(operands[1])))
>>   ||(memory_operand(operands[0], QImode) &&
>>   (foor_const_operand_f(operands[1])))
>
> BWD is presumably a mode macro. You can use <MODE>mode to get the enum
> mode name instead of having 3 copies of the test. Checking to reject
> mem&&const is equivalent to checking to accept reg||reg. The latter
> check is the conventional one and will be faster, as register_operand
> does less work than memory_operand, and short-cut evaluation means we
> only need one register_operand call in the common case. This assumes
> that 'x' is some kind of register which seems likely.
>
>> predicates.md:
>> (define_predicate "foor_const_operand"
>>   (match_test "foor_const_operand_f(op)"))
>
> You don't need the foor_const_operand function. You can just do
> (match_code "const_int")
>
> Jim
>

I changed the condition in "*mov_insn_mode" to the following:

"(
(register_operand(operands[0], SImode) || register_operand(operands[1],SImode))
||(register_operand(operands[0], HImode) || register_operand(operands[1],HImode))
||(register_operand(operands[0], QImode) || register_operand(operands[1],QImode))
)"

Then there is an error during the gcc build:

../.././gcc/tmplibgcc_fp_bit.c: In function '_fpadd_parts':
../.././gcc/tmplibgcc_fp_bit.c:740: error: unrecognizable insn:
(insn 41 40 42 12 ../.././gcc/tmplibgcc_fp_bit.c:637 (set (mem/s:SI (reg/v/f:SI 43 [ tmp ]) [2 S4 A32])
        (mem/s:SI (reg/v/f:SI 41 [ a ]) [2 S4 A32])) -1 (nil))
../.././gcc/tmplibgcc_fp_bit.c:740: internal compiler error: in extract_insn, at recog.c:1990
Please submit a full bug report,

It seems that the pattern "set mem,mem" is not recognized. But how was it recognized when I was using the mem&&const condition? The constraints don't contain "m"&&"m".
I don't know what extra information I should provide now.

Thanks very much!

--
-Qifei Fan
http://freshtime.org
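The two conditions discussed in this message can be compared exhaustively with a small C program. It is a sketch under simplifying assumptions: a destination is either a register or memory (as "nonimmediate_operand" suggests), a source is a register, memory or a constant, and old_cond, new_cond and count_disagreements are names made up here. Recognition before reload goes by predicates and the insn condition; as Jim notes elsewhere in the thread, constraints are only used by the reload pass.

```c
#include <assert.h>
#include <stdbool.h>

enum kind { K_REG, K_MEM, K_CONST };

/* Old condition: reject only "constant stored to memory". */
bool old_cond(enum kind dest, enum kind src)
{
    return !(dest == K_MEM && src == K_CONST);
}

/* New condition: at least one operand must be a register. */
bool new_cond(enum kind dest, enum kind src)
{
    return dest == K_REG || src == K_REG;
}

/* Count the operand combinations on which the two conditions differ. */
int count_disagreements(void)
{
    static const enum kind dests[] = { K_REG, K_MEM };
    static const enum kind srcs[]  = { K_REG, K_MEM, K_CONST };
    int n = 0;
    for (int d = 0; d < 2; d++)
        for (int s = 0; s < 3; s++)
            if (old_cond(dests[d], srcs[s]) != new_cond(dests[d], srcs[s]))
                n++;
    return n;
}
```

The single combination on which they disagree is mem&&mem: the old condition accepted it, so the insn was recognized and left for reload to fix up, while the new condition rejects it at recognition time.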
Fwd: constant hoisting out of loops
On Fri, Mar 19, 2010 at 1:06 AM, fanqifei wrote:
> On Thu, Mar 18, 2010 at 2:30 AM, Jim Wilson wrote:
>> On Wed, 2010-03-17 at 11:27 +0800, fanqifei wrote:
>>> You are correct. The reload pass emitted the clr.w insn.
>>> However, I can see loop opt passes after reload:
>>> problem1.c.174r.loop2_invariant1
>>
>> Not unless you have a modified toolchain. We don't run loop opt after
>> register allocation. See the list of optimization passes in the FSF GCC
>> passes.c file. loop2 occurs before ira/postreload. If you do have a
>> modified toolchain, then I doubt that the current loop passes would work
>> right, since they were designed to handle pseudo-regs not hard-regs.
>>
>> Jim
>>
> That passes were added by me more than two months ago. I thought these
> passes could perform the optimization of hoisting constant out of
> loop.
> I just removed them. Thanks very much for your help!
> Now I am working on the movsi expander.
>
> --
> -Qifei Fan
> http://freshtime.org
>

It seems that my change works now. The clr.w insn is now outside of the loop.
The related code is below. There are still some questions:

1. I added a movsi expander in which the function foor_expand_move is used to force operands[1] into a register and emit the move insn.
Another change is in the definition of the insn "*mov_mode_insn", where a condition is added to prevent a "store const to mem" RTL insn from being accepted.
Are these changes necessary?

2. Is it correct to use emit_move_insn in foor_expand_move?
In mips.c, the function mips_emit_move calls both emit_move_insn and emit_move_insn_1, but I don’t quite understand the comment above the function.

Function mips_emit_move() in mips.c:

/* Emit a move from SRC to DEST. Assume that the move expanders can
   handle all moves if !can_create_pseudo_p (). The distinction is
   important because, unlike emit_move_insn, the move expanders know
   how to force Pmode objects into the constant pool even when the
   constant pool address is not itself legitimate. */

rtx
mips_emit_move (rtx dest, rtx src)
{
  return (can_create_pseudo_p ()
          ? emit_move_insn (dest, src)
          : emit_move_insn_1 (dest, src));
}

3. My understanding of the internal flow for this issue is:
The named insn “movsi” is used to generate the RTL insn list from the parse tree. The insn pattern “set mem, const” is expanded by the function foor_expand_move(). For other forms of “set” insns, the template given in the pattern is inserted. Then the insn "*mov_mode_insn" is used to generate assembler code. During that generation, the condition of mov_mode_insn is checked.
I am not fully confident that this understanding is correct.

related code:

foor.md:
movsi expander:

(define_expand "movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "")
        (match_operand:SI 1 "foor_move_source_operand" ""))]
  ""
{
  if (foor_expand_move (SImode, operands))
    DONE;
})

(define_insn "*mov_mode_insn"
  [(set
    (match_operand:BWD 0 "nonimmediate_operand" "=r,m,r,r,r,r,r,r,x,r")
    (match_operand:BWD 1 "foor_move_source_operand" "Z,r,L,I,Q,P,ni,x,r,r"))]
  "(!(
  (memory_operand(operands[0], SImode) && (foor_const_operand_f(operands[1])))
  ||(memory_operand(operands[0], HImode) && (foor_const_operand_f(operands[1])))
  ||(memory_operand(operands[0], QImode) && (foor_const_operand_f(operands[1])))
  ))"
  "@
   %L1 %0 %1;
   %S0 %0 %1;
   …

predicates.md:

(define_predicate "foor_const_operand"
  (match_test "foor_const_operand_f(op)"))

foor.c:

bool
foor_expand_move(enum machine_mode mode, rtx *operands)
{
  /* Handle sets of MEM first. */
  if ((GET_CODE (operands[0]) == MEM)&&(GET_CODE(operands[1])==CONST_INT))
    {
      emit_move_insn ((operands[0]), force_reg (mode, operands[1]));
      return true;
    }
  return false;
}

Check whether the operand is const:

bool
foor_const_operand_f(rtx x)
{
  if ((GET_CODE (x) == CONST_INT))
    {
      return true;
    }
  return false;
}

Thanks!

--
-Qifei Fan
http://freshtime.org
Re: constant hoisting out of loops
On Thu, Mar 18, 2010 at 2:30 AM, Jim Wilson wrote:
> On Wed, 2010-03-17 at 11:27 +0800, fanqifei wrote:
>> You are correct. The reload pass emitted the clr.w insn.
>> However, I can see loop opt passes after reload:
>> problem1.c.174r.loop2_invariant1
>
> Not unless you have a modified toolchain. We don't run loop opt after
> register allocation. See the list of optimization passes in the FSF GCC
> passes.c file. loop2 occurs before ira/postreload. If you do have a
> modified toolchain, then I doubt that the current loop passes would work
> right, since they were designed to handle pseudo-regs not hard-regs.
>
> Jim
>

Those passes were added by me more than two months ago. I thought they could perform the optimization of hoisting constants out of loops.
I have just removed them. Thanks very much for your help!
Now I am working on the movsi expander.

--
-Qifei Fan
http://freshtime.org
Re: constant hoisting out of loops
On Mon, Mar 15, 2010 at 5:24 AM, Jim Wilson wrote:
> On 03/10/2010 10:48 PM, fanqifei wrote:
>>
>> For below piece of code, the instruction "clr.w a15" obviously doesn't
>> belong to the inner loop.
>> 6: bd f4 clr.w a15; #clear to zero
>> 8: 80 af 00 std.w a10 0x0 a15;
>
> There is info lacking here. Did you compile with optimization? What does
> the RTL look like before and after the loop opt passes?
>
> I'd guess that your movsi pattern is defined wrong. You probably have
> predicates that allow either registers or constants in the set source, which
> is normal, and constraints that only allow registers when the dest is a mem.
> But constraints are only used by the reload pass, so a store zero to mem
> rtl insn will be generated early, and then fixed late during the reload
> pass. So the loop opt did not move the clear insn out of the loop because
> there was no clear insn at this time.
>
> The way to fix this is to add a condition to the movsi pattern that excludes
> this case. For instance, something like this:
> "(register_operand (operands[0], SImode)
> || register_operand (operands[1], SImode))"
> This will prevent a store zero to mem RTL insn from being accepted. In
> order to make this work, you need to make movsi an expander that accepts
> anything, and then forces the source to a register if you have a store
> constant to memory. See for instance the sparc_expand_move function or the
> mips_legitimize_move function.
>
> Use -da (old) or -fdump-rtl-all (new) to see the RTL dumps to see what is
> going on.
>
> Jim
>

It's compiled with -O2.
You are correct. The reload pass emitted the clr.w insn.
However, I can see loop opt passes after reload:

problem1.c.174r.loop2_invariant1
problem1.c.174r.redo_loop2_invariant
problem1.c.175r.loop2_unswitch
problem1.c.177r.redo_loop2_invariant

After the reload pass, the clr.w insn is in the loop, and after the above loop2 passes the insn has still not been moved out of the loop.
I am not sure whether the issue is in these loop2 passes; I guess it is.

For the definition of the movsi expander, I will try to do what you pointed out. (I am not very familiar with this code, so that may take me some time.)

current definition of mov pattern:

(define_insn "mov"
  [(set (match_operand:BWD 0 "nonimmediate_operand" "=r,m,r,r,r,r,r,r,x,r")
        (match_operand:BWD 1 "move_source_operand" "Z,r,L,I,Q,P,ni,x,r,r"))]
  ""
  "@
   %L1 %0 %1;
   %S0 %0 %1;
   clr %0;
   mv %0 %1;
   ...
   ...

Thanks!

--
-Qifei Fan
http://freshtime.org
constant hoisting out of loops
I am porting gcc 4.3.2 to my own micro-controller.
In the code below, the instruction "clr.w a15" obviously doesn't belong in the inner loop. The compiler should be smart enough to move it to the beginning of the function.
How can I hoist the constant out of the loop? Maybe the cost functions have to be changed, but I don't know how.

Thanks.

C code:

void memzero_aligned(uint* ptr, uint size)
{
    uint ptr_end = (uint)ptr + size;
    while ((uint)ptr < ptr_end)
    {
        *ptr++ = 0;
    }
}

Disassembly code:

<_memzero_aligned>:
   0: bc ab 90    add.w a9 a10 a11;
   3: f4 0e 0b    bra 0xe;           #branch unconditionally, no delay slot
   6: bd f4       clr.w a15;         #clear to zero
   8: 80 af 00    std.w a10 0x0 a15;
   b: 90 aa 04    add.w a10 a10 0x4;
   e: b8 a9 06    cmp.w a10 a9;
  11: f4 08 f5    brc 0x6;
  14: f8 00       ret;

--
-Qifei Fan
http://freshtime.org
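For anyone who wants to reproduce the test case on a host compiler, here is a portable variant of the function. Two things are adjustments made for this sketch and not part of the original post: uintptr_t replaces the pointer-to-uint casts (which only work where pointers are 32 bits wide), and the zero is held in a local so the intended "load once, store many times" shape is explicit in the source. Writing it this way does not by itself force any compiler to hoist the load.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned int uint;

/* Zero `size` bytes starting at `ptr`; size must be a multiple of the
   word size, as the name "aligned" suggests. */
void memzero_aligned(uint *ptr, uint size)
{
    uintptr_t ptr_end = (uintptr_t)ptr + size;
    const uint zero = 0;             /* the value clr.w puts in a15 */

    assert(size % sizeof(uint) == 0);
    while ((uintptr_t)ptr < ptr_end)
        *ptr++ = zero;               /* the std.w inside the loop */
}
```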
insn length attribute and code size optimization
According to the internals manual, the insn length attribute can be used to calculate the length of emitted code chunks when verifying branch distances.
Can it also be used for code size optimization?
I may change TARGET_RTX_COSTS in my gcc port to return costs based on the insn lengths.
I can see code for this purpose in i386/i386.c; however, the insn sizes are hard-coded in the array size_cost.
Are there any other places where the insn length affects insn selection?

--
-Qifei Fan
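The idea of returning size-based costs can be sketched as a standalone C model. Everything in it is hypothetical: the toy opcodes, their byte lengths and latencies, and the two scaling macros, which only imitate the style of GCC's cost macros and are not copied from any port.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical instruction set for illustration only. */
enum toy_op { TOY_MOV_IMM16, TOY_MOV_IMM32, TOY_ADD, TOY_MUL, TOY_OP_COUNT };

/* Made-up encoded lengths in bytes and latencies in cycles. */
static const unsigned char toy_length[TOY_OP_COUNT]  = { 3, 5, 3, 3 };
static const unsigned char toy_latency[TOY_OP_COUNT] = { 1, 1, 1, 4 };

/* Scale factors chosen so that "one byte" and "one insn" are comparable
   units; the exact factors are an assumption of this sketch. */
#define TOY_COSTS_N_INSNS(n) ((n) * 4)
#define TOY_COSTS_N_BYTES(n) ((n) * 2)

/* Cost hook model: bytes matter when optimizing for size, cycles otherwise. */
int toy_rtx_cost(enum toy_op op, bool optimize_size)
{
    assert(op >= 0 && op < TOY_OP_COUNT);
    return optimize_size ? TOY_COSTS_N_BYTES(toy_length[op])
                         : TOY_COSTS_N_INSNS(toy_latency[op]);
}
```

Under the size metric the short move is cheaper than the long one, while under the speed metric they tie, so only the size metric gives the compiler a reason to prefer the shorter encoding.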
Re: Help-The possible places where insn is splitted in greg pass
2010/1/27 fanqifei:
> 2010/1/25 Ulrich Weigand:
>> Qifei Fan wrote:
>>
>>> > But insn#479 is not recognized by recog() in insn-recog.c and the
>>> > compilation failed. (recog only recognizes RTL defined in md, right?)
>>> > Here the backtrace is
>>> > reload--->cleanup_subreg_operands--->extract_insn_cached--->extract_insn--->recog_memoized--->recog.
>>> > There is no machine instruction(r3=r1*4+r2) match the pattern of
>>> > insn#479. Though there is pattern r3=mem(r1*4+r2).
>>> > I don't quite understand the generation of reload information.
>>
>> There's two issues here. The first issue is that reload makes the
>> fundamental assumption that everything that is a valid address, can
>> be loaded into a register as well, if necessary. On many platforms
>> this is true, either because there is some sort of "load address"
>> instruction, or because the form of valid addresses matches standard
>> arithmetic instruction patterns. Reload will simply emit a naked
>> "set" of some register to the address -- if the back-end doesn't
>> support that, you'll get the failure you saw.
>>
>> If this doesn't work on your particular platform, you could either
>> try to set things up so that reload never thinks it needs to reload
>> an address (but this may be difficult to achieve). The safe option
>> would be to tell reload how to achieve computing an address by
>> providing a "secondary reload" pattern. See e.g. s390_secondary_reload
>> (in config/s390/s390.c) and the associated "reload_plus" pattern.
>>
>> The second issue is as you notice here:
>>
>>> Actually the second reload is not needed if there is already the first reload.
>>> If (plus:SI (reg/f:SI 16 SP) (const_int 96 [0x60]) is replaced by
>>> (reg:SI 12 a12), then (plus:SI (mult:SI (reg:SI 9 a9 [204])
>>> (const_int 4 [0x4])) (reg:SI 12 a12) ) is a valid memory address.
>>> But in function find_reloads, I can't find the related code that
>>> deciding whether the second reload should be generated by regarding
>>> the previous reload. The function is too complex. :-(
>>
>> The first reload, loading sp + 96 into a register, is generated from
>> within find_reloads_address. After this is done, it is assumed that
>> the address is now valid.
>>
>> However, something else later in find_reloads apparently assumes there
>> is still some problem with the operand, and decides to reload the
>> whole address. It is hard to say exactly what the problem is, without
>> seeing the insn constraints, but the most likely cause seems to be that
>> this instruction pattern does not have a general "m" constraint, but
>> a more restricted memory constraint.
>>
>> If this is the case, the back-end procedure called to verify the
>> constraint probably rejects it. This runs into another fundamental
>> assumption reload makes: it assumes such procedures take other
>> actions done by reload into account implicitly. This means the
>> constraint checker ought to *accept* addresses of the form
>> reg*const + (sp + const)
>> because it ought to know that reload will already load the (sp + const)
>> into a register anyway.
>>
>> If this is *not* the case, please send me the instruction pattern
>> and constraints for the insn pattern that matched insn 320 before
>> reload so I can investigate in more detail.
>>
>> (Please note that I'm currently travelling with restricted access
>> to email, so it might be a couple of days before I'm able to reply;
>> sorry ...)
>>
>> Bye,
>> Ulrich
>>
>> --
>> Dr. Ulrich Weigand
>> GNU Toolchain for Linux on System z and Cell BE
>> ulrich.weig...@de.ibm.com
>>
>
> For the second issue, there is indeed a strict constraint(back-end
> procedure) that rejects the pattern.
> The back-end procedure is composed of macros like
> EXTRA_MEMORY_CONSTRAINT/EXTRA_CONSTRAINT. These macros are defined in
> config/cpu.c and used in around Line3376 of reload.c(gcc4.3.2).
> So it's the constraint checker's job to know whether reload will load
> the (sp+const) into a register and use such information to decide
> whether to accept the pattern or not, right?
> Is there any other architecture which checks address by using previous
> determined reload info?
> I think this may be the proper way to resolve this problem. It may be
> easy to implement too.
>
> I will dig into all the issues and possible options, and provide more
> information later.
> As I am not familiar with it, it may take some time.
>
> Thanks very much!!
> -Qifei Fan
>

I have modified the constraint checker so that previously generated reloads are taken into account.
The error is gone now, and from the reload dump I can see that the reload is correct. However, the instruction does not appear in the final assembly code; it may have been optimized away.
So I am not sure this change is really correct.

-Qifei Fan
Re: GCC-How does the coding style affect the insv pattern recognization?
2010/1/18 Adam Nemet:
> fanqifei writes:
>> Paolo Bonzini said that insv instruction might be synthesized
>> later by combine. But combine only works on at most 3 instructions and
>> insv is not generated in such case.
>> So exactly when will the insv pattern be recognized and how does
>> the coding style affect it?
>
> Sorry for jumping in late. See make_field_assignment in combine.c.
>
> The problem usually is that:
>
> (set A (ior (and B C1) OTHER))
>
> can only be turned into a bit-insertion if A and B happen to be the same
> pseudos.
>
> Adam
>

I did find this kind of pattern for some simple C statements in the rtl dump. Unfortunately, A and B are not the same.
Is it possible, and easy, to first move B to A and then replace B with A in the insn?
Anyway, this is not required if it's impracticable.

Qifei Fan
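The RTL shape Adam describes, (set A (ior (and B C1) OTHER)) with A and B the same pseudo, is the usual C idiom for writing a bit-field by hand. A minimal sketch of that idiom follows; insert_field is a name made up for this example, and nothing here claims that a particular GCC version will turn it into insv.

```c
#include <assert.h>
#include <stdint.h>

/* Insert the low `width` bits of `value` into `a` at bit position `pos`.
   In RTL terms: a = (ior (and a C1) OTHER), where C1 is ~mask and OTHER
   is the shifted, masked value. */
uint32_t insert_field(uint32_t a, uint32_t value, unsigned pos, unsigned width)
{
    uint32_t mask;

    assert(width >= 1 && width <= 32 && pos <= 32 - width);
    mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    mask <<= pos;
    return (a & ~mask) | ((value << pos) & mask);
}
```

When the result is assigned back to the same variable that was read, as in x = insert_field(x, v, 4, 4), source and destination coincide, which is the A == B case; writing the result to a different variable gives the A != B case from the dump.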
Re: Help-The possible places where insn is splitted in greg pass
2010/1/25 Ulrich Weigand :
> Qifei Fan wrote:
>
>> > But insn#479 is not recognized by recog() in insn-recog.c and the
>> > compilation failed. (recog only recognizes RTL defined in md, right?)
>> > Here the backtrace is
>> > reload--->cleanup_subreg_operands--->extract_insn_cached--->extract_insn--->recog_memoized--->recog.
>> > There is no machine instruction(r3=r1*4+r2) match the pattern of
>> > insn#479. Though there is pattern r3=mem(r1*4+r2).
>> > I don’t quite understand the generation of reload information.
>
> There's two issues here. The first issue is that reload makes the
> fundamental assumption that everything that is a valid address, can
> be loaded into a register as well, if necessary. On many platforms
> this is true, either because there is some sort of "load address"
> instruction, or because the form of valid addresses matches standard
> arithmetic instruction patterns. Reload will simply emit a naked
> "set" of some register to the address -- if the back-end doesn't
> support that, you'll get the failure you saw.
>
> If this doesn't work on your particular platform, you could either
> try to set things up so that reload never thinks it needs to reload
> an address (but this may be difficult to achieve). The safe option
> would be to tell reload how to achieve computing an address by
> providing a "secondary reload" pattern. See e.g. s390_secondary_reload
> (in config/s390/s390.c) and the associated "reload_plus" pattern.
>
> The second issue is as you notice here:
>
>> Actually the second reload is not needed if there is already the first reload.
>> If (plus:SI (reg/f:SI 16 SP) (const_int 96 [0x60]) is replaced by
>> (reg:SI 12 a12), then (plus:SI (mult:SI (reg:SI 9 a9 [204])
>> (const_int 4 [0x4])) (reg:SI 12 a12) ) is a valid memory address.
>> But in function find_reloads, I can’t find the related code that
>> deciding whether the second reload should be generated by regarding
>> the previous reload.
>> The function is too complex. :-(
>
> The first reload, loading sp + 96 into a register, is generated from
> within find_reloads_address. After this is done, it is assumed that
> the address is now valid.
>
> However, something else later in find_reloads apparently assumes there
> is still some problem with the operand, and decides to reload the
> whole address. It is hard to say exactly what the problem is, without
> seeing the insn constraints, but the most likely cause seems to be that
> this instruction pattern does not have a general "m" constraint, but
> a more restricted memory constraint.
>
> If this is the case, the back-end procedure called to verify the
> constraint probably rejects it. This runs into another fundamental
> assumption reload makes: it assumes such procedures take other
> actions done by reload into account implicitly. This means the
> constraint checker ought to *accept* addresses of the form
>   reg*const + (sp + const)
> because it ought to know that reload will already load the (sp + const)
> into a register anyway.
>
> If this is *not* the case, please send me the instruction pattern
> and constraints for the insn pattern that matched insn 320 before
> reload so I can investigate in more detail.
>
> (Please note that I'm currently travelling with restricted access
> to email, so it might be a couple of days before I'm able to reply;
> sorry ...)
>
> Bye,
> Ulrich
>
> --
> Dr. Ulrich Weigand
> GNU Toolchain for Linux on System z and Cell BE
> ulrich.weig...@de.ibm.com
>

For the second issue, there is indeed a strict constraint (a back-end procedure) that rejects the pattern. The back-end procedure is built from macros like EXTRA_MEMORY_CONSTRAINT/EXTRA_CONSTRAINT. These macros are defined in config/cpu.c and used around line 3376 of reload.c (gcc 4.3.2).

So it's the constraint checker's job to know whether reload will load the (sp+const) into a register, and to use that information to decide whether to accept the pattern, right?

Is there any other architecture that checks addresses using previously determined reload info? I think this may be the proper way to resolve the problem, and it may be easy to implement too.

I will dig into the issues and the possible options and provide more information later. As I am not familiar with this code, it may take some time.

Thanks very much!

-Qifei Fan
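The idea that the constraint checker should accept reg*const + (sp + const), because reload will load (sp + const) into a register anyway, can be sketched with a toy model in C. This is not GCC code: the expression tree, the enum and address_ok() are hypothetical stand-ins for rtx and for an EXTRA_CONSTRAINT-style check, and the scale factor 4 is simply taken from the mem(r1*4+r2) addressing mode described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree. This is NOT GCC's rtx; every name is hypothetical. */
enum kind { K_REG, K_SP, K_CONST, K_PLUS, K_MULT };

struct expr {
    enum kind kind;
    const struct expr *lhs;   /* operands, used by K_PLUS and K_MULT */
    const struct expr *rhs;
    long value;               /* register number or constant value */
};

/* reg * 4, the scaled index of the mem(r1*4+r2) addressing mode. */
static bool is_scaled_index(const struct expr *e)
{
    return e->kind == K_MULT
        && e->lhs->kind == K_REG
        && e->rhs->kind == K_CONST
        && e->rhs->value == 4;
}

/* sp + const, which reload is expected to load into a base register. */
static bool is_sp_plus_const(const struct expr *e)
{
    return e->kind == K_PLUS
        && e->lhs->kind == K_SP
        && e->rhs->kind == K_CONST;
}

/* Strict mode accepts only reg*4 + reg.  With reload_aware set, the
   checker also accepts reg*4 + (sp + const). */
bool address_ok(const struct expr *addr, bool reload_aware)
{
    if (addr->kind != K_PLUS || !is_scaled_index(addr->lhs))
        return false;
    if (addr->rhs->kind == K_REG)
        return true;
    return reload_aware && is_sp_plus_const(addr->rhs);
}

/* Usage: the address of insn 320 as it appears in the reload dump. */
void check_address_ok(void)
{
    const struct expr a9   = { K_REG,   NULL, NULL, 9 };
    const struct expr four = { K_CONST, NULL, NULL, 4 };
    const struct expr idx  = { K_MULT,  &a9,  &four, 0 };
    const struct expr sp   = { K_SP,    NULL, NULL, 16 };
    const struct expr c96  = { K_CONST, NULL, NULL, 96 };
    const struct expr base = { K_PLUS,  &sp,  &c96, 0 };
    const struct expr addr = { K_PLUS,  &idx, &base, 0 };

    assert(!address_ok(&addr, false));  /* strict checker rejects it */
    assert(address_ok(&addr, true));    /* reload-aware checker accepts it */
}
```

In a real port the equivalent decision would live in the back-end's constraint macros or address-legitimacy code; the sketch only shows the shape of the extra case that has to be accepted.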
Re: Help-The possible places where insn is splitted in greg pass
2010/1/16 fanqifei :
> 2010/1/15 Ian Lance Taylor :
>> There are many places where that insn could be generated, so it's
>> pretty hard to answer your question as asked.
>>
>> I recommend setting a breakpoint on make_insn_raw if
>> cfun->emit->x_cur_insn_uid == 479. Then a backtrace will show you
>> what is creating the insn.
>>
>> Ian
>>
> That insn was generated in subst_reloads() called by reload_as_needed
> in reload1.c.
>
> In greg pass, the instruction#320 needs to be splitted. The cpu
> supports the memory address mode mem(r1*4+r2).
> (insn 320 308 309 19 a.c:381 (set (reg:SI 207 [ .wrData ])
>         (mem/s:SI (plus:SI (mult:SI (reg:SI 204)
>                     (const_int 4 [0x4]))
>                 (reg/f:SI 234)) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 204)
>         (nil)))
>
> In find_reloads() (called by reload_as_needed()), following reload
> information was generated.
> (insn 320 308 309 19 a.c:381 (set (reg:SI 14 a14 [orig:207 .wrData ] [207])
>         (mem/s:SI (plus:SI (mult:SI (reg:SI 9 a9 [204])
>                     (const_int 4 [0x4]))
>                 (plus:SI (reg/f:SI 16 SP)
>                     (const_int 96 [0x60]))) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 9 a9 [204])
>         (nil)))
> Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 16 SP)
>                                (const_int 96 [0x60]))
>         GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 1)
>         reload_in_reg: (plus:SI (reg/f:SI 16 SP)
>                            (const_int 96 [0x60]))
>         reload_reg_rtx: (reg:SI 12 a12)
> Reload 1: reload_in (SI) = (plus:SI (mult:SI (reg:SI 9 a9 [204])
>                                    (const_int 4 [0x4]))
>                                (plus:SI (reg/f:SI 16 SP)
>                                    (const_int 96 [0x60])))
>         GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), inc by 4
>         reload_in_reg: (plus:SI (mult:SI (reg:SI 9 a9 [204])
>                                (const_int 4 [0x4]))
>                            (plus:SI (reg/f:SI 16 SP)
>                                (const_int 96 [0x60])))
>         reload_reg_rtx: (reg:SI 12 a12)
>
> After find_reloads() called, emit_reload_insns() generated insns to
> reload operands. Then subst_reloads() substituted the reload regs
> using the replacement information.
>
> The insn list after subst_reloads():
> (insn 475 308 477 19 a.c:381 (set (reg:SI 12 a12)
>         (const_int 96 [0x60])) -1 (nil))
>
> (insn 477 475 478 19 a.c:381 (set (reg:SI 12 a12)
>         (reg/f:SI 16 SP)) -1 (nil))
>
> (insn 478 477 479 19 a.c:381 (set (reg:SI 12 a12)
>         (plus:SI (reg:SI 12 a12)
>             (const_int 96 [0x60]))) -1 (expr_list:REG_EQUIV (plus:SI (reg/f:SI 16 SP)
>             (const_int 96 [0x60]))
>         (nil)))
>
> (insn 479 478 320 19 a.c:381 (set (reg:SI 12 a12)
>         (plus:SI (mult:SI (reg:SI 9 a9 [204])
>                 (const_int 4 [0x4]))
>             (reg:SI 12 a12))) -1 (nil))
>
> (insn 320 479 481 19 a.c:381 (set (reg:SI 14 a14 [orig:207 .wrData ] [207])
>         (mem/s:SI (reg:SI 12 a12) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 9 a9 [204])
>         (nil)))
>
> But insn#479 is not recognized by recog() in insn-recog.c and the
> compilation failed. (recog only recognizes RTL defined in md, right?)
> Here the backtrace is
> reload--->cleanup_subreg_operands--->extract_insn_cached--->extract_insn--->recog_memoized--->recog.
> There is no machine instruction(r3=r1*4+r2) match the pattern of
> insn#479. Though there is pattern r3=mem(r1*4+r2).
> I don’t quite understand the generation of reload information.
> What can I do next?
> Thanks!
>
> Qifei Fan
>

Actually, the second reload is not needed once the first reload exists.
If (plus:SI (reg/f:SI 16 SP) (const_int 96 [0x60])) is replaced by (reg:SI 12 a12), then (plus:SI (mult:SI (reg:SI 9 a9 [204]) (const_int 4 [0x4])) (reg:SI 12 a12)) is a valid memory address.
But in find_reloads() I can’t find the code that decides, taking the previous reload into account, whether the second reload should be generated.
The function is too complex. :-(

Qifei Fan
Re: GCC-How does the coding style affect the insv pattern recognization?
2010/1/18 Adam Nemet :
> Sorry for jumping in late. See make_field_assignment in combine.c.
>
> The problem usually is that:
>
> (set A (ior (and B C1) OTHER))
>
> can only be turned into a bit-insertion if A and B happen to be the same
> pseudos.
>
> Adam
>

Thank you, Adam.
The problem is that before the combine pass the statement is expressed in 6 insns, and those insns can't be combined into the expected pattern (set A (ior (and B C1) OTHER)). Otherwise, make_field_assignment could do the job of simplifying the SET insn.

Qifei Fan
Re: Help-The possible places where insn is splitted in greg pass
2010/1/15 Ian Lance Taylor :
> There are many places where that insn could be generated, so it's
> pretty hard to answer your question as asked.
>
> I recommend setting a breakpoint on make_insn_raw if
> cfun->emit->x_cur_insn_uid == 479. Then a backtrace will show you
> what is creating the insn.
>
> Ian
>

That insn was generated in subst_reloads(), called by reload_as_needed() in reload1.c.

In the greg pass, instruction #320 needs to be split. The cpu supports the memory addressing mode mem(r1*4+r2).

(insn 320 308 309 19 a.c:381 (set (reg:SI 207 [ .wrData ])
        (mem/s:SI (plus:SI (mult:SI (reg:SI 204)
                    (const_int 4 [0x4]))
                (reg/f:SI 234)) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 204)
        (nil)))

In find_reloads() (called by reload_as_needed()), the following reload information was generated:

(insn 320 308 309 19 a.c:381 (set (reg:SI 14 a14 [orig:207 .wrData ] [207])
        (mem/s:SI (plus:SI (mult:SI (reg:SI 9 a9 [204])
                    (const_int 4 [0x4]))
                (plus:SI (reg/f:SI 16 SP)
                    (const_int 96 [0x60]))) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 9 a9 [204])
        (nil)))
Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 16 SP)
                               (const_int 96 [0x60]))
        GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 1)
        reload_in_reg: (plus:SI (reg/f:SI 16 SP)
                           (const_int 96 [0x60]))
        reload_reg_rtx: (reg:SI 12 a12)
Reload 1: reload_in (SI) = (plus:SI (mult:SI (reg:SI 9 a9 [204])
                                   (const_int 4 [0x4]))
                               (plus:SI (reg/f:SI 16 SP)
                                   (const_int 96 [0x60])))
        GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), inc by 4
        reload_in_reg: (plus:SI (mult:SI (reg:SI 9 a9 [204])
                               (const_int 4 [0x4]))
                           (plus:SI (reg/f:SI 16 SP)
                               (const_int 96 [0x60])))
        reload_reg_rtx: (reg:SI 12 a12)

After find_reloads() was called, emit_reload_insns() generated the insns that reload the operands. Then subst_reloads() substituted the reload registers using the replacement information.

The insn list after subst_reloads():

(insn 475 308 477 19 a.c:381 (set (reg:SI 12 a12)
        (const_int 96 [0x60])) -1 (nil))

(insn 477 475 478 19 a.c:381 (set (reg:SI 12 a12)
        (reg/f:SI 16 SP)) -1 (nil))

(insn 478 477 479 19 a.c:381 (set (reg:SI 12 a12)
        (plus:SI (reg:SI 12 a12)
            (const_int 96 [0x60]))) -1 (expr_list:REG_EQUIV (plus:SI (reg/f:SI 16 SP)
            (const_int 96 [0x60]))
        (nil)))

(insn 479 478 320 19 a.c:381 (set (reg:SI 12 a12)
        (plus:SI (mult:SI (reg:SI 9 a9 [204])
                (const_int 4 [0x4]))
            (reg:SI 12 a12))) -1 (nil))

(insn 320 479 481 19 a.c:381 (set (reg:SI 14 a14 [orig:207 .wrData ] [207])
        (mem/s:SI (reg:SI 12 a12) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 9 a9 [204])
        (nil)))

But insn#479 is not recognized by recog() in insn-recog.c, and the compilation failed. (recog only recognizes RTL defined in the md file, right?)
The backtrace is:
reload--->cleanup_subreg_operands--->extract_insn_cached--->extract_insn--->recog_memoized--->recog.

There is no machine instruction (r3=r1*4+r2) matching the pattern of insn#479, though there is a pattern r3=mem(r1*4+r2).
I don’t quite understand how the reload information is generated.
What can I do next?

Thanks!

Qifei Fan
Re: Help-The possible places where insn is splitted in greg pass
2010/1/13 fanqifei :
> Hi,
> I am working on a micro controller and trying to port gcc(4.3.2) for it.
> Not the compiling process runs into the following error:
> a.c: In function 'task':
> a.c:150: error: unrecognizable insn:
> (insn 479 478 320 19 a:381 (set (reg:SI 12 a12)
>         (plus:SI (mult:SI (reg:SI 9 a9 [204])
>                 (const_int 4 [0x4]))
>             (reg:SI 12 a12))) -1 (nil))
> a.c:150: internal compiler error: in extract_insn, at recog.c:1990
> Please submit a full bug report, ...
>
> This insn is generated in greg pass from another insn:
> (insn 320 308 309 19 a.c:381 (set (reg:SI 207 [ .wrData ])
>         (mem/s:SI (plus:SI (mult:SI (reg:SI 204)
>                     (const_int 4 [0x4]))
>                 (reg/f:SI 234)) [5 .wrData+0 S4 A32])) 3 {movsi}
> (expr_list:REG_DEAD (reg:SI 204)
>         (nil)))
> I surfed the web a bit and found similar gcc bug report 37436
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37436.
> But basically they are not same.
>
> Can someone show me the possible places where insn#479 is generated in
> reload.c or reload1.c?
> Thanks!
>
> Qifei Fan
>

Is there anyone who can help?
Thanks very much!

Qifei Fan
Re: GCC-How does the coding style affect the insv pattern recognization?
2010/1/13 Bingfeng Mei :
> OOPs, I don't know that. Anyway, I won't count on GCC to
> reliably pick up these complex patterns. In our port, we
> implemented clz/ffs/etc as intrinsics though they are present as
> standard patterns.
>
> Bingfeng

Could you please show me the path of the source code that implements the clz/ffs intrinsics for your processor? I would like to use it as a reference.

Thanks very much!

Qifei Fan
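For readers unfamiliar with the intrinsic route, GCC's own generic builtins __builtin_clz and __builtin_ffs show what calling such an operation from C looks like. This is only a user-side sketch: it says nothing about how the port mentioned above implements its intrinsics, and the wrapper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Number of leading zero bits in x.  __builtin_clz is undefined for 0,
   so that case is handled explicitly. */
int count_leading_zeros(unsigned int x)
{
    if (x == 0)
        return (int)(sizeof x * CHAR_BIT);
    return __builtin_clz(x);
}

/* 1-based index of the least significant set bit, or 0 if x is 0. */
int find_first_set(unsigned int x)
{
    return __builtin_ffs((int)x);
}

/* Small self-check that doubles as a usage example. */
void check_bit_builtins(void)
{
    int width = (int)(sizeof(unsigned int) * CHAR_BIT);

    assert(count_leading_zeros(1u) == width - 1);
    assert(count_leading_zeros(0u) == width);
    assert(find_first_set(8u) == 4);
    assert(find_first_set(0u) == 0);
}
```

Calling the operation explicitly means the result no longer depends on whether combine happens to recognize the pattern in open-coded C.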
Re: GCC-How does the coding style affect the insv pattern recognization?
2010/1/13 Bingfeng Mei :
> Your instruction is likely too specific to be picked up by GCC.
> You may use an intrinsic for it.
>
> Bingfeng

But insv is a standard pattern name. The semantics of the expression
x= (x&0xFF00) | ((i<<16)&0x00FF);
are exactly what insv can do.
I also tried the mips gcc cross compiler, and ins is not generated there either.
An intrinsic is a way to resolve this, though. Maybe there is no better way.

BTW, there is a special case (the bit position is 0):

235: f0 97 fc    mvi a9 -0x4;    #move immediate to reg
238: ff e9 94    and a9 a14 a9;
23b: f0 95 02    or a9 0x2;

The above three instructions could be replaced by mvi and insv, but in fact the combine pass does not do this.

Qifei Fan
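As a sanity check on the special case, the sketch below models the three-instruction sequence in C and compares it with a 2-bit field insert at bit position 0. It assumes the destination register is the first operand of each instruction, which the listing does not state; the function names are hypothetical. The last helper evaluates the second term of the C expression exactly as quoted: shifting left by 16 clears the low 16 bits, so masking with 0x00FF always gives 0, and the statement as written may not be the one that was actually compiled.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the sequence, assuming "op dst src..." operand order. */
uint32_t and_or_sequence(uint32_t a14)
{
    uint32_t a9 = (uint32_t)-4;   /* mvi a9 -0x4   : a9 = 0xFFFFFFFC */
    a9 = a14 & a9;                /* and a9 a14 a9 : clear bits 1..0 */
    a9 = a9 | 0x2u;               /* or  a9 0x2    : set the field to 2 */
    return a9;
}

/* The same result phrased as inserting a 2-bit value at bit 0. */
uint32_t insert_low2(uint32_t word, uint32_t value)
{
    return (word & ~UINT32_C(3)) | (value & UINT32_C(3));
}

/* Second term of the quoted expression x = (x&0xFF00) | ((i<<16)&0x00FF). */
uint32_t second_term(uint32_t i)
{
    return (i << 16) & 0x00FFu;
}

/* Small self-check that doubles as a usage example. */
void check_special_case(void)
{
    assert(and_or_sequence(0xDEADBEEFu) == 0xDEADBEEEu);
    assert(and_or_sequence(0xDEADBEEFu) == insert_low2(0xDEADBEEFu, 2u));
    assert(second_term(0xFFFFFFFFu) == 0u);
}
```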
Help-The possible places where insn is splitted in greg pass
Hi,
I am working on a micro controller and trying to port gcc (4.3.2) to it.
Now the compilation runs into the following error:

a.c: In function 'task':
a.c:150: error: unrecognizable insn:
(insn 479 478 320 19 a:381 (set (reg:SI 12 a12)
        (plus:SI (mult:SI (reg:SI 9 a9 [204])
                (const_int 4 [0x4]))
            (reg:SI 12 a12))) -1 (nil))
a.c:150: internal compiler error: in extract_insn, at recog.c:1990
Please submit a full bug report, ...

This insn is generated in the greg pass from another insn:

(insn 320 308 309 19 a.c:381 (set (reg:SI 207 [ .wrData ])
        (mem/s:SI (plus:SI (mult:SI (reg:SI 204)
                    (const_int 4 [0x4]))
                (reg/f:SI 234)) [5 .wrData+0 S4 A32])) 3 {movsi} (expr_list:REG_DEAD (reg:SI 204)
        (nil)))

I searched the web a bit and found a similar gcc bug report, 37436:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37436
But the two cases are not really the same.

Can someone show me the possible places in reload.c or reload1.c where insn#479 is generated?

Thanks!

Qifei Fan
GCC-How does the coding style affect the insv pattern recognization?
Hi,
I am working on a micro controller and trying to port gcc (4.3.2) to it.
The micro controller has an insv instruction, and I have added a define_insn for it to the machine description file. However, the insv instruction is only generated when the code is written like the bit-field example at the end of this message. If the code is written using logical shift and OR operators, the insv instruction is not generated.

For the statement:
x= (x&0xFF00) | ((i<<16)&0x00FF);
6 RTL instructions are generated after the combine pass and 8 instructions are generated in the assembly file.

Paolo Bonzini said that the insv instruction might be synthesized later by combine. But combine only works on at most 3 instructions, so insv is not generated in this case.

So exactly when is the insv pattern recognized, and how does the coding style affect it? Is there any open bug report about this?

struct test_foo {
    unsigned int a:18;
    unsigned int b:2;
    unsigned int c:12;
};

struct test_foo x;

unsigned int foo()
{
    unsigned int a=x.b;
    x.b=2;
    return a;
}

Thanks!
fanqifei
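To make the two coding styles concrete, here is a minimal C sketch that does by hand, with shifts and masks, what the bit-field version of foo() does. The helper names insert_field, extract_field and foo_word are hypothetical, and placing field b at bits 18..19 is an assumption: bit-field layout is implementation-defined, so the real position depends on the target ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Insert the low `width` bits of `value` into `word` at bit `pos`.
   Assumes 0 < width < 32 and pos + width <= 32. */
uint32_t insert_field(uint32_t word, uint32_t value, unsigned pos, unsigned width)
{
    uint32_t mask = ((UINT32_C(1) << width) - 1) << pos;
    return (word & ~mask) | ((value << pos) & mask);
}

/* Read back a `width`-bit field starting at bit `pos`. */
uint32_t extract_field(uint32_t word, unsigned pos, unsigned width)
{
    return (word >> pos) & ((UINT32_C(1) << width) - 1);
}

/* Shift-and-mask counterpart of foo(): return the old value of the
   2-bit field (assumed to sit at bits 18..19) and store 2 into it. */
uint32_t foo_word(uint32_t *x)
{
    uint32_t old = extract_field(*x, 18, 2);
    *x = insert_field(*x, 2, 18, 2);
    return old;
}

/* Small self-check that doubles as a usage example. */
void check_foo_word(void)
{
    uint32_t x = 0xFFFFFFFFu;

    assert(foo_word(&x) == 3u);   /* both field bits were set */
    assert(x == 0xFFFBFFFFu);     /* bits 19..18 now hold binary 10 */
}
```

Both styles describe the same read-modify-write of one field; the difference the thread is about is only whether the compiler sees it as a single bit-field store or as separate and/shift/or operations that combine must reassemble.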