[BFIN] PR target/49862
Hi,

I have committed this patch on trunk for PR target/49862.

Regards,
Jie

        PR target/49862
        * config/bfin/bfin.c (hwloop_optimize): Fix unused variable warnings.
        (hwloop_pattern_reg): Fix set but not used warning.
        (bfin_reorg_loops): Remove unused parameter.
        (bfin_reorg): Update use of bfin_reorg_loops.

Index: config/bfin/bfin.c
===================================================================
--- config/bfin/bfin.c  (revision 185124)
+++ config/bfin/bfin.c  (working copy)
@@ -3411,14 +3411,12 @@ static bool
 hwloop_optimize (hwloop_info loop)
 {
   basic_block bb;
-  hwloop_info inner;
   rtx insn, last_insn;
   rtx loop_init, start_label, end_label;
   rtx iter_reg, scratchreg, scratch_init, scratch_init_insn;
   rtx lc_reg, lt_reg, lb_reg;
   rtx seq, seq_end;
   int length;
-  unsigned ix;
   bool clobber0, clobber1;
 
   if (loop->depth > MAX_LOOP_DEPTH)
@@ -3840,12 +3838,11 @@ hwloop_fail (hwloop_info loop)
 static rtx
 hwloop_pattern_reg (rtx insn)
 {
-  rtx pat, reg;
+  rtx reg;
 
   if (!JUMP_P (insn) || recog_memoized (insn) != CODE_FOR_loop_end)
     return NULL_RTX;
 
-  pat = PATTERN (insn);
   reg = SET_DEST (XVECEXP (PATTERN (insn), 0, 1));
   if (!REG_P (reg))
     return NULL_RTX;
@@ -3864,7 +3861,7 @@ static struct hw_doloop_hooks bfin_doloo
    hardware loops are generated.  */
 
 static void
-bfin_reorg_loops (FILE *dump_file)
+bfin_reorg_loops (void)
 {
   reorg_loops (true, &bfin_doloop_hooks);
 }
@@ -4601,7 +4598,7 @@ bfin_reorg (void)
 
   /* Doloop optimization */
   if (cfun->machine->has_hardware_loops)
-    bfin_reorg_loops (dump_file);
+    bfin_reorg_loops ();
 
   workaround_speculation ();
Re: [BFIN] Hookize PREFERRED_RELOAD_CLASS
On 01/06/2012 12:07 PM, Anatoly Sokolov wrote:
> Hi, Jie.
>
> On Jan 6, 2012, Jie Zhang <jzhang...@gmail.com> wrote:
>> Hi Anatoly,
>>
>> The patch looks OK. But I cannot apply your patch by saving your
>> email as a patch file. If you take a look at this:
>
> I attach the patch.

I can apply the attached patch. OK. Thank you.

Jie
Re: [BFIN] Hookize PREFERRED_RELOAD_CLASS
Hi Anatoly,

The patch looks OK. But I cannot apply your patch by saving your email as a patch file. If you take a look at this:

http://gcc.gnu.org/cgi-bin/get-raw-msg?listname=gcc-patches&date=2012-01&msgid=4F05F12F.607%40post.ru

you will find that there is an extra white space before each context line. But these extra white spaces do not show up in

http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00262.html

while the starting white space of the last line of the patch is missing.

Regards,
Jie

On Thu, Jan 5, 2012 at 1:51 PM, Anatoly Sokolov <ae...@post.ru> wrote:
> Hi.
>
> This patch removes obsolete PREFERRED_RELOAD_CLASS macro from the BFIN
> back end in the GCC and introduces equivalent
> TARGET_PREFERRED_RELOAD_CLASS target hook.
>
> Compiled. Untested.
>
> OK to install?
>
>         * config/bfin/bfin.h (PREFERRED_RELOAD_CLASS): Remove.
>         * config/bfin/bfin.c (TARGET_PREFERRED_RELOAD_CLASS): Define.
>         (bfin_preferred_reload_class): New function.
>
> Index: gcc/config/bfin/bfin.c
> ===================================================================
> --- gcc/config/bfin/bfin.c      (revision 182912)
> +++ gcc/config/bfin/bfin.c      (working copy)
> @@ -2648,6 +2648,19 @@ split_load_immediate (rtx operands[])
>    return 0;
>  }
>  
> +/* Worker function for TARGET_PREFERRED_RELOAD_CLASS.  */
> +
> +static reg_class_t
> +bfin_preferred_reload_class (rtx x, reg_class_t rclass)
> +{
> +  if (GET_CODE (x) == POST_INC
> +      || GET_CODE (x) == POST_DEC
> +      || GET_CODE (x) == PRE_DEC)
> +    return PREGS;
> +
> +  return rclass;
> +}
> +
>  /* Return true if the legitimate memory address for a memory operand of mode
>     MODE.  Return false if not.  */
>  
> @@ -5771,6 +5784,9 @@ bfin_conditional_register_usage (void)
>  #undef TARGET_RETURN_IN_MEMORY
>  #define TARGET_RETURN_IN_MEMORY bfin_return_in_memory
>  
> +#undef TARGET_PREFERRED_RELOAD_CLASS
> +#define TARGET_PREFERRED_RELOAD_CLASS bfin_preferred_reload_class
> +
>  #undef TARGET_LEGITIMATE_ADDRESS_P
>  #define TARGET_LEGITIMATE_ADDRESS_P bfin_legitimate_address_p
>  
> Index: gcc/config/bfin/bfin.h
> ===================================================================
> --- gcc/config/bfin/bfin.h      (revision 182912)
> +++ gcc/config/bfin/bfin.h      (working copy)
> @@ -707,16 +707,6 @@ enum reg_class
>        && GET_MODE_SIZE (MODE1) <= UNITS_PER_WORD \
>        && GET_MODE_SIZE (MODE2) <= UNITS_PER_WORD))
>  
> -/* `PREFERRED_RELOAD_CLASS (X, CLASS)'
> -   A C expression that places additional restrictions on the register
> -   class to use when it is necessary to copy value X into a register
> -   in class CLASS.  The value is a register class; perhaps CLASS, or
> -   perhaps another, smaller class.  */
> -#define PREFERRED_RELOAD_CLASS(X, CLASS) \
> -  (GET_CODE (X) == POST_INC \
> -   || GET_CODE (X) == POST_DEC \
> -   || GET_CODE (X) == PRE_DEC ? PREGS : (CLASS))
> -
>  /* Function Calling Conventions.  */
>  
>  /* The type of the current function; normal functions are of type
>
> --
> Anatoly.
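The macro-to-hook conversion in the patch above can be sketched outside of GCC roughly as follows. This is only an illustration: the enum values, `struct rtx_def`, the `rtx` typedef, `struct target_hooks_sketch` and `targetm_sketch` are hypothetical stand-ins so that the fragment compiles on its own, and are not the real GCC definitions. Only the body of `bfin_preferred_reload_class` and the old macro are taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC internals, so the sketch compiles alone.  */
enum rtx_code { REG, MEM, POST_INC, POST_DEC, PRE_DEC };
enum reg_class { NO_REGS, DREGS, PREGS, ALL_REGS };
typedef enum reg_class reg_class_t;
struct rtx_def { enum rtx_code code; };
typedef const struct rtx_def *rtx;
#define GET_CODE(X) ((X)->code)

/* Old style: a target macro that is expanded textually at every use.  */
#define PREFERRED_RELOAD_CLASS(X, CLASS) \
  (GET_CODE (X) == POST_INC \
   || GET_CODE (X) == POST_DEC \
   || GET_CODE (X) == PRE_DEC ? PREGS : (CLASS))

/* New style: a worker function with the same logic as the macro.  */
static reg_class_t
bfin_preferred_reload_class (rtx x, reg_class_t rclass)
{
  if (GET_CODE (x) == POST_INC
      || GET_CODE (x) == POST_DEC
      || GET_CODE (x) == PRE_DEC)
    return PREGS;

  return rclass;
}

/* Hypothetical miniature of a hook table: callers go through a function
   pointer instead of a macro.  */
struct target_hooks_sketch
{
  reg_class_t (*preferred_reload_class) (rtx, reg_class_t);
};

static const struct target_hooks_sketch targetm_sketch = {
  bfin_preferred_reload_class
};

/* Check that the hook and the macro agree for every sample rtx code.  */
void
check_hook_matches_macro (void)
{
  static const struct rtx_def samples[] = {
    { REG }, { MEM }, { POST_INC }, { POST_DEC }, { PRE_DEC }
  };
  unsigned int i;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (targetm_sketch.preferred_reload_class (&samples[i], DREGS)
            == PREFERRED_RELOAD_CLASS (&samples[i], DREGS));
}
```

Calling `check_hook_matches_macro ()` exercises both forms on the same inputs, which is the property a "hookize" patch of this kind is meant to preserve.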
Re: [BFIN] Hookize REGISTER_MOVE_COST and MEMORY_MOVE_COST
Hi Anatoly,

I cannot apply your patch to a clean tree. I tried saving your email as a text file, copying from Thunderbird, copying from Gmail, and copying from the mailing list archive, but none of them works.

Regards,
Jie

2011/12/23 Anatoly Sokolov <ae...@post.ru>:
> Hi.
>
> This patch removes obsolete REGISTER_MOVE_COST and MEMORY_MOVE_COST
> macros from the Blackfin back end in the GCC and introduces equivalent
> TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST target hooks.
>
> Untested.
>
> OK to install?
>
>         * config/bfin/bfin.h (REGISTER_MOVE_COST, MEMORY_MOVE_COST): Remove.
>         * config/bfin/bfin-protos.h (bfin_register_move_cost,
>         bfin_memory_move_cost): Remove.
>         * config/bfin/bfin.c (bfin_register_move_cost,
>         bfin_memory_move_cost): Make static. Change arguments type from
>         enum reg_class to reg_class_t and from int to bool.
>         (TARGET_REGISTER_MOVE_COST, TARGET_MEMORY_MOVE_COST): Define.
>
> Index: gcc/config/bfin/bfin-protos.h
> ===================================================================
> --- gcc/config/bfin/bfin-protos.h       (revision 182658)
> +++ gcc/config/bfin/bfin-protos.h       (working copy)
> @@ -85,9 +85,6 @@ extern bool bfin_longcall_p (rtx, int);
>  extern bool bfin_dsp_memref_p (rtx);
>  extern bool bfin_expand_movmem (rtx, rtx, rtx, rtx);
>  
> -extern int bfin_register_move_cost (enum machine_mode, enum reg_class,
> -                                    enum reg_class);
> -extern int bfin_memory_move_cost (enum machine_mode, enum reg_class, int in);
>  extern enum reg_class secondary_input_reload_class (enum reg_class,
>                                                       enum machine_mode,
>                                                       rtx);
> Index: gcc/config/bfin/bfin.c
> ===================================================================
> --- gcc/config/bfin/bfin.c      (revision 182658)
> +++ gcc/config/bfin/bfin.c      (working copy)
> @@ -2149,12 +2149,11 @@ bfin_vector_mode_supported_p (enum machi
>    return mode == V2HImode;
>  }
>  
> -/* Return the cost of moving data from a register in class CLASS1 to
> -   one in class CLASS2.  A cost of 2 is the default.  */
> +/* Worker function for TARGET_REGISTER_MOVE_COST.  */
>  
> -int
> +static int
>  bfin_register_move_cost (enum machine_mode mode,
> -                         enum reg_class class1, enum reg_class class2)
> +                         reg_class_t class1, reg_class_t class2)
>  {
>    /* These need secondary reloads, so they're more expensive.  */
>    if ((class1 == CCREGS && !reg_class_subset_p (class2, DREGS))
> @@ -2177,18 +2176,16 @@ bfin_register_move_cost (enum machine_mo
>    return 2;
>  }
>  
> -/* Return the cost of moving data of mode M between a
> -   register and memory.  A value of 2 is the default; this cost is
> -   relative to those in `REGISTER_MOVE_COST'.
> +/* Worker function for TARGET_MEMORY_MOVE_COST.
>  
>     ??? In theory L1 memory has single-cycle latency.  We should add a switch
>     that tells the compiler whether we expect to use only L1 memory for the
>     program; it'll make the costs more accurate.  */
>  
> -int
> +static int
>  bfin_memory_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED,
> -                       enum reg_class rclass,
> -                       int in ATTRIBUTE_UNUSED)
> +                       reg_class_t rclass,
> +                       bool in ATTRIBUTE_UNUSED)
>  {
>    /* Make memory accesses slightly more expensive than any register-register
>       move.  Also, penalize non-DP registers, since they need secondary
> @@ -5703,6 +5700,12 @@ bfin_conditional_register_usage (void)
>  #undef TARGET_ADDRESS_COST
>  #define TARGET_ADDRESS_COST bfin_address_cost
>  
> +#undef TARGET_REGISTER_MOVE_COST
> +#define TARGET_REGISTER_MOVE_COST bfin_register_move_cost
> +
> +#undef TARGET_MEMORY_MOVE_COST
> +#define TARGET_MEMORY_MOVE_COST bfin_memory_move_cost
> +
>  #undef TARGET_ASM_INTEGER
>  #define TARGET_ASM_INTEGER bfin_assemble_integer
>  
> Index: gcc/config/bfin/bfin.h
> ===================================================================
> --- gcc/config/bfin/bfin.h      (revision 182658)
> +++ gcc/config/bfin/bfin.h      (working copy)
> @@ -975,29 +975,6 @@ typedef struct {
>  /* Do not put function addr into constant pool */
>  #define NO_FUNCTION_CSE 1
>  
> -/* A C expression for the cost of moving data from a register in class FROM to
> -   one in class TO.  The classes are expressed using the enumeration values
> -   such as `GENERAL_REGS'.  A value of 2 is the default; other values are
> -   interpreted relative to that.
> -
> -   It is not required that the cost always equal 2 when FROM is the same as TO;
> -   on some machines it is expensive to move between registers if they are not
> -   general registers.  */
> -
> -#define REGISTER_MOVE_COST(MODE, CLASS1, CLASS2) \
> -   bfin_register_move_cost ((MODE), (CLASS1), (CLASS2))
> -
> -/* A C expression for the cost of moving data of mode M between a
> -   register and memory.  A value of 2 is the default; this cost is
> -   relative to those in `REGISTER_MOVE_COST'.
> -
> -
Re: [RFC] Cleanup DW_CFA_GNU_args_size handling
Hi,

On Tue, Aug 2, 2011 at 6:32 PM, Richard Henderson <r...@redhat.com> wrote:
> I got Jeff Law to review the reload change on IRC and committed
> the composite patch.
>
> Tested on x86_64, i586, avr, and h8300. Most other tier1 targets
> ought not be affected, as this patch only applies to
> ACCUMULATE_OUTGOING_ARGS == 0 targets.

This commit may have caused

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51552

Regards,
Jie
Re: Ping: viewvc: python: RuntimeError: maximum recursion limit exceeded
On Sun, Sep 4, 2011 at 3:07 PM, Georg-Johann Lay <a...@gjlay.de> wrote:
> Hi, I'm getting the following error in viewvc for several days now:
>
> http://gcc.gnu.org/viewcvs/trunk/gcc/dse.c?view=markup
>
> An Exception Has Occurred
>
> Python Traceback
>
> RuntimeError: maximum recursion limit exceeded

I reported a similar issue one year ago, but no one was interested in fixing it.

http://gcc.gnu.org/ml/gcc/2010-04/msg00943.html

So I just rsynced the GCC SVN repository and installed ViewVC on my PC. It works fine.

Jie
Re: __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ or does it?
On Sat, Aug 6, 2011 at 5:40 PM, Christopher Huang-Leaver <zeong...@googlemail.com> wrote:
> Output:
> small end first
> big end first
>
> gcc -v
> gcc version 4.4.5 (Gentoo 4.4.5 p1.2, pie-0.4.5)

I got the same result with g++-4.4 (4.4.6) and g++-4.5 (4.5.3) on Debian testing. But with g++-4.6, I got

small end first

on my x86_64-linux-gnu machine. I think it's a bug, but it has been fixed in g++-4.6.

Regards,
Jie
Re: __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ or does it?
On Sat, Aug 6, 2011 at 9:35 PM, Jonathan Wakely <jwakely@gmail.com> wrote:
> On 6 August 2011 22:40, Christopher Huang-Leaver wrote:
>> Hello,
>>
>> This isn't really a compiler bug, but it's something which the manual
>> doesn't describe too well so I thought I would point this out.
>>
>> This page of the manual:
>> http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html#Common-Predefined-Macros
>
> That documentation refers to the latest sources in GCC trunk, not to
> GCC 4.4

Ha, so it's not a bug. It's a new feature, which didn't exist before 4.6.

Jie
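The test program from the original report is not quoted in this thread, so the following is only a guess at what happened and a sketch of a more defensive check. In a `#if`, an identifier that is not defined as a macro evaluates to 0, so on a compiler that does not predefine `__BYTE_ORDER__`, a test such as `__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__` becomes `0 == 0` and succeeds, and the same test against `__ORDER_BIG_ENDIAN__` succeeds too. That would explain both lines being printed by 4.4 and 4.5. The function names below are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Runtime probe: does not rely on any predefined macro, so it also works
   with compilers that predate __BYTE_ORDER__.  */
int
runtime_is_little_endian (void)
{
  const uint32_t probe = 1;
  unsigned char first_byte;

  memcpy (&first_byte, &probe, 1);
  return first_byte == 1;
}

/* Compile-time answer: 1 for little endian, 0 otherwise, and -1 when the
   macros are not available.  The defined() guard keeps an undefined macro
   from being silently treated as 0.  */
int
compile_time_is_little_endian (void)
{
#if defined (__BYTE_ORDER__) && defined (__ORDER_LITTLE_ENDIAN__)
  return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
  return -1;
#endif
}

/* When the macros exist, both answers should agree.  */
void
check_endianness_agrees (void)
{
  int known = compile_time_is_little_endian ();

  if (known != -1)
    assert (known == runtime_is_little_endian ());
}
```

The `-1` return value is just one way to report "unknown"; a real program might prefer an `#error` in the `#else` branch.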
Update my email address
Hi,

I have committed this patch to update my email address.

Jie

2011-04-21  Jie Zhang  <jzhang...@gmail.com>

        * MAINTAINERS: Update my email address.

Index: MAINTAINERS
===================================================================
--- MAINTAINERS (revision 172853)
+++ MAINTAINERS (working copy)
@@ -49,7 +49,7 @@
 avr port               Anatoly Sokolov         ae...@post.ru
 avr port               Eric Weddington         eric.wedding...@atmel.com
 bfin port              Bernd Schmidt           ber...@codesourcery.com
-bfin port              Jie Zhang               j...@codesourcery.com
+bfin port              Jie Zhang               jzhang...@gmail.com
 cris port              Hans-Peter Nilsson      h...@axis.com
 fr30 port              Nick Clifton            ni...@redhat.com
 frv port               Nick Clifton            ni...@redhat.com
Re: [PATCH] use build_function_type_list in the bfin backend
On 04/20/2011 03:24 PM, Nathan Froyd wrote:
> As $SUBJECT suggests. Tested with cross to bfin-elf. OK to commit?

OK. Thanks!

Jie

> -Nathan
>
>         * config/bfin/bfin.c (bfin_init_builtins): Call
>         build_function_type_list instead of build_function_type.
>
> diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
> index 5d08437..03a833d 100644
> --- a/gcc/config/bfin/bfin.c
> +++ b/gcc/config/bfin/bfin.c
> @@ -5967,7 +5967,7 @@ bfin_init_builtins (void)
>  {
>    tree V2HI_type_node = build_vector_type_for_mode (intHI_type_node, V2HImode);
>    tree void_ftype_void
> -    = build_function_type (void_type_node, void_list_node);
> +    = build_function_type_list (void_type_node, NULL_TREE);
>    tree short_ftype_short
>      = build_function_type_list (short_integer_type_node, short_integer_type_node,
>                                  NULL_TREE);
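The calling convention used by `build_function_type_list` in the patch above, a return type followed by argument types and a terminating `NULL_TREE`, can be illustrated with a small variadic function. This is not GCC code: `type_name` and `count_arg_types` are hypothetical names, and strings stand in for tree nodes.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC type node.  */
typedef const char *type_name;

/* Count the argument types that follow RETURN_TYPE, stopping at the NULL
   sentinel, in the same way a sentinel-terminated type list is walked.  */
size_t
count_arg_types (type_name return_type, ...)
{
  va_list ap;
  size_t count = 0;

  (void) return_type;
  va_start (ap, return_type);
  while (va_arg (ap, type_name) != NULL)
    count++;
  va_end (ap);
  return count;
}

/* Mirror the two declarations in the patch: a function taking no
   arguments, and one taking a single short.  */
void
check_count_arg_types (void)
{
  type_name void_type = "void";
  type_name short_type = "short";

  assert (count_arg_types (void_type, (type_name) NULL) == 0);
  assert (count_arg_types (short_type, short_type, (type_name) NULL) == 1);
}
```

The explicit cast on the sentinel matters in a variadic call, because a bare `NULL` is not guaranteed to be passed with pointer type.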
Re: [ARM] [3/3] Implement TARGET_BUILTIN_DECL
Thank you for reviewing, updating and committing this patch set!

Jie

On 04/18/2011 10:04 AM, Richard Earnshaw wrote:
> On Mon, 2010-10-11 at 15:44 +0800, Jie Zhang wrote:
>> This patch implements TARGET_BUILTIN_DECL for ARM. With the changes of
>> the previous two patches, this one is straightforward. Is it OK?
>
> Sorry for the long time reviewing this set of patches. I've just
> tweaked it to bring it up to the current code base and committed it.
> It's largely unchanged from your submission apart from:
>
> 1) Updates to incorporate latest changes made by Richard Sandiford.
> 2) Minor tweak to simplify the iWMMXT builtins initialization.
>
> R.
>
> 2011-04-18  Jie Zhang  <j...@codesourcery.com>
>             Richard Earnshaw  <rearn...@arm.com>
>
>         * arm.c (neon_builtin_type_bits): Remove.
>         (typedef enum neon_builtin_mode): New.
>         (T_MAX): Don't define.
>         (typedef enum neon_builtin_datum): Remove bits, codes[],
>         num_vars and base_fcode. Add mode, code and fcode.
>         (VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9,
>         VAR10): Change accordingly.
>         (neon_builtin_data[]): Change accordingly.
>         (arm_init_neon_builtins): Change accordingly.
>         (neon_builtin_compare): Remove.
>         (locate_neon_builtin_icode): Remove.
>         (arm_expand_neon_builtin): Change accordingly.
>
>         * arm.h (enum arm_builtins): Move to ...
>         * arm.c (enum arm_builtins): ... here; and rearrange builtin code.
>
>         * arm.c (arm_builtin_decl): Declare.
>         (TARGET_BUILTIN_DECL): Define.
>         (enum arm_builtins): Correct ARM_BUILTIN_MAX.
>         (arm_builtin_decls[]): New.
>         (arm_init_neon_builtins): Store builtin declarations in
>         arm_builtin_decls[].
>         (arm_init_tls_builtins): Likewise.
>         (arm_init_iwmmxt_builtins): Likewise. Refactor initialization code.
>         (arm_builtin_decl): New.
Re: Find a new maintainer for option handling?
Hi,

Any news about this?

Regards,
Jie

On Tue, Jan 25, 2011 at 2:34 AM, Joseph S. Myers <jos...@codesourcery.com> wrote:
> On Mon, 17 Jan 2011, Gerald Pfeifer wrote:
>> On Wed, 12 Jan 2011, Jie Zhang wrote:
>>> I agree. I think Joseph is the best candidate for the maintainer of
>>> the option handling since he made the most changes of
>>> gcc/opts-common.c. He is already the maintainer of the driver. If we
>>> unify these two maintainerships, we save one line of MAINTAINERS. :-)
>>
>> I am not so much concerned about that one line in MAINTAINERS, more
>> finding someone who is willing to take on the role. I, too, think
>> Joseph would be a great candidate, but it's his call whether he wants
>> to. ;-) (I'll be happy to raise it on the SC in case.)
>
> I am willing to be considered for option handling maintainership or
> reviewership.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
Re: Find a new maintainer for option handling?
Sorry, I just noticed that Joseph has been listed as the maintainer of option handling.

Jie

On 02/21/2011 11:56 PM, Jie Zhang wrote:
> Hi,
>
> Any news about this?
>
> Regards,
> Jie
>
> On Tue, Jan 25, 2011 at 2:34 AM, Joseph S. Myers <jos...@codesourcery.com> wrote:
>> On Mon, 17 Jan 2011, Gerald Pfeifer wrote:
>>> On Wed, 12 Jan 2011, Jie Zhang wrote:
>>>> I agree. I think Joseph is the best candidate for the maintainer of
>>>> the option handling since he made the most changes of
>>>> gcc/opts-common.c. He is already the maintainer of the driver. If we
>>>> unify these two maintainerships, we save one line of MAINTAINERS. :-)
>>>
>>> I am not so much concerned about that one line in MAINTAINERS, more
>>> finding someone who is willing to take on the role. I, too, think
>>> Joseph would be a great candidate, but it's his call whether he wants
>>> to. ;-) (I'll be happy to raise it on the SC in case.)
>>
>> I am willing to be considered for option handling maintainership or
>> reviewership.
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com

--
Jie Zhang
Re: Find a new maintainer for option handling?
Dear Steering Committee:

Is unifying driver and option handling maintainership a good idea?

On 01/12/2011 06:14 PM, Jie Zhang wrote:
> On 01/12/2011 06:07 PM, Richard Guenther wrote:
>> On Wed, Jan 12, 2011 at 4:10 AM, Jie Zhang <j...@codesourcery.com> wrote:
>>> Dear Steering Committee:
>>>
>>> The current listed maintainer for option handling is:
>>>
>>> option handling Neil Booth n...@daikokuya.co.uk
>>>
>>> But I'm wondering if Neil is still active. There are no replies to my
>>> recent pings from that email address. The last recorded commit from
>>> him in GCC was on 2005-01-19, which was nearly 6 years ago. So I guess
>>> he may no longer be working on GCC. If this is true, how about
>>> assigning a new maintainer for option handling?
>>
>> Option handling maintainership should be unified with driver
>> maintainership IMNSHO, as they are closely related.
>
> I agree. I think Joseph is the best candidate for the maintainer of the
> option handling since he made the most changes of gcc/opts-common.c. He
> is already the maintainer of the driver. If we unify these two
> maintainerships, we save one line of MAINTAINERS. :-)

Regards,
--
Jie Zhang
Re: Find a new maintainer for option handling?
On 01/17/2011 10:35 AM, Gerald Pfeifer wrote:
> On Wed, 12 Jan 2011, Jie Zhang wrote:
>> I agree. I think Joseph is the best candidate for the maintainer of
>> the option handling since he made the most changes of
>> gcc/opts-common.c. He is already the maintainer of the driver. If we
>> unify these two maintainerships, we save one line of MAINTAINERS. :-)
>
> I am not so much concerned about that one line in MAINTAINERS, more

Saving one line is just my stupid joke. :-P

> finding someone who is willing to take on the role. I, too, think
> Joseph would be a great candidate, but it's his call whether he wants
> to. ;-) (I'll be happy to raise it on the SC in case.)

Thanks!

--
Jie Zhang
Re: Find a new maintainer for option handling?
On 01/12/2011 06:07 PM, Richard Guenther wrote:
> On Wed, Jan 12, 2011 at 4:10 AM, Jie Zhang <j...@codesourcery.com> wrote:
>> Dear Steering Committee:
>>
>> The current listed maintainer for option handling is:
>>
>> option handling Neil Booth n...@daikokuya.co.uk
>>
>> But I'm wondering if Neil is still active. There are no replies to my
>> recent pings from that email address. The last recorded commit from
>> him in GCC was on 2005-01-19, which was nearly 6 years ago. So I guess
>> he may no longer be working on GCC. If this is true, how about
>> assigning a new maintainer for option handling?
>
> Option handling maintainership should be unified with driver
> maintainership IMNSHO, as they are closely related.

I agree. I think Joseph is the best candidate for the maintainer of the option handling since he made the most changes of gcc/opts-common.c. He is already the maintainer of the driver. If we unify these two maintainerships, we save one line of MAINTAINERS. :-)

Regards,
--
Jie Zhang
Find a new maintainer for option handling?
Dear Steering Committee:

The current listed maintainer for option handling is:

option handling Neil Booth n...@daikokuya.co.uk

But I'm wondering if Neil is still active. There are no replies to my recent pings from that email address. The last recorded commit from him in GCC was on 2005-01-19, which was nearly 6 years ago. So I guess he may no longer be working on GCC. If this is true, how about assigning a new maintainer for option handling?

Regards,
--
Jie Zhang
Re: Behavior change of driver on multiple input assembly files
On 01/04/2011 07:33 AM, Ian Lance Taylor wrote:
> On Thu, Dec 30, 2010 at 9:07 PM, Jie Zhang <j...@codesourcery.com> wrote:
>> For a minimal fix, I propose to change combinable fields of assembly
>> languages in default_compilers[] to 0. See the attached patch
>> gcc-not-combine-assembly-inputs.diff. I don't know why the combinable
>> fields were set to 1 when --combine option was introduced. There is no
>> explanation about that in that patch email.[2] Does anyone still
>> remember?
>
> This patch is OK if it fixes PR 47137. Please mention the PR in the
> ChangeLog entry.

Thanks. I have committed it now. I also posted it to gcc-patches mailing list with an updated ChangeLog entry:

http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00122.html

--
Jie Zhang
Re: Behavior change of driver on multiple input assembly files
On 12/31/2010 01:07 PM, Jie Zhang wrote:
> I just found a behavior change of driver on multiple input assembly
> files. Previously (before r164357), for the command line
>
> gcc -o t t1.s t2.s
>
> , the driver will call assembler twice, once for t1.s and once for
> t2.s. After r164357, the driver will only call assembler once for t1.s
> and t2.s. Then if t1.s and t2.s define the same symbol, assembler will
> report an error, like:
>
> t2.s: Assembler messages:
> t2.s:1: Error: symbol `.L1' is already defined
>
> I read the discussion on the mailing list starting by the patch email
> of r164357.[1] It seems that this behavior change is not the intention
> of that patch. And I think the previous behavior is more useful than
> the current behavior. So it's good to restore the previous behavior,
> isn't it?
>
> For a minimal fix, I propose to change combinable fields of assembly
> languages in default_compilers[] to 0. See the attached patch
> gcc-not-combine-assembly-inputs.diff. I don't know why the combinable
> fields were set to 1 when --combine option was introduced. There is no
> explanation about that in that patch email.[2] Does anyone still
> remember?
>
> For an aggressive fix, how about removing the combinable field from
> struct compiler? If we change combinable fields of assembly languages
> in default_compilers[] to 0, only .go and @cpp-output set combinable
> to 1. I don't see any reason for difference between @cpp-output and
> .i. So if we can set combinable to 0 for .go, we have 0 for all
> compilers in default_compilers[], thus we can remove that field. Is
> there a reason to set 1 for .go?
>
> I also attached the aggressive patch gcc-remove-combinable-field.diff.
> Neither patch is tested.
>
> Which way should we go?

The minimal fix has no regressions. But the aggressive one has a lot of regressions.

> [1] http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01322.html
> [2] http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01880.html

Regards,
--
Jie Zhang
Behavior change of driver on multiple input assembly files
I just found a behavior change of driver on multiple input assembly files. Previously (before r164357), for the command line

gcc -o t t1.s t2.s

, the driver will call assembler twice, once for t1.s and once for t2.s. After r164357, the driver will only call assembler once for t1.s and t2.s. Then if t1.s and t2.s define the same symbol, assembler will report an error, like:

t2.s: Assembler messages:
t2.s:1: Error: symbol `.L1' is already defined

I read the discussion on the mailing list starting by the patch email of r164357.[1] It seems that this behavior change is not the intention of that patch. And I think the previous behavior is more useful than the current behavior. So it's good to restore the previous behavior, isn't it?

For a minimal fix, I propose to change combinable fields of assembly languages in default_compilers[] to 0. See the attached patch gcc-not-combine-assembly-inputs.diff. I don't know why the combinable fields were set to 1 when --combine option was introduced. There is no explanation about that in that patch email.[2] Does anyone still remember?

For an aggressive fix, how about removing the combinable field from struct compiler? If we change combinable fields of assembly languages in default_compilers[] to 0, only .go and @cpp-output set combinable to 1. I don't see any reason for difference between @cpp-output and .i. So if we can set combinable to 0 for .go, we have 0 for all compilers in default_compilers[], thus we can remove that field. Is there a reason to set 1 for .go?

I also attached the aggressive patch gcc-remove-combinable-field.diff. Neither patch is tested.

Which way should we go?

[1] http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01322.html
[2] http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01880.html

Regards,
--
Jie Zhang

        * gcc.c (default_compilers[]): Set combinable field to 0
        for all assembly languages.

Index: gcc.c
===================================================================
--- gcc.c       (revision 168362)
+++ gcc.c       (working copy)
@@ -935,11 +935,11 @@ static const struct compiler default_com
   {".i", "@cpp-output", 0, 0, 0},
   {"@cpp-output",
    "%{!M:%{!MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}}}", 0, 1, 0},
-  {".s", "@assembler", 0, 1, 0},
+  {".s", "@assembler", 0, 0, 0},
   {"@assembler",
-   "%{!M:%{!MM:%{!E:%{!S:as %(asm_debug) %(asm_options) %i %A }}}}", 0, 1, 0},
-  {".sx", "@assembler-with-cpp", 0, 1, 0},
-  {".S", "@assembler-with-cpp", 0, 1, 0},
+   "%{!M:%{!MM:%{!E:%{!S:as %(asm_debug) %(asm_options) %i %A }}}}", 0, 0, 0},
+  {".sx", "@assembler-with-cpp", 0, 0, 0},
+  {".S", "@assembler-with-cpp", 0, 0, 0},
   {"@assembler-with-cpp",
 #ifdef AS_NEEDS_DASH_FOR_PIPED_INPUT
    "%(trad_capable_cpp) -lang-asm %(cpp_options) -fno-directives-only\
@@ -952,7 +952,7 @@ static const struct compiler default_com
       %{!M:%{!MM:%{!E:%{!S:-o %|.s |\n\
        as %(asm_debug) %(asm_options) %m.s %A }}}}"
 #endif
-   , 0, 1, 0},
+   , 0, 0, 0},
 
 #include "specs.h"
   /* Mark end of table.  */

Index: gcc.c
===================================================================
--- gcc.c       (revision 168362)
+++ gcc.c       (working copy)
@@ -847,8 +847,6 @@ struct compiler
   const char *cpp_spec;         /* If non-NULL, substitute this spec
                                    for `%C', rather than the usual
                                    cpp_spec.  */
-  const int combinable;          /* If nonzero, compiler can deal with
-                                    multiple source files at once (IMA).  */
   const int needs_preprocessing; /* If nonzero, source files need to
                                     be run through a preprocessor.  */
 };
@@ -876,29 +874,29 @@ static const struct compiler default_com
      were not present when we built the driver, we will hit these copies
      and be given a more meaningful error than "file not used since
      linking is not done".  */
-  {".m", "#Objective-C", 0, 0, 0}, {".mi", "#Objective-C", 0, 0, 0},
-  {".mm", "#Objective-C++", 0, 0, 0}, {".M", "#Objective-C++", 0, 0, 0},
-  {".mii", "#Objective-C++", 0, 0, 0},
-  {".cc", "#C++", 0, 0, 0}, {".cxx", "#C++", 0, 0, 0},
-  {".cpp", "#C++", 0, 0, 0}, {".cp", "#C++", 0, 0, 0},
-  {".c++", "#C++", 0, 0, 0}, {".C", "#C++", 0, 0, 0},
-  {".CPP", "#C++", 0, 0, 0}, {".ii", "#C++", 0, 0, 0},
-  {".ads", "#Ada", 0, 0, 0}, {".adb", "#Ada", 0, 0, 0},
-  {".f", "#Fortran", 0, 0, 0}, {".F", "#Fortran", 0, 0, 0},
-  {".for", "#Fortran", 0, 0, 0}, {".FOR", "#Fortran", 0, 0, 0},
-  {".ftn", "#Fortran", 0, 0, 0}, {".FTN", "#Fortran", 0, 0, 0},
-  {".fpp", "#Fortran", 0, 0, 0}, {".FPP", "#Fortran", 0, 0, 0},
-  {".f90", "#Fortran", 0, 0, 0}, {".F90", "#Fortran", 0, 0, 0},
-  {".f95", "#Fortran", 0, 0, 0}, {".F95", "#Fortran", 0, 0, 0},
-  {".f03", "#Fortran", 0, 0, 0}, {".F03", "#Fortran", 0, 0, 0},
-  {".f08", "#Fortran", 0, 0, 0}, {".F08", "#Fortran", 0, 0, 0},
-  {".r", "#Ratfor", 0, 0, 0},
-  {".p", "#Pascal", 0, 0, 0}, {".pas", "#Pascal", 0, 0, 0},
-  {".java", "#Java", 0, 0, 0}, {".class", "#Java", 0, 0, 0},
-  {".zip", "#Java", 0, 0, 0}, {".jar", "#Java", 0, 0, 0},
-  {".go", "#Go", 0, 1, 0},
+  {".m", "#Objective-C", 0, 0}, {".mi", "#Objective-C", 0, 0},
+  {".mm", "#Objective-C++", 0, 0}, {".M", "#Objective-C++", 0, 0},
+  {".mii
Re: Question on ARM legitimate address for DImode
On 12/21/2010 06:12 PM, Richard Earnshaw wrote:
> On Tue, 2010-12-21 at 12:12 +0800, Jie Zhang wrote:
>> Hi,
>>
>> While working on a bug, I found some code in ARM port that I don't
>> understand.
>>
>> In ARM_LEGITIMIZE_RELOAD_ADDRESS and arm_legitimize_address, we allow a
>> very small offset for DImode addressing.
>>
>> In ARM_LEGITIMIZE_RELOAD_ADDRESS:
>>
>>   if (MODE == DImode || (MODE == DFmode && TARGET_SOFT_FLOAT)) \
>>     low = ((val & 0xf) ^ 0x8) - 0x8; \
>>
>> In arm_legitimize_address
>>
>>   /* VFP addressing modes actually allow greater offsets, but for
>>      now we just stick with the lowest common denominator.  */
>>   if (mode == DImode
>>       || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode))
>>     {
>>       low_n = n & 0x0f;
>>       n &= ~0x0f;
>>       if (low_n > 4)
>>         {
>>           n += 16;
>>           low_n -= 16;
>>         }
>>     }
>>
>> AFAIK, we could use two LDRs, or one LDRD, or one VLDR to access DImode
>> in memory when the address is in the form of (REG + CONST_INT). The
>> offset ranges for these three cases are:
>>
>> LDR   -4095,4091
>> LDRD  -255,255
>> VLDR  -1020,1020 && (ADDR & 3) == 0
>
> The original code was designed to exploit LDM(IA,IB,DB,DA) which would
> have the offset ranges described. On earlier ARM chips (certainly up to
> and including ARM7TDMI) it was a significant win to do it that way (add
> a constant to the address register and then use LDM was faster than two
> LDR instructions). That's no longer true on modern chips, LDM is often
> slower than individual LDR insns now.

Thanks! Now I see. So I think the original code is still needed but should be used only for such earlier ARM chips. I will send the updated patch to gcc-patches mailing list.

--
Jie Zhang
Question on ARM legitimate address for DImode
Hi,

While working on a bug, I found some code in ARM port that I don't understand.

In ARM_LEGITIMIZE_RELOAD_ADDRESS and arm_legitimize_address, we allow a very small offset for DImode addressing.

In ARM_LEGITIMIZE_RELOAD_ADDRESS:

  if (MODE == DImode || (MODE == DFmode && TARGET_SOFT_FLOAT)) \
    low = ((val & 0xf) ^ 0x8) - 0x8; \

In arm_legitimize_address

  /* VFP addressing modes actually allow greater offsets, but for
     now we just stick with the lowest common denominator.  */
  if (mode == DImode
      || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode))
    {
      low_n = n & 0x0f;
      n &= ~0x0f;
      if (low_n > 4)
        {
          n += 16;
          low_n -= 16;
        }
    }

AFAIK, we could use two LDRs, or one LDRD, or one VLDR to access DImode in memory when the address is in the form of (REG + CONST_INT). The offset ranges for these three cases are:

LDR   -4095,4091
LDRD  -255,255
VLDR  -1020,1020 && (ADDR & 3) == 0

so the lowest common denominator is

-1020,1020 && (ADDR & 3) == 0  if !TARGET_LDRD
-255,255 && (ADDR & 3) == 0    if TARGET_LDRD

Both are much larger than what we have now in the ARM port. Did I miss some other cases?

Those two pieces of code are rather old (more than 15 years). The main code was added by

svn: revision 7536 by erich, Thu Jun 23 16:02:41 1994 UTC in arm.h
git: fac435147512513c1b8fa55bee061c8e3a767ba9
log: (LEGITIMIZE_ADDRESS): Push constants that will never be legitimate -- symbols and labels -- into registers. Handle DImode better.

I checked out that revision to take a look but didn't find an obvious reason for such a small index range. Did I miss something tricky?

If there is nothing I missed, I'd like to propose the attached patch.

Regards,
--
Jie Zhang

Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c    (revision 168085)
+++ config/arm/arm.c    (working copy)
@@ -6221,13 +6221,9 @@ arm_legitimize_address (rtx x, rtx orig_
           if (mode == DImode
               || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode))
             {
-              low_n = n & 0x0f;
-              n &= ~0x0f;
-              if (low_n > 4)
-                {
-                  n += 16;
-                  low_n -= 16;
-                }
+              HOST_WIDE_INT mask = (TARGET_LDRD ? 0xfc : 0x3fc);
+              low_n = (n >= 0 ? (n & mask) : -((-n) & mask));
+              n -= low_n;
             }
           else
             {
Index: config/arm/arm.h
===================================================================
--- config/arm/arm.h    (revision 168085)
+++ config/arm/arm.h    (working copy)
@@ -1283,7 +1283,12 @@ enum reg_class
           HOST_WIDE_INT low, high; \
           \
           if (MODE == DImode || (MODE == DFmode && TARGET_SOFT_FLOAT)) \
-            low = ((val & 0xf) ^ 0x8) - 0x8; \
+            { \
+              /* VFP addressing modes actually allow greater offsets, but for \
+                 now we just stick with the lowest common denominator.  */ \
+              HOST_WIDE_INT mask = (TARGET_LDRD ? 0xfc : 0x3fc); \
+              low = (val >= 0 ? (val & mask) : -((-val) & mask)); \
+            } \
           else if (TARGET_MAVERICK && TARGET_HARD_FLOAT) \
             /* Need to be careful, -256 is not a valid offset.  */ \
             low = val >= 0 ? (val & 0xff) : -((-val) & 0xff); \
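To make the arithmetic in the question above concrete, here is a small standalone sketch of the two ways of splitting an offset into a low part that stays in the address and a high part that is added to the base register. It assumes the garbled expressions read `((val & 0xf) ^ 0x8) - 0x8` and `val >= 0 ? (val & mask) : -((-val) & mask)`. The names `offset_t`, `old_low_part` and `proposed_low_part` are invented for this example, and `have_ldrd` stands in for `TARGET_LDRD`.

```c
#include <assert.h>

/* Stand-in for HOST_WIDE_INT; any signed type wide enough will do here.  */
typedef long offset_t;

/* Existing reload macro: sign-extend the low 4 bits, so the low part is
   always in the range -8..7 and the high part is a multiple of 16.  */
offset_t
old_low_part (offset_t val)
{
  return ((val & 0xf) ^ 0x8) - 0x8;
}

/* Proposed split: keep a word-aligned low part of up to 0xfc (with LDRD)
   or 0x3fc (without), preserving the sign of the original offset.  */
offset_t
proposed_low_part (offset_t val, int have_ldrd)
{
  offset_t mask = have_ldrd ? 0xfc : 0x3fc;

  return val >= 0 ? (val & mask) : -((-val) & mask);
}

/* A few worked values; high part = val - low part in each case.  */
void
check_low_parts (void)
{
  assert (old_low_part (20) == 4);               /* high part 16 */
  assert (old_low_part (28) == -4);              /* high part 32 */
  assert (old_low_part (1000) == -8);            /* high part 1008 */
  assert (proposed_low_part (1000, 0) == 1000);  /* high part 0 */
  assert (proposed_low_part (1000, 1) == 232);   /* high part 768 */
  assert (proposed_low_part (-1000, 1) == -232); /* high part -768 */
}
```

For an offset of 1000 the existing code leaves only -8 in the address, while the proposed mask lets the whole offset stay in the address when LDRD is not used, which is the difference the question is about.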
Re: Questions about selective scheduler and PowerPC
On 10/23/2010 01:50 AM, Pat Haugen wrote:
> On 10/20/2010 7:48 PM, Jie Zhang wrote:
>>> Running CPU2006, with the hack removed I see about a 1% improvement
>>> in specint (10% in 456.hmmer, a couple others in the 3% range, -3%
>>> 401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
>>> degradation in 435.gromacs). But 454.calculix also fails for me
>>> (output miscompare), so assume we're generating incorrect code for
>>> some reason with the hack removed.
>>
>> Thanks for benchmarking! Since there is a bug in max_issue,
>> issue_rate is not really honored. Could you try this patch
>>
>> http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html
>>
>> with and without the hack?
>
> With your patch applied I see pretty similar results as before, except
> for a couple additional specint benchmarks that degraded a couple
> percent with the hack removed.

Thanks for testing! Seems rs6000 port still has to keep that hack for now.

--
Jie Zhang
CodeSourcery
Re: Questions about selective scheduler and PowerPC
On 10/21/2010 04:08 AM, Pat Haugen wrote:
> On 10/18/2010 10:33 AM, Jeff Law wrote:
>> On 10/18/10 09:22, David Edelsohn wrote:
>>> On Mon, Oct 18, 2010 at 8:27 AM, Nathan Froyd froy...@codesourcery.com wrote:
>>>> On Mon, Oct 18, 2010 at 02:49:21PM +0800, Jie Zhang wrote:
>>>>> 3. The aforementioned rs6000 hack rs6000_issue_rate was added by
>>>>>
>>>>> 2003-03-03 David Edelsohn edels...@gnu.org
>>>>>
>>>>> * config/rs6000/rs6000.c (rs6000_multipass_dfa_lookahead): Delete.
>>>>> (TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Delete.
>>>>> (rs6000_variable_issue): Do not return negative value.
>>>>> (rs6000_issue_rate): Uniformly set issue rate to 1 for first scheduling pass.
>>>>>
>>>>> , which was more than 7 years ago. Is this still needed now?
>>>>
>>>> I asked David about this on IRC several days ago. He indicated that it was necessary to prevent the first scheduling pass from unnecessarily increasing register pressure. I don't know whether anybody has actually tested it with recent GCC, though presumably it did help when it was installed.
>>>
>>> I am not sure when it last was re-checked, but it was checked after sched_pressure was added. When that option is not enabled, the issue_rate change still helped.
>>
>> Did anyone check this after Bernd's work to better handle allocation of double-word pseudos in IRA? That code should be handling the false conflicts created by movement of clobbers.
>
> Running CPU2006, with the hack removed I see about a 1% improvement in specint (10% in 456.hmmer, a couple others in the 3% range, -3% 401.bzip2) and a 1% degradation in specfp (mainly due to a 13% degradation in 435.gromacs). But 454.calculix also fails for me (output miscompare), so assume we're generating incorrect code for some reason with the hack removed.

Thanks for benchmarking! Since there is a bug in max_issue, issue_rate is not really honored. Could you try this patch

http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html

with and without the hack?

Regards,
--
Jie Zhang
CodeSourcery
Re: Questions about selective scheduler and PowerPC
On 10/18/2010 03:41 PM, Andrey Belevantsev wrote:
> On 18.10.2010 11:31, Jie Zhang wrote:
>> Hi Andrey,
>>
>> On 10/18/2010 03:13 PM, Andrey Belevantsev wrote:
>>> Hi Jie,
>>>
>>> On 18.10.2010 10:49, Jie Zhang wrote:
>>>> When this error happens, FENCE_ISSUED_INSNS (fence) is 2 and issue_rate is 1. PowerPC 8540 is capable of issuing 2 instructions in one cycle, but rs6000_issue_rate lies to the scheduler that it can only issue 1 instruction before register allocation is done. See the following code:
>>>
>>> See PR 45352. I've tried to fix this in the selective scheduler by modeling the lying behavior in line with the haifa scheduler. Let me know if the last patch from the PR audit trail doesn't work for you.
>>>
>>> In addition, after the above patch goes in, I can make the selective scheduler not try to jump through the hoops with putting correct sched cycles on insns for targets which don't need it in their target_finish hook. I guess powerpc needs this though, but x86-64 (for which PR 45352 was opened) almost surely does not.
>>
>> Thanks for your reply. I just tried. That patch does not help for this issue.
>
> I see, I didn't touch the failing assert with the patch. Can you just remove the assert and see if that helps for you? I cannot think of how it can be relaxed and still be useful.

Removing the failing assert fixes the test case. But I wonder why we don't just get max_issue correct. I'm testing the attached patch.

IMHO, max_issue looks confusing.

* The concept of ISSUE POINT has never been used since the code landed in the repository.

* The comment just before the function says that MAX_POINTS is the sum of points of all instructions in READY. But that does not match the code. The code only sums the points of the first MORE_ISSUE instructions. If ISSUE_POINTS later become non-uniform, that piece of code would have to be redesigned anyway.

So I think it's good to remove it now. And top - choice_stack is a good replacement for top->n, so we can remove field n from struct choice_entry, too.

Now I'm looking at the MIPS target to find out why this change would cause PR37360:

     /* ??? We used to assert here that we never issue more insns than issue_rate.
        However, some targets (e.g. MIPS/SB1) claim lower issue rate than can be
        achieved to get better performance.  Until these targets are fixed to use
        scheduler hooks to manipulate insns priority instead, the assert should
-       be disabled.
-
-       gcc_assert (more_issue >= 0);  */
+       be disabled.  */

--
Jie Zhang
CodeSourcery

	* haifa-sched.c (ISSUE_POINTS): Remove.
	(struct choice_entry): Remove field n.
	(max_issue): Don't issue more than issue_rate instructions.

Index: haifa-sched.c
===
--- haifa-sched.c (revision 165642)
+++ haifa-sched.c (working copy)
@@ -199,10 +199,6 @@ struct common_sched_info_def *common_sch
 /* The minimal value of the INSN_TICK of an instruction.  */
 #define MIN_TICK (-max_insn_queue_index)
 
-/* Issue points are used to distinguish between instructions in max_issue ().
-   For now, all instructions are equally good.  */
-#define ISSUE_POINTS(INSN) 1
-
 /* List of important notes we must keep around.  This is a pointer to the
    last element in the list.  */
 rtx note_list;
@@ -2401,8 +2397,6 @@ struct choice_entry
   int index;
   /* The number of the rest insns whose issues we should try.  */
   int rest;
-  /* The number of issued essential insns.  */
-  int n;
   /* State after issuing the insn.  */
   state_t state;
 };
@@ -2444,8 +2438,7 @@ static int cached_issue_rate = 0;
    insns is insns with the best rank (the first insn in READY).  To
    make this function tries different samples of ready insns.  READY
    is current queue `ready'.  Global array READY_TRY reflects what
-   insns are already issued in this try.  MAX_POINTS is the sum of points
-   of all instructions in READY.  The function stops immediately,
+   insns are already issued in this try.  The function stops immediately,
    if it reached the such a solution, that all instruction can be issued.
    INDEX will contain index of the best insn in READY.  The following
    function is used only for first cycle multipass scheduling.
@@ -2458,7 +2451,7 @@ int
 max_issue (struct ready_list *ready, int privileged_n, state_t state,
            int *index)
 {
-  int n, i, all, n_ready, best, delay, tries_num, max_points;
+  int i, all, n_ready, best, delay, tries_num;
   int more_issue;
   struct choice_entry *top;
   rtx insn;
@@ -2477,25 +2470,15 @@ max_issue (struct ready_list *ready, int
     }
 
   /* Init max_points.  */
-  max_points = 0;
   more_issue = issue_rate - cycle_issued_insns;
 
   /* ??? We used to assert here that we never issue more insns than issue_rate.
      However, some targets (e.g. MIPS/SB1) claim lower issue rate than can be
      achieved to get better performance.  Until these targets are fixed to use
      scheduler hooks to manipulate insns priority instead
Re: Questions about selective scheduler and PowerPC
On 10/19/2010 10:16 PM, Andrey Belevantsev wrote:
> On 19.10.2010 17:57, Jie Zhang wrote:
>> Removing the failing assert fixes the test case. But I wonder why not just get max_issue correct. I'm testing the attached patch.
>>
>> IMHO, max_issue looks confusing.
>>
>> * The concept of ISSUE POINT has never been used since the code landed in repository.
>>
>> * In the comment just before the function, it's mentioned that MAX_POINTS is the sum of points of all instructions in READY. But it does not match the code. The code only summarizes the points of the first MORE_ISSUE instructions. If later ISSUE_POINTS become not uniform, that piece of code should be redesigned.
>>
>> So I think it's good to remove it now. And top - choice_stack is a good replacement for top->n. So we can remove field n from struct choice_entry, too.
>>
>> Now I'm looking at MIPS target to find out why this change in the would cause PR37360.
>
> I agree that ISSUE_POINTS can be removed, as it was not used (maybe Maxim can comment more on this). However, the assert is not about the points but exactly about the situation when a target is lying to the compiler about its issue rate. The ideal situation is that we agree on that this should never happen, but then you need to fix all targets that use this trick, and it seems that there is at least mips, ppc, and x86-64 (which is why I pointed you to 45352). The fix would be to find out why claiming the true issue rate degrades performance and to implement the proper scheduling hooks for changing priority of some insns, or to enable -fsched-pressure for the offending targets.

I agree. But I still have a question about TARGET_SCHED_ISSUE_RATE. According to my understanding of gccint:

[quote]
Target Hook: int TARGET_SCHED_ISSUE_RATE (void)
[snip] Although the insn scheduler can define itself the possibility of issue an insn on the same cycle, the value can serve as an additional constraint to issue insns on the same simulated processor cycle [snip]
[/quote]

it should be allowed to be defined smaller than the issue rate defined by the scheduler DFA. So even if the backend defines a DFA which is capable of issuing 4 instructions in one cycle but also defines TARGET_SCHED_ISSUE_RATE to 3, the scheduler should restrict the number of instructions issued in one cycle to 3 instead of 4. So I think this assert should hold even if the backend lies to the scheduler about the issue rate. Fixing the lies is another problem.

With the attached draft patch, we can enable the assert in max_issue without regression on PR37360.

> This is a lot of work, which is why this assert was installed in max_issue for relatively short amount of time. Maybe it's time to try again, but let's have a consensus first that this assert should never trigger by design and we have enough flexibility in the scheduler to provide legal means to achieve the same performance effect.

Agree.

Regards,
--
Jie Zhang
CodeSourcery

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index b13d648..7653941 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -589,6 +589,10 @@ static const char *mips_hi_relocs[NUM_SYMBOL_TYPES];
 /* Target state for MIPS16.  */
 struct target_globals *mips16_globals;
 
+/* Cached value of can_issue_more.  This is cached in mips_variable_issue hook
+   and returned from mips_sched_reorder2.  */
+static int cached_can_issue_more;
+
 /* Index R is the smallest register class that contains register R.  */
 const enum reg_class mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   LEA_REGS, LEA_REGS, M16_REGS, V1_REG,
@@ -12439,8 +12443,8 @@ mips_sched_init (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
 /* Implement TARGET_SCHED_REORDER and TARGET_SCHED_REORDER2.  */
 
 static int
-mips_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
-                    rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+mips_sched_reorder_1 (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+                      rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
 {
   if (!reload_completed
       && TUNE_MACC_CHAINS
@@ -12455,10 +12459,25 @@ mips_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
 
   if (TUNE_74K)
     mips_74k_agen_reorder (ready, *nreadyp);
+}
+
+static int
+mips_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+                    rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+{
+  mips_sched_reorder_1 (file, verbose, ready, nreadyp, cycle);
 
   return mips_issue_rate ();
 }
 
+static int
+mips_sched_reorder2 (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+                     rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+{
+  mips_sched_reorder_1 (file, verbose, ready, nreadyp, cycle);
+  return cached_can_issue_more;
+}
+
 /* Update round-robin counters for ALU1/2 and FALU1/2.  */
 
 static void
@@ -12516,6 +12535,7 @@ mips_variable_issue (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
       || recog_memoized (insn
Questions about selective scheduler and PowerPC
Hi,

I'm investigating a GCC testsuite FAIL on PowerPC with the e500 multilib. The test is pr42245.c, which sets options to -O2 -fselective-scheduling -fsel-sched-pipelining.

$ ./cc1 -quiet pr42245.c -mcpu=8540 -mfloat-gprs=single -O2 -fselective-scheduling
pr42245.c: In function ‘build_DIS_CON_tree’:
pr42245.c:29:1: internal compiler error: in advance_state_on_fence, at sel-sched.c:5288

The code around sel-sched.c:5288 looks like:

5265 static bool
5266 advance_state_on_fence (fence_t fence, insn_t insn)
5267 {
5268   bool asm_p;
5269
5270   if (recog_memoized (insn) >= 0)
5271     {
5272       int res;
5273       state_t temp_state = alloca (dfa_state_size);
5274
5275       gcc_assert (!INSN_ASM_P (insn));
5276       asm_p = false;
5277
5278       memcpy (temp_state, FENCE_STATE (fence), dfa_state_size);
5279       res = state_transition (FENCE_STATE (fence), insn);
5280       gcc_assert (res < 0);
5281
5282       if (memcmp (temp_state, FENCE_STATE (fence), dfa_state_size))
5283         {
5284           FENCE_ISSUED_INSNS (fence)++;
5285
5286           /* We should never issue more than issue_rate insns.  */
5287           if (FENCE_ISSUED_INSNS (fence) > issue_rate)
5288             gcc_unreachable ();
5289         }
5290     }
5291   else

When this error happens, FENCE_ISSUED_INSNS (fence) is 2 and issue_rate is 1. PowerPC 8540 is capable of issuing 2 instructions in one cycle, but rs6000_issue_rate lies to the scheduler that it can only issue 1 instruction before register allocation is done. See the following code:

23205 static int
23206 rs6000_issue_rate (void)
23207 {
23208   /* Unless scheduling for register pressure, use issue rate of 1 for
23209      first scheduling pass to decrease degradation.  */
23210   if (!reload_completed && !flag_sched_pressure)
23211     return 1;
23212
23213   switch (rs6000_cpu_attr) {
[snip]
23223   case CPU_PPC8540:
[snip]
23230     return 2;

This issue can be traced down to haifa-sched.c:max_issue (), which returns 2 even though issue_rate is 1.

So my questions and possible ways to fix it are:

1. Should we restrict max_issue to only return a value less than or equal to issue_rate?

2. Should we do the same as what SMS does? See

static void
sms_schedule (void)
{
[snip]
  /* Initialize issue_rate.  */
  if (targetm.sched.issue_rate)
    {
      int temp = reload_completed;

      reload_completed = 1;
      issue_rate = targetm.sched.issue_rate ();
      reload_completed = temp;
    }
  else
    issue_rate = 1;
[snip]
}

I suspect this piece of code in sms_schedule was written for rs6000, but it came in with the first commit of the SMS merge and there is no patch email explaining it.

3. The aforementioned rs6000 hack rs6000_issue_rate was added by

2003-03-03 David Edelsohn edels...@gnu.org

	* config/rs6000/rs6000.c (rs6000_multipass_dfa_lookahead): Delete.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Delete.
	(rs6000_variable_issue): Do not return negative value.
	(rs6000_issue_rate): Uniformly set issue rate to 1 for first scheduling pass.

, which was more than 7 years ago. Is this still needed now?

Any one of the above three ways can fix the FAIL. But I'm not sure which way is best, or maybe we should do 1 and 3 and remove the hack in 2?

Thoughts?

Regards,
--
Jie Zhang
CodeSourcery
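For readers unfamiliar with the trick in option 2, the following is a toy model of it in standalone C, not GCC code. It assumes a hook that under-reports the issue rate before reload, and shows how temporarily setting the flag recovers the real value. The globals only mimic the GCC variables of the same names, and toy_issue_rate and true_issue_rate are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the GCC globals of the same names.  */
bool reload_completed = false;
bool flag_sched_pressure = false;

/* Models a target hook for a dual-issue CPU that claims an issue rate
   of 1 during the first scheduling pass, in the style of
   rs6000_issue_rate.  */
int toy_issue_rate (void)
{
  if (!reload_completed && !flag_sched_pressure)
    return 1;
  return 2;
}

/* Ask the hook for its post-reload answer, then restore the flag so
   the rest of the compiler sees the original state.  This mirrors the
   save/set/restore sequence quoted from sms_schedule.  */
int true_issue_rate (void)
{
  bool saved = reload_completed;
  int rate;

  reload_completed = true;
  rate = toy_issue_rate ();
  reload_completed = saved;

  assert (rate >= 1);
  return rate;
}
```

In this model the hook keeps answering 1 before reload, while true_issue_rate reports 2 without leaving reload_completed changed.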
Re: Questions about selective scheduler and PowerPC
Hi Andrey,

On 10/18/2010 03:13 PM, Andrey Belevantsev wrote:
> Hi Jie,
>
> On 18.10.2010 10:49, Jie Zhang wrote:
>> When this error happens, FENCE_ISSUED_INSNS (fence) is 2 and issue_rate is 1. PowerPC 8540 is capable of issuing 2 instructions in one cycle, but rs6000_issue_rate lies to the scheduler that it can only issue 1 instruction before register allocation is done. See the following code:
>
> See PR 45352. I've tried to fix this in the selective scheduler by modeling the lying behavior in line with the haifa scheduler. Let me know if the last patch from the PR audit trail doesn't work for you.
>
> In addition, after the above patch goes in, I can make the selective scheduler not try to jump through the hoops with putting correct sched cycles on insns for targets which don't need it in their target_finish hook. I guess powerpc needs this though, but x86-64 (for which PR 45352 was opened) almost surely does not.

Thanks for your reply. I just tried it. That patch does not help with this issue.

--
Jie Zhang
CodeSourcery
gcc.dg/graphite/interchange-9.c and small memory target
Hi Sebastian,

I recently encountered an issue when testing gcc.dg/graphite/interchange-9.c on an ARM bare-metal board which has only 4MB of memory. Apparently, with

#define N
#define M

int A[N*M] in main is too large to fit on the stack. There are several ways to solve this issue:

1. Make this test a compile test instead of a run test.

2. Define both M and N to 111. I checked and the test is still valid, i.e. it still tests what is intended.

3. Use the STACK_SIZE macro to calculate M and N. But I don't know how to do that. And I'm not sure the test would still be valid if we ended up with a very small M and N.

Which way do you like most?

Regards,
--
Jie Zhang
CodeSourcery
Re: gcc.dg/graphite/interchange-9.c and small memory target
On 08/11/2010 11:47 PM, Sebastian Pop wrote:
> On Wed, Aug 11, 2010 at 10:29, Jie Zhang j...@codesourcery.com wrote:
>> Hi Sebastian,
>>
>> I currently encountered an issue when testing gcc.dg/graphite/interchange-9.c on a ARM bare-metal board which has only 4MB memory. Apparently, with
>>
>> #define N
>> #define M
>>
>> int A[N*M] in main is too large to fit in stack. There are several ways to solve this issue:
>>
>> 1. Make this test a compile test instead of a run test.
>>
>> 2. Define both M and N to 111. I checked and the test is still valid, ie it still tests what is intended.
>>
>> 3. Use STACK_SIZE macro to calculate M and N. But I don't know how to do that. And I'm not sure if we got a very small M and N, the test will be still valid.
>>
>> Which way do you like most?
>
> I would say, let's go for solution 2. I don't like the first solution as you want to also validate that the transform is correct. As for solution 3, I do not know either how to do that. I will keep in mind these limitations for the future testcases.

Thanks. I will submit a patch for solution 2.

--
Jie Zhang
CodeSourcery
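Since neither participant knew how solution 3 might look, here is one possible shape for it, offered only as a sketch. It assumes the test harness defines STACK_SIZE as a byte count on small-stack targets. BIG_DIM is a made-up placeholder because the test's original N and M are not shown in this thread, and DIM_FOR_STACK and ARRAY_BYTES are hypothetical helper macros.

```c
#include <assert.h>

/* BIG_DIM is a made-up placeholder: the original values of N and M are
   not shown in the thread.  SMALL_DIM is the value from solution 2.  */
#define BIG_DIM 1000
#define SMALL_DIM 111

static_assert (SMALL_DIM < BIG_DIM, "the fallback must be the smaller size");

/* Bytes needed for int A[dim * dim].  */
#define ARRAY_BYTES(dim) ((unsigned long) (dim) * (dim) * sizeof (int))

/* Use the big dimension only if the array takes at most half of a
   stack of STACK_BYTES bytes; otherwise fall back to the small one.  */
#define DIM_FOR_STACK(stack_bytes) \
  (ARRAY_BYTES (BIG_DIM) <= (unsigned long) (stack_bytes) / 2 \
   ? BIG_DIM : SMALL_DIM)

/* Assumption: STACK_SIZE, when defined, is the stack size in bytes.  */
#ifdef STACK_SIZE
# define N DIM_FOR_STACK (STACK_SIZE)
#else
# define N BIG_DIM
#endif
#define M N
```

Both branches of the conditional are integer constant expressions, so N and M remain usable in an array declaration such as int A[N*M]. The sketch sidesteps the validity concern raised above by never going below 111, the size already checked by hand.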
Re: GCC Bugzilla is broken now
On 07/13/2010 11:13 AM, Jie Zhang wrote:
> I got this when trying to access
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44921
>
> Software error:
>
> Can't rename data/versioncache.Xg5KN to versioncache at globals.pl line 306.
>
> For help, please send mail to the webmaster (sourcemas...@sourceware.org), giving this error message and the time and date of the error.

It has recovered now. Thanks!

--
Jie Zhang
CodeSourcery
GCC Bugzilla is broken now
I got this when trying to access http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44921 Software error: Can't rename data/versioncache.Xg5KN to versioncache at globals.pl line 306. For help, please send mail to the webmaster (sourcemas...@sourceware.org), giving this error message and the time and date of the error. It was OK just two or three hours ago. -- Jie Zhang CodeSourcery
Re: A question about patch submission
On 07/13/2010 11:56 AM, Mingming Sun wrote:
> Hi,
>
> I want to submit a patch about loongson 3A, a new architecture different from loongson 2E 2F. My patch is based on Gcc4.4.0. If I want to submit my patch, which branch should I submit to, gcc4.4.0 branch or I should change the patch to suite with the main branch.

You need to port your patch to SVN trunk before submitting it.

--
Jie Zhang
CodeSourcery
Re: A question about patch submission
On 07/13/2010 12:20 PM, Mingming Sun wrote:
> On Tue, Jul 13, 2010 at 12:17 PM, Jie Zhang j...@codesourcery.com wrote:
>> On 07/13/2010 11:56 AM, Mingming Sun wrote:
>>> I want to submit a patch about loongson 3A, a new architecture different from loongson 2E 2F. My patch is based on Gcc4.4.0. If I want to submit my patch, which branch should I submit to, gcc4.4.0 branch or I should change the patch to suite with the main branch.
>>
>> Yes. You need port your patch to SVN trunk before submit it.
>
> Do you mean I must submit it to the main branch?

Please don't top-post. (I have moved your reply down.)

I'm not sure I understand you. The GCC source code is maintained in an SVN repository, which has a trunk and many branches. The trunk is where the main development happens. gcc4.4.0 is not a branch. It is a release, which was created from the branch gcc-4_4-branch. When you submit a patch like this, you need to create your patch against SVN trunk rather than any branch, tag, or release tarball.

--
Jie Zhang
CodeSourcery
Re: complete list of emulated TLS targets.
On Thu, Jul 8, 2010 at 9:28 PM, Bernd Schmidt ber...@codesourcery.com wrote:
> On 07/06/2010 10:39 PM, IainS wrote:
>> I'd like to compile a complete list of targets affected by changes in emulated TLS.
>>
>> *-*-darwin*
>> hppa64-hp-hpux11.11
>> cris-*-elf
>>
>> I think also;
>> *-*-mingw
>> *-*-cygwin
>>
>> could people please add to the list/confirm as appropriate?
>
> I'm pretty sure bfin* is on the list.

Yes. bfin-uclinux and bfin-linux-uclibc.

Jie
Question on REG_EQUAL documentation
The GCC internals documentation says [1]:

[quote]
In the early stages of register allocation, a REG_EQUAL note is changed into a REG_EQUIV note if op is a constant and the insn represents the only set of its destination register.

Thus, compiler passes prior to register allocation need only check for REG_EQUAL notes and passes subsequent to register allocation need only check for REG_EQUIV notes.
[/quote]

But I still find REG_EQUAL notes in RTL dumps for passes after IRA. My understanding is: a REG_EQUAL note is changed into a REG_EQUIV note in IRA when possible, but the remaining REG_EQUAL notes are still kept around. So compiler passes after register allocation need to check for both REG_EQUIV notes and REG_EQUAL notes. Is my understanding correct?

[1] http://gcc.gnu.org/onlinedocs/gccint/Insns.html#index-REG_005fEQUIV-2258

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
GCC viewcvs issue
This URL

http://gcc.gnu.org/viewcvs/branches/gcc-4_4-branch/gcc/tree-ssa-alias.c?annotate=155646

which tries to annotate the latest revision of tree-ssa-alias.c on the 4.4 branch gives

An Exception Has Occurred

Python Traceback

Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 4317, in main
    request.run_viewvc()
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 397, in run_viewvc
    self.view_func(self)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1769, in view_annotate
    markup_or_annotate(request, 1)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1696, in markup_or_annotate
    path[-1], mime_type)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1589, in markup_stream_pygments
    encoding='utf-8'), ps)
  File "/usr/lib/python2.3/site-packages/pygments/__init__.py", line 85, in highlight
    return format(lex(code, lexer), formatter, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/__init__.py", line 68, in format
    formatter.format(tokens, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/formatter.py", line 92, in format
    return self.format_unencoded(tokensource, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/formatters/html.py", line 704, in format_unencoded
    for t, piece in source:
  File "/usr/lib/python2.3/site-packages/pygments/formatters/html.py", line 611, in _format_lines
    for ttype, value in tokensource:
  File "/usr/lib/python2.3/site-packages/pygments/lexer.py", line 162, in streamer
    for i, t, v in self.get_tokens_unprocessed(text):
  File "/usr/lib/python2.3/site-packages/pygments/lexers/compiled.py", line 155, in get_tokens_unprocessed
    for index, token, value in \
  File "/usr/lib/python2.3/site-packages/pygments/lexer.py", line 479, in get_tokens_unprocessed
    m = rexmatch(text, pos)
RuntimeError: maximum recursion limit exceeded

The 4.3 branch has a similar issue. Trunk, 4.2 and 4.1 are OK.

Regards,
--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Re: gcc- 4.6.0 20100416 rtmutex.c:1138:1: internal compiler error
On 04/19/2010 02:43 PM, Justin P. Mattock wrote:
> I couldn't resist..(had to play), anyways I looked through the reports but didn't see anything that was familiar. so I went and created an entry:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43791

Thanks. Please add a preprocessed source file so people can reproduce your issue.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Re: gcc- 4.6.0 20100416 rtmutex.c:1138:1: internal compiler error
On 04/19/2010 12:19 PM, Justin Mattock wrote:
> so far I've compiled most of the system (glibc,Xserver,etc..) and not really anything has crashed and burned except for the kernel:
>
> kernel/rtmutex.c: At top level:
> kernel/rtmutex.c:1138:1: internal compiler error: in cgraph_decide_inlining_of_small_functions, at ipa-inline.c:1009
> Please submit a full bug report, with preprocessed source if appropriate.
> See <http://gcc.gnu.org/bugs.html> for instructions.
> make[1]: *** [kernel/rtmutex.o] Error 1
> make: *** [kernel] Error 2
>
> any reports of this/ideas?

Please report a bug with a preprocessed source file in GCC's bugzilla.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Re: Ask for suggestions on init_caller_save
On 03/30/2010 12:11 AM, Jeff Law wrote:
> On 03/23/10 21:30, Jie Zhang wrote:
>> I'm fixing a bug. It's caused by uninitialized caller save pass data. One function in the test case uses the optimize attribute with O2 option. So even with -O0 in command line, GCC calls caller save pass for that function. The problem is init_caller_save is called in backend_init_target if flag_caller_saves is set. Apparently, in this case, flag_caller_saves is not set when came to backend_init_target. I think there are several ways to fix this bug, but I don't know which way should/can I go:
>>
>> 1. Always call init_caller_save in backend_init_target. But it seems a waste for most cases if -O0.
>>
>> 2. Call init_caller_save in IRA main function. But by this way it will be called multiple times unless we create a flag to remember if it has been called or not. Maybe we can reuse test_reg or test_mem. If they are NULL_TREE, just call init_caller_save.
>>
>> 3. Call init_caller_save in handle_optimize_attribute. If flag_caller_saves is not set before parse_optimize_options but set after, call init_caller_save. Considering there might be multiple functions using optimize attribute, we also need a flag to remember if init_caller_save has been called or not.
>>
>> 4. There are only three global function in caller-save.c: init_save_areas, setup_save_areas, and save_call_clobbered_regs. We can just add a check in the beginning of those functions. If the data has not been initialized, just init_caller_save first.
>>
>> Any suggestions?
>
> I'd suggest #2 with a status flag indicating whether or not caller-saves has been initialized. That should be low enough overhead to not be a problem.

Thanks. I will send a patch to gcc-patches and CC you.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
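The suggestion of option 2 plus a status flag is a plain lazy-initialization guard. The sketch below shows the pattern in standalone C rather than in GCC itself: toy_init_caller_save stands in for the real init_caller_save, whose body is not shown in this thread, and caller_save_initialized_p, ensure_caller_save_initialized, and init_calls are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many times the expensive setup ran.  Only here so the
   behavior of the guard can be observed.  */
int init_calls = 0;

/* Status flag: has the caller-save data been set up yet?  */
bool caller_save_initialized_p = false;

/* Stand-in for the real init_caller_save.  */
void toy_init_caller_save (void)
{
  init_calls++;
}

/* To be called on entry to the pass that needs the data (the register
   allocator's main function in option 2).  The setup runs at most
   once, no matter how many functions are compiled.  */
void ensure_caller_save_initialized (void)
{
  if (!caller_save_initialized_p)
    {
      toy_init_caller_save ();
      caller_save_initialized_p = true;
    }
  assert (caller_save_initialized_p);
}
```

The per-call cost after the first call is a single flag test, which matches the "low enough overhead" expectation in the reply.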
Question about gen_rtx_VAR_LOCATION
There are two calls of gen_rtx_VAR_LOCATION in cfgexpand.c. Both calls cast a tree to rtx as the third argument. Why is a tree used in an RTL expression? Will it be transformed into RTL later, or should all RTL passes recognize that it's a tree and just ignore it?

Thanks.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Re: Question about gen_rtx_VAR_LOCATION
On 03/26/2010 11:36 PM, Jakub Jelinek wrote:
> On Fri, Mar 26, 2010 at 11:27:24PM +0800, Jie Zhang wrote:
>> There are two calls of gen_rtx_VAR_LOCATION in cfgexpand.c. Both calls cast a tree to rtx as the third argument. Why a tree is used in RTL expression? Will it be transformed into RTL later or all RTL passes should recognize it's a tree and just ignore it? Thanks.
>
> Yes, it is just temporary. The tree survives there just from the calls from within expand_gimple_basic_block until the immediately following expand_debug_locations call.

Hmmm, I found a case where gen_rtx_VAR_LOCATION is called but the tree in VAR_LOCATION does not get expanded. It's related to the handling of the optimize attribute. I will start a new thread for that.

Thanks.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Question about RTL code hoisting
I just found that the current RTL code hoisting cannot optimize

    REG = ...
    if (cond)
      {
        r0 = REG;
      }
    else
      {
        r0 = REG;
        ...
      }

to

    REG = ...
    r0 = REG;
    if (cond)
      {
      }
    else
      {
        ...
      }

where REG is a pseudo register and r0 is a physical register. I have looked at the code of the RTL hoisting pass, but I cannot find a simple way to extend it to deal with this case. And the hoisting pass is only enabled with -Os. So I'm going to implement another hoisting pass to do this optimization. Is it a good idea? Does anyone know of an existing pass which should have handled this case, or could easily be adapted to handle it?

Thanks!

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Re: Question about RTL code hoisting
On 03/25/2010 11:22 PM, Jeff Law wrote:
> On 03/25/10 09:14, Bernd Schmidt wrote:
>> On 03/25/2010 04:03 PM, Jie Zhang wrote:
>>> I just found that the current RTL code hoisting cannot optimize
>>>
>>>     REG = ...
>>>     if (cond)
>>>       {
>>>         r0 = REG;
>>>       }
>>>     else
>>>       {
>>>         r0 = REG;
>>>         ...
>>>       }
>>>
>>> to
>>>
>>>     REG = ...
>>>     r0 = REG;
>>>     if (cond)
>>>       {
>>>       }
>>>     else
>>>       {
>>>         ...
>>>       }
>>>
>>> where REG is a pseudo register and r0 is a physical register. I have looked at the code of RTL hoisting pass. But I cannot find a simple way to extend it to deal with this case. And the hoisting pass is only enabled when -Os. So I'm going to implement another hoisting pass to do this optimization. Is it a good idea? Does anyone know if there is an existing pass which should have handled or be able to be easily adapted to handle this case? Thanks!
>>
>> Isn't this similar to crossjumping, except done in the other direction?
>
> Yes, though the implementation is completely different. Hoisting computes which blocks compute specific expressions, then determines if all paths from a point compute the same expression and if so, moves the multiple computations to a single location. cross jumping works by matching RTL bits at the end of blocks.
>
> I never bothered to implement hoisting which touched hard regs -- I never thought the cost/benefit analysis made much sense. It's quite a bit more work to implement and code motion of hard regs is much more restricted than code motion involving just pseudos.

Thanks Bernd and Jeff. This case is not common. I'm wondering how likely it is that this kind of optimization pass would be accepted into GCC. Another way to fix it is teaching the register allocator to allocate r0 to REG. But I guess that's more difficult.

Jie
Re: Question about RTL code hoisting
On 03/25/2010 11:24 PM, Steven Bosscher wrote:
> On Thu, Mar 25, 2010 at 4:03 PM, Jie Zhang j...@codesourcery.com wrote:
>> I just found that the current RTL code hoisting cannot optimize
>>
>>     REG = ...
>>     if (cond)
>>       {
>>         r0 = REG;
>>       }
>>     else
>>       {
>>         r0 = REG;
>>         ...
>>       }
>>
>> to
>>
>>     REG = ...
>>     r0 = REG;
>>     if (cond)
>>       {
>>       }
>>     else
>>       {
>>         ...
>>       }
>>
>> where REG is a pseudo register and r0 is a physical register. I have looked at the code of RTL hoisting pass. But I cannot find a simple way to extend it to deal with this case.
>
> Right, there are two issues:
> * HOIST doesn't handle hard registers
> * HOIST doesn't hoist reg-reg moves
>
> There is no easy way to add cost metrics to hoist reg-reg moves, and handling hard regs is an even bigger problem. What is the original code? I (well, by now: we) have patches in the works for GCC 4.6 that add code hoisting to GIMPLE (see PR23286), perhaps that solves this case for you.

I'm not sure if I can make the test case public. I need to ask. I'm afraid GIMPLE cannot help with this. r0 is here because it's used for passing an argument to callees. This issue is only exposed after expansion to RTL.

>> And the hoisting pass is only enabled when -Os. So I'm going to implement another hoisting pass to do this optimization. Is it a good idea?
>
> To add duplicate functionality? No.

I'm not going to add duplicate functionality. What I'm going to do is handle only the hard-reg = pseudo-reg case.

>> Does anyone know if there is an existing pass which should have handled or be able to be easily adapted to handle this case?
>
> Hoisting should handle it, bui
>
> Can you open a new PR and make it block PR33828, please?

If I can publish the test case, yes. Otherwise I will have to write a new test case.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Ask for suggestions on init_caller_save
I'm fixing a bug. It's caused by uninitialized caller-save pass data. One function in the test case uses the optimize attribute with the O2 option. So even with -O0 on the command line, GCC runs the caller-save pass for that function. The problem is that init_caller_save is called in backend_init_target only if flag_caller_saves is set. Apparently, in this case, flag_caller_saves is not yet set when we get to backend_init_target. I think there are several ways to fix this bug, but I don't know which way I should take:

1. Always call init_caller_save in backend_init_target. But that seems wasteful in most cases at -O0.

2. Call init_caller_save in the IRA main function. But this way it will be called multiple times unless we create a flag to remember whether it has been called. Maybe we can reuse test_reg or test_mem: if they are NULL_TREE, just call init_caller_save.

3. Call init_caller_save in handle_optimize_attribute. If flag_caller_saves is not set before parse_optimize_options but is set after, call init_caller_save. Considering there might be multiple functions using the optimize attribute, we also need a flag to remember whether init_caller_save has been called.

4. There are only three global functions in caller-save.c: init_save_areas, setup_save_areas, and save_call_clobbered_regs. We can just add a check at the beginning of those functions. If the data has not been initialized, call init_caller_save first.

Any suggestions?

Thanks in advance.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735
Question about removing multiple elements from VEC
Hi, I'm looking at this FIXME in cp/typeck2.c. /* FIXME: Ordered removal is O(1) so the whole function is worst-case quadratic. This could be fixed using an aside bitmap to record which elements must be removed and remove them all at the same time. Or by merging split_non_constant_init into process_init_constructor_array, that is separating constants from non-constants while building the vector. */ VEC_ordered_remove (constructor_elt, CONSTRUCTOR_ELTS (init), idx); It seems there is no VEC function that can use a bitmap to do an ordered removal of multiple elements. Did I miss something, or do I have to write one? Regards, -- Jie Zhang CodeSourcery (650) 331-3385 x735
Re: Question about removing multiple elements from VEC
On 03/17/2010 12:08 AM, Richard Guenther wrote: On Tue, Mar 16, 2010 at 5:02 PM, Jie Zhangj...@codesourcery.com wrote: Hi, I'm looking at this FIXME in cp/typeck2.c. /* FIXME: Ordered removal is O(1) so the whole function is worst-case quadratic. This could be fixed using an aside bitmap to record which elements must be removed and remove them all at the same time. Or by merging split_non_constant_init into process_init_constructor_array, that is separating constants from non-constants while building the vector. */ VEC_ordered_remove (constructor_elt, CONSTRUCTOR_ELTS (init), idx); It seems there is no VEC function that can use a bitmap to do an ordered removal of multiple elements. Did I miss something, or do I have to write one? You have to write one. Thanks! -- Jie Zhang CodeSourcery (650) 331-3385 x735
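The "aside bitmap" idea from the FIXME can be sketched on a plain C array. This is not the VEC API; `ordered_remove_marked` and its parameters are hypothetical. Each single ordered removal has to shift the tail of the vector, which is what makes a loop of removals quadratic in the worst case, whereas one compaction pass over a mark array is linear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Remove every element elts[i] for which remove[i] is true, keeping the
   survivors in their original order.  Returns the new length.  One pass,
   O(n) regardless of how many elements are marked.  */
size_t ordered_remove_marked(int *elts, const bool *remove, size_t n)
{
    size_t out = 0;

    assert(n == 0 || (elts != NULL && remove != NULL));
    for (size_t i = 0; i < n; i++)
        if (!remove[i])
            elts[out++] = elts[i];   /* out <= i, so this never overwrites unread data */
    return out;
}
```

A caller would first walk the vector once to fill the mark array, then call the function once, instead of removing elements one at a time.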
IRA conflict graph, multiple alternatives and commutative operands
I'm looking at PR 42258. I have a question on the IRA conflict graph and multiple alternatives. Below is an RTL insn just before the register allocation pass: (insn 7 6 12 2 pr42258.c:2 (set (reg:SI 136) (mult:SI (reg:SI 137) (reg/v:SI 135 [ x ]))) 33 {*thumb_mulsi3}) IRA generates the following conflict graph for r135, r136 and r137: ;; a0(r136,l0) conflicts: a2(r137,l0) a1(r135,l0) ;; total conflict hard regs: ;; conflict hard regs: ;; a1(r135,l0) conflicts: a0(r136,l0) a2(r137,l0) ;; total conflict hard regs: ;; conflict hard regs: ;; a2(r137,l0) conflicts: a0(r136,l0) a1(r135,l0) ;; total conflict hard regs: ;; conflict hard regs: regions=1, blocks=3, points=5 allocnos=3, copies=0, conflicts=0, ranges=3 Apparently this conflict graph is not optimal for any of the three alternatives in the following instruction pattern: (define_insn "*thumb_mulsi3" [(set (match_operand:SI 0 "register_operand" "=&l,&l,&l") (mult:SI (match_operand:SI 1 "register_operand" "%l,*h,0") (match_operand:SI 2 "register_operand" "l,l,l")))] ...) This conflict graph looks like a merge of the conflict graphs of the three alternatives. Ideally, for the first and second alternatives, we should have ;; a0(r136,l0) conflicts: a2(r137,l0) a1(r135,l0) ;; a1(r135,l0) conflicts: a0(r136,l0) ;; a2(r137,l0) conflicts: a0(r136,l0) For the third alternative, we'd better have ;; a0(r136,l0) conflicts: a1(r135,l0) ;; a1(r135,l0) conflicts: a0(r136,l0) cp0:a0(r136)-a2(r137)@1000:constraint And the register allocator would use one of these more specific conflict graphs for coloring. If we take the commutative operands into account, we have to add the following conflict graph to choose from. ;; a0(r136,l0) conflicts: a2(r137,l0) ;; a2(r137,l0) conflicts: a0(r136,l0) cp0:a0(r136)-a1(r135)@1000:constraint (Actually, this conflict graph will give an optimal result for the test case in PR 42258.) Now the problem is when and how to choose the alternative for which the register allocator calculates the conflict graph. 
Yes, I have read the thread: http://gcc.gnu.org/ml/gcc/2009-02/msg00215.html This does not seem to be an easy question. So is there any practical method to make the register allocator pick the third alternative and do the commutation before or during register allocation? Thanks, -- Jie Zhang CodeSourcery (650) 331-3385 x735
Re: No integral promotions when calling library function?
On Fri, Feb 19, 2010 at 12:03 AM, Dave Korn dave.korn.cyg...@googlemail.com wrote: On 18/02/2010 07:17, Jie Zhang wrote: We are trying to add a 16bit integer division library function for bfin port. I just found GCC didn't do integral promotions when calling library function. Is this expected? I wasn't aware of this myself, but it kind-of makes sense given the way that macros such as FUNCTION_ARG and INIT_CUMULATIVE_ARGS don't get passed any type info in the case of libcalls; I'm guessing here, but that would imply to me that all libcalls are effectively using unnamed stdargs-style arg passing. Not sure how to check this theory without extensively reading the source though. I imagine it's done for efficiency, and there should always be libcall functions existing for the precise types you're passing? I'm trying to use an existing division function as a library function. This existing function does unsigned 16-bit integer division and expects both arguments to have been zero-extended to 32 bits. It's a little surprising to me that GCC does not do the promotion when calling a library function even when TARGET_PROMOTE_PROTOTYPES is defined. I can adjust the division function accordingly. But before I do that, I'd like to know whether library functions should follow the same convention as normal ones. Jie
No integral promotions when calling library function?
We are trying to add a 16-bit integer division library function for the bfin port. I just found that GCC doesn't do integral promotions when calling a library function. For example, in function foo, I can assume both arguments are zero-extended from unsigned short to unsigned int. extern unsigned short foo (unsigned short, unsigned short); unsigned int a; unsigned int b; unsigned short bar () { return foo ((unsigned short) a, (unsigned short) b); } But with the following code, I can't assume that the high halves of R0 and R1, which are the first two registers for argument passing, are all zeros. unsigned int a; unsigned int b; unsigned short bar () { return (unsigned short) a / (unsigned short) b; } Is this expected? Thanks, Jie
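The difference can be modelled in portable C by treating the two argument registers as 32-bit values. This is only a sketch: the three function names are hypothetical, and nothing here is the actual Blackfin library routine.

```c
#include <assert.h>
#include <stdint.h>

/* What the C expression in bar() means: narrow both operands to 16 bits,
   promote them, divide, and narrow the quotient again.  */
uint16_t div16_promoted(uint32_t a, uint32_t b)
{
    assert((uint16_t) b != 0);
    return (uint16_t) ((uint16_t) a / (uint16_t) b);
}

/* A helper that trusts its caller to have zero-extended both registers.
   If the high halves hold leftovers, the quotient is wrong.  */
uint16_t udiv16_trusting_caller(uint32_t r0, uint32_t r1)
{
    assert(r1 != 0);
    return (uint16_t) (r0 / r1);
}

/* The same helper adjusted to clear the high halves itself.  */
uint16_t udiv16_masking(uint32_t r0, uint32_t r1)
{
    assert((r1 & 0xffffu) != 0);
    return (uint16_t) ((r0 & 0xffffu) / (r1 & 0xffffu));
}
```

With `r0 = 0x00010008` and `r1 = 0x00020002`, the low halves are 8 and 2, so the first and third functions return 4, while the trusting helper divides 65544 by 131074 and returns 0.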
Re: Jie Zhang appointed bfin maintainer
On 02/08/2010 08:53 AM, Gerald Pfeifer wrote: It is my pleasure to announce that, also based on the recommendation of Bernd as an existing maintainer, the steering committee has appointed Jie Zhang maintainer of the bfin port. Thanks for your contributions over the last five(?) years, Jie. Yes. I have worked on the Blackfin port for nearly 5 years with Bernd. Please adjust the MAINTAINERS file accordingly, and Happy Hacking! Thanks! I have updated the MAINTAINERS file with the attached patch. Jie * MAINTAINERS: Add myself as a maintainer for the bfin port. Index: MAINTAINERS === --- MAINTAINERS (revision 156592) +++ MAINTAINERS (working copy) @@ -48,6 +48,7 @@ avr port Anatoly Sokolov ae...@post.ru avr port Eric Weddington eric.wedding...@atmel.com bfin port Bernd Schmidt bernd.schm...@analog.com +bfin port Jie Zhang jie.zh...@analog.com cris port Hans-Peter Nilsson h...@axis.com crx port Pompapathi V Gadad pompapathi.v.ga...@nsc.com fr30 port Nick Clifton ni...@redhat.com @@ -493,7 +494,6 @@ Canqun Yang can...@nudt.edu.cn Joey Ye joey...@intel.com Kenneth Zadeck zad...@naturalbridge.com -Jie Zhang jie.zh...@analog.com Shujing Zhao pearly.z...@oracle.com Jon Ziegler j...@apple.com Roman Zippel zip...@linux-m68k.org
Re: where can find source snapshots of first GCC 4.5.0 ?
On Mon, Jan 4, 2010 at 8:04 PM, Bernd Roesch nospamn...@gmx.de wrote: Hi, Because of this regression, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41311 Problem is in m68k-elf too, but happen not with any older GCC as 4.5.0 i want try out if the first GCC 4.5.0 snapshot have this Problem or not. The first GCC 4.5.0 i compile was in month 08.this have the Bug. But i find on the mirror sites only first snapshots now that are from month 10. So maybe somebody can post me a link to older versions of GCC 4.5.0 I would recommend using GCC git mirror and bisect to locate the source of regression. It's very fast to switch between different revisions. Jie
Re: df_changeable_flags use in combine.c
On 01/05/2010 07:12 AM, Matt wrote: Hi, I'm fixing some compiler errors when configuring with --enable-build-with-cxx, and ran into a curious line of code that may indicate a bug: static unsigned int rest_of_handle_combine (void) { int rebuild_jump_labels_after_combine; df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN); // ... } The DF_* values are from the df_changeable_flags enum, whose values are typically used in logical and/or operations for masking purposes. As such, I'm guessing the author may have meant to do: df_set_flags (DF_LR_RUN_DCE & DF_DEFER_INSN_RESCAN); I think you meant |. I think + is the same as | here. And I didn't see this error when configuring current trunk head with --enable-build-with-cxx. But I see other errors. Jie
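Why `+` and `|` agree here can be shown with a small stand-alone sketch. The enum below is hypothetical and does not use the real df_changeable_flags values; it only assumes, as the message does, that each flag occupies its own bit.

```c
#include <assert.h>

/* Hypothetical flags: two single-bit values and one combined mask. */
enum sketch_flags {
    SKETCH_RUN_DCE      = 1 << 0,
    SKETCH_DEFER_RESCAN = 1 << 1,
    SKETCH_BOTH         = SKETCH_RUN_DCE | SKETCH_DEFER_RESCAN
};

int combine_with_plus(int x, int y) { return x + y; }
int combine_with_or(int x, int y)   { return x | y; }
int combine_with_and(int x, int y)  { return x & y; }

/* Self-check that doubles as a usage example. */
void flags_selfcheck(void)
{
    /* Disjoint bits: addition cannot carry, so + and | give the same mask. */
    assert(combine_with_plus(SKETCH_RUN_DCE, SKETCH_DEFER_RESCAN)
           == combine_with_or(SKETCH_RUN_DCE, SKETCH_DEFER_RESCAN));
    /* Overlapping bits: + carries into a different bit, | does not. */
    assert(combine_with_plus(SKETCH_RUN_DCE, SKETCH_BOTH)
           != combine_with_or(SKETCH_RUN_DCE, SKETCH_BOTH));
    /* & of two disjoint flags is empty, so it would request nothing. */
    assert(combine_with_and(SKETCH_RUN_DCE, SKETCH_DEFER_RESCAN) == 0);
}
```

So `+` is harmless only as long as the operands never share a bit; `|` states the intent and stays correct if a combined mask is ever passed.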
Re: MPC required in one week.
On 12/27/2009 03:05 PM, Silvius Rus wrote: On the flip side, it's not necessarily easy to get it to work. On my build system, apt-get doesn't find it. Downloading and installing the .deb manually triggers 3 missing deps. apt-get install libmpc-dev libmpc-dev is already in squeeze and sid if you are using Debian. Jie
Re: MPC required in one week.
On 12/01/2009 06:25 PM, Paolo Bonzini wrote: On 11/30/2009 09:47 PM, Michael Witten wrote: On Mon, Nov 30, 2009 at 12:04 AM, Kaveh R. GHAZIgh...@caip.rutgers.edu wrote: The patch which makes the MPC library a hard requirement for GCC bootstrapping has been approved today. Out of curiosity and ignorance: Why, specifically, is MPC going to be a hard requirement? On the prerequisites page, MPC is currently described with: Having this library will enable additional optimizations on complex numbers. Does that mean that such optimizations are now an important requirement? or is MPC being used for something else? They are a requirement for Fortran, but it's (much) simpler to do them for all front-ends. Actually, the bug mentioned on the 4.5 release page under MPC is a C bug. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30789 So it seems it is not just for Fortran. Jie
Re: Problem while configuring gcc3.2
Hi, Please don't top reply. On 12/28/2009 12:59 PM, Pardis Beikzadeh wrote: Hi, Also 'make bootstrap' doesn't work without running configure, so I'm not sure what the recommended way mentioned in the email below means. The bootstrap in Jim's reply means, I think, building a minimal (only C front-end) gcc-3.2 first using gcc-3.4. Then you can use the minimal gcc-3.2 to build a full gcc-3.2. If you build gcc-3.2 the recommended way, e.g. via bootstrap, then you won't run into this problem. The fortran front end will be built by the bootstrapped gcc-3.2 instead of gcc-3.4, and you won't get this error. If you are building a cross, then you bootstrap a native gcc-3.2 first, and then use the native gcc-3.2 to build the cross gcc-3.2. -- Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com Jie
Re: Question on PR36873
On 12/23/2009 02:43 PM, Jie Zhang wrote: Hi, We just got a similar problem on Blackfin GCC recently. Let me take the test code from the bug as an example: I reduced the test case to a simpler one: $ cat foo.c unsigned int foo (volatile unsigned short *p) { return *p; } In the tree dump foo.c.126t.optimized, GCC refused to eliminate D.1256 because the first statement contains a volatile operand: D.1256 ={v} *p; return (unsigned int) D.1256; I'm not familiar with the trees. Is it possible to replace D.1256 and have something like below? return (unsigned int) {v} *p; I experimented a little. It seems {v} is lost when SSA names are replaced during the out-of-SSA transform. Can anyone tell me whether it's possible to do the replacement but still keep {v}? Or should I find another way to do that? Or is it wrong to do this optimization? Thanks, Jie
Re: Question on PR36873
On 12/23/2009 06:12 PM, Dave Korn wrote: Jie Zhang wrote: typedef unsigned short u16; typedef unsigned int u32; u32 a(volatile u16* off) { return *off; } mingw32-gcc-4.3.0.exe -c -O2 -fomit-frame-pointer -mtune=core2 test.c it produces: _a: 0: 8b 44 24 04 mov 0x4(%esp),%eax 4: 0f b7 00 movzwl (%eax),%eax 7: 0f b7 c0 movzwl %ax,%eax <== The redundant insn a: c3 ret How does it look at the RTL level? I wonder if this situation is similar to the one being discussed in the other current thread, "Which optimizer should remove redundant subreg of sign_extension?" With my native GCC on Debian AMD64 unstable, in t.c.128r.expand: (insn 6 5 7 3 t.c:5 (set (reg:HI 58 [ D.1595 ]) (mem/v:HI (reg/v/f:DI 60 [ off ]) [2 S2 A16])) -1 (nil)) (insn 7 6 8 3 t.c:5 (set (reg:SI 61) (zero_extend:SI (reg:HI 58 [ D.1595 ]))) -1 (nil)) In t.c.201r.shorten: (insn:TI 6 3 7 t.c:5 (set (reg:HI 0 ax [orig:58 D.1595 ] [58]) (mem/v:HI (reg/v/f:DI 5 di [orig:60 off ] [60]) [2 S2 A16])) 53 {*movhi_1} (expr_list:REG_DEAD (reg/v/f:DI 5 di [orig:60 off ] [60]) (nil))) (insn:TI 7 6 18 t.c:5 (set (reg:SI 0 ax [orig:61 D.1595 ] [61]) (zero_extend:SI (reg:HI 0 ax [orig:58 D.1595 ] [58]))) 114 {*zero_extendhisi2_movzwl} (nil)) There is a volatile flag on the mem operand. Without that flag, I think one of the RTL passes might combine the two insns. It looks similar to the issue in the thread you mentioned, but the cause is different. Regards, Jie
Question on PR36873
Hi, We just got a similar problem on Blackfin GCC recently. Let me take the test code from the bug as an example: typedef unsigned short u16; typedef unsigned int u32; u32 a(volatile u16* off) { return *off; } u32 b(u16* off) { return *off; } compiled with mingw32-gcc-4.3.0.exe -c -O2 -fomit-frame-pointer -mtune=core2 test.c it produces: _a: 0: 8b 44 24 04 mov 0x4(%esp),%eax 4: 0f b7 00 movzwl (%eax),%eax 7: 0f b7 c0 movzwl %ax,%eax <== The redundant insn a: c3 ret 0010 _b: 10: 8b 44 24 04 mov 0x4(%esp),%eax 14: 0f b7 00 movzwl (%eax),%eax 17: c3 ret I don't understand Richard's comment. Why do we not optimize volatile accesses in this test case? I know we cannot do many optimizations on volatile accesses, but I think it's OK to remove the redundant insn in this case. Could someone provide a case in which we cannot remove it? Thanks, Jie
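The claim that the second extension is redundant can be checked at the value level. The sketch below is not the compiler's reasoning and the function names are hypothetical; it only shows that each function reads the volatile object exactly once and that extending an already zero-extended 16-bit value changes nothing, so dropping the second extension would alter neither the number of accesses nor the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One volatile read, one zero extension (the shape of function b's code). */
uint32_t load_extend_once(volatile uint16_t *p)
{
    assert(p != NULL);
    uint16_t v = *p;          /* the only access to *p */
    return (uint32_t) v;
}

/* One volatile read, then two zero extensions (the shape of function a's code). */
uint32_t load_extend_twice(volatile uint16_t *p)
{
    assert(p != NULL);
    uint16_t v = *p;          /* still the only access to *p */
    uint32_t x = (uint32_t) v;
    return (uint32_t) (uint16_t) x;   /* extends a register value, not memory */
}
```

The second conversion operates on a value that is already in a register, which is why it looks removable without touching the volatile access itself.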
Re: GMP and GCC 4.3.2
On 12/18/2009 06:27 AM, Jean Christophe Beyler wrote: Actually, I just finished updating my 4.3.2 to 4.3.3 and tested it and I still have the same issue. This seems to be a problem more than just 4.3.2. Here is the test program: #includestdio.h #includegmp.h int main() { mpz_t a,b; mpz_init_set_str(a, 100, 10); // program works with 10^9, but not // with 10^10 (10^20 2^64) mpz_init_set(b, a); mpz_mul(a, a, a); gmp_printf(first, in GMP mpz_mul(a,a,a) with a=%Zd gives %Zd \n, b, a); mpz_set(b, a); mpz_mul(a, a, a); gmp_printf(second, in GMP mpz_mul(a,a,a) with a=%Zd gives %Zd \n, b, a); return 0; } We obtain: first, in GMP mpz_mul(a,a,a) with a=100 gives 1 second, in GMP mpz_mul(a,a,a) with a=1 gives 2254536013160540992915637663717291581095979492475463214517286840718852096 Which clearly is wrong for the second output. This was tested with a 64 bit architecture. I know that with a 4.1.1 port of the compiler, I do not see this issue. I will see if I can port it forward to see if I still see the problem but it might be difficult to port from 4.3.2 to 4.4.2, I'm not sure how many things have changed but I'm sure quite a bit ! This is the -v of my GCC, version 4.3.3: Using built-in specs. Target: myarch64-linux-elf Configured with: /home/beyler/myarch64/src/myarch64-gcc-4.3.2/configure --target=myarch64-linux-elf --with-headers=/home/beyler/myarch64/src/newlib-1.16.0/newlib/libc/include --prefix=/home/beyler/myarch64/local --disable-nls --enable-languages=c --with-newlib --disable-libssp --with-mpfr=/home/beyler/myarch64/local Thread model: single gcc version 4.3.3 (GCC) What's myarch64? I got the correct result for your test with vanilla gcc-4.3.2 and gmp-4.3.1 on Debian unstable AMD64. Jie
Re: No .got section in ELF
On 11/26/2009 02:04 PM, yunfeng zhang wrote: The result is the same #include <stdio.h> extern int g __attribute__((visibility("hidden"))); int g; int foo(int a, int b) { g = a + b; printf("%x, %x", &g, foo); return g; } load and call `foo' in the library, an outputting (with vdso) is cc15bc, cc03fc and open f.map 0x15bc, 0x3fc It shows Linux simply maps the library to memory *using* library segment layout. Using e.cc to call it #include <exception> #include <typeinfo> #include <cstddef> #include <dlfcn.h> #include <stdio.h> int main(void) { void* handle = dlopen("./f.so", RTLD_NOW); typedef int (*gso)(int, int); gso f; *(void**) (&f) = dlsym(handle, "foo"); f(1, 2); return 0; } You have a bad test case. Please try the following: $ cat f.c #include <stdio.h> int g; int foo(int a, int b) { g = a + b; printf("g = 0x%x, foo = 0x%x\n", &g, foo); return g; } $ cat e.c int g; extern int foo(int a, int b); int main(void) { foo(1, 2); return 0; } $ gcc -shared -fPIC -Wl,-soname,./libf.so,-Map,f.map -o libf.so f.c $ gcc -o e e.c -ldl -L. -lf $ ./e g = 0x600a30, foo = 0x294a2614 Then comment out the int g; in e.c and repeat the steps above: $ ./e g = 0x58294948, foo = 0x58094614 You can see that C-A is *not* a constant. Your premise is wrong. Jie
Re: Truncated history in viewvc
Dave Korn wrote: Andrew Pinski wrote: On Wed, Sep 16, 2009 at 12:19 AM, Dave Korn dave.korn.cyg...@googlemail.com wrote: Good morning all! Is there some reason that I don't know about (e.g. limiting the load on the server) why the revision log views of files in our viewvc setup would be heavily truncated? The issue comes down to the trunk had be accidently deleted and viewvc does not handle that. -- Pinski Ohhh, yes, I remember that happening. Oh well, don't think there's much we can do about that, except perhaps hope a future version can deal with it. Might be worth our while filing an enhancement request at the viewvc project, but that's it really. Thanks for the explanation. I have a gcc svn repository mirror on my hard disk. The svn history of varasm.c in viewvc looks good. The oldest entry is Revision 281 - (view) (download) (as text) (annotate) - [select for diffs] Added Thu Feb 6 00:04:16 1992 UTC (17 years, 7 months ago) by rms File length: 77149 byte(s) Initial revision I'm using ViewVC 1.0.5-0.2 from Debian AMD64 unstable. I noticed that the ViewVC version is 1.1.2 on gcc.gnu.org. Maybe it's a regression of ViewVC. Jie
Re: Stuck master branch in git mirror
Andreas Schwab wrote: It looks like the master branch of git://gcc.gnu.org/git/gcc hasn't been updated since 3 weeks (trunk is still ok). Same here. I now use trunk instead. Jie
Re: libmudflap and emutls question
Jakub Jelinek wrote: On Wed, Jan 07, 2009 at 11:38:55AM +0100, Paolo Bonzini wrote: Which version of gcc did you use? gcc 4.1 (and maybe 4.2) will report an error. But gcc 4.3 compiles OK. I tested using x86_64 native gcc from Debian unstable. __emutls_get_address is defined in libgcc even if the target has real TLS. Uff... not my day. I used 4.2 (emutls was posted in 4.2 time but committed in 4.3 only). But I didn't think of the simplest solution: use greps together with strings(1): strings ./conftest | grep __emutls_get_address. I'd say much better would be just to grep assembly. See e.g. libffi/configure.ac's libffi_cv_hidden_visibility_attribute test. This is what I was looking for. Thanks! The updated patch is attached. Is it OK now? Jie libmudflap/ * mf-impl.h (__mf_get_state, __mf_set_state): Don't use __thread when TLS support is emulated. * mf-hooks3.c (__mf_get_state, __mf_set_state): Likewise. * mf-runtime.c (__mf_state_1): Likewise. * configure.ac: Use GCC_CHECK_EMUTLS. * configure: Regenerate. * config.h.in: Regenerate. config/ * tls.m4 (GCC_CHECK_EMUTLS): Define. Index: libmudflap/mf-impl.h === --- libmudflap/mf-impl.h (revision 143076) +++ libmudflap/mf-impl.h (working copy) @@ -244,7 +244,7 @@ extern pthread_mutex_t __mf_biglock; #define UNLOCKTH() do {} while (0) #endif -#if defined(LIBMUDFLAPTH) && !defined(HAVE_TLS) +#if defined(LIBMUDFLAPTH) && (!defined(HAVE_TLS) || defined(USE_EMUTLS)) extern enum __mf_state_enum __mf_get_state (void); extern void __mf_set_state (enum __mf_state_enum); #else Index: libmudflap/mf-hooks3.c === --- libmudflap/mf-hooks3.c (revision 143076) +++ libmudflap/mf-hooks3.c (working copy) @@ -78,7 +78,7 @@ DECLARE(int, pthread_create, pthread_t * /* Multithreading support hooks. */ -#ifndef HAVE_TLS +#if !defined(HAVE_TLS) || defined(USE_EMUTLS) /* We don't have TLS. Ordinarily we could use pthread keys, but since we're commandeering malloc/free that presents a few problems. 
The first is that we'll recurse from __mf_get_state to pthread_setspecific to malloc back to @@ -217,7 +217,7 @@ __mf_pthread_cleanup (void *arg) if (__mf_opts.heur_std_data) __mf_unregister (& errno, sizeof (errno), __MF_TYPE_GUESS); -#ifndef HAVE_TLS +#if !defined(HAVE_TLS) || defined(USE_EMUTLS) struct mf_thread_data *data = __mf_find_threadinfo (0); if (data) data->used_p = 0; Index: libmudflap/configure.ac === --- libmudflap/configure.ac (revision 143076) +++ libmudflap/configure.ac (working copy) @@ -265,6 +265,7 @@ fi # See if we support thread-local storage. GCC_CHECK_TLS +GCC_CHECK_EMUTLS AC_CONFIG_FILES([Makefile testsuite/Makefile testsuite/mfconfig.exp]) AC_OUTPUT Index: libmudflap/mf-runtime.c === --- libmudflap/mf-runtime.c (revision 143076) +++ libmudflap/mf-runtime.c (working copy) @@ -178,7 +178,7 @@ struct __mf_options __mf_opts; int __mf_starting_p = 1; #ifdef LIBMUDFLAPTH -#ifdef HAVE_TLS +#if defined(HAVE_TLS) && !defined(USE_EMUTLS) __thread enum __mf_state_enum __mf_state_1 = reentrant; #endif #else Index: config/tls.m4 === --- config/tls.m4 (revision 143076) +++ config/tls.m4 (working copy) @@ -86,3 +86,21 @@ AC_DEFUN([GCC_CHECK_CC_TLS], [ AC_DEFINE(HAVE_CC_TLS, 1, [Define to 1 if the target assembler supports thread-local storage.]) fi]) + +dnl Check whether TLS is emulated. +AC_DEFUN([GCC_CHECK_EMUTLS], [ + AC_CACHE_CHECK([whether the thread-local storage support is from emutls], + gcc_cv_use_emutls, [ +gcc_cv_use_emutls=no +echo '__thread int a; int b; int main() { return a = b; }' > conftest.c +if AC_TRY_COMMAND(${CC-cc} -Werror -S -o conftest.s conftest.c 1>&AS_MESSAGE_LOG_FD); then + if grep __emutls_get_address conftest.s > /dev/null; then + gcc_cv_use_emutls=yes + fi +fi +rm -f conftest.* +]) + if test $gcc_cv_use_emutls = yes ; then +AC_DEFINE(USE_EMUTLS, 1, + [Define to 1 if the target use emutls for thread-local storage.]) + fi])
Re: libmudflap and emutls question
Hi Paolo, Thanks for your review! Paolo Bonzini wrote: +AC_COMPILE_IFELSE([__thread int a; int b; int main() { return a = b; }], + [if grep __emutls_get_address conftest.$ac_objext > /dev/null ; then grepping in a binary file is not portable. If this works it would be better: AC_COMPILE_IFELSE([[__thread int a; int b; extern void __emutls_get_address(); int main() { __emutls_get_address(); return a = b; }]], [gcc_cv_use_emutls=yes], [gcc_cv_use_emutls=no]) This does not work. For x86_64 native gcc, the compiler output is $ gcc -c test.c test.c:2: warning: conflicting types for built-in function ‘__emutls_get_address’ For Blackfin gcc, the compiler output is $ bfin-uclinux-gcc -c test.c test.c:2: warning: conflicting types for built-in function ‘__emutls_get_address’ Both are the same. I thought about using int __emutls_v.a; to trigger duplicate definitions, but C doesn't allow a dot in a symbol name. If there were something like AC_COMPILE_IFELSE that outputs an assembly file instead of an object file, it would be the best choice. But I don't know if it exists. There is an existing practice in autoconf (c.m4), which greps an object file to find out endianness. So I think grepping the object file might be acceptable. Otherwise, the configury parts look fine to me. Regards, Jie
Re: libmudflap and emutls question
Hi Frank, Frank Ch. Eigler wrote: Jie Zhang jie.zh...@analog.com writes: To break the recursive loop, one solution is to force emutls to call the real calloc. [...] If it were acceptable to change emutls on account of mudflap, this sort of thing could work. Other alternatives would include having emutls define something in addition to HAVE_TLS that activates the !HAVE_TLS implementation in libmudflap/mf-hooks3.c. Thanks for your help! How about the attached patch, which follows your advice? Regards, Jie libmudflap/ * mf-impl.h (__mf_get_state, __mf_set_state): Don't use __thread when TLS support is emulated. * mf-hooks3.c (__mf_get_state, __mf_set_state): Likewise. * mf-runtime.c (__mf_state_1): Likewise. * configure.ac: Use GCC_CHECK_EMUTLS. * configure: Regenerate. * config.h.in: Regenerate. config/ * tls.m4 (GCC_CHECK_EMUTLS): Define. Index: libmudflap/mf-impl.h === --- libmudflap/mf-impl.h (revision 143074) +++ libmudflap/mf-impl.h (working copy) @@ -244,7 +244,7 @@ #define UNLOCKTH() do {} while (0) #endif -#if defined(LIBMUDFLAPTH) && !defined(HAVE_TLS) +#if defined(LIBMUDFLAPTH) && (!defined(HAVE_TLS) || defined(USE_EMUTLS)) extern enum __mf_state_enum __mf_get_state (void); extern void __mf_set_state (enum __mf_state_enum); #else Index: libmudflap/mf-hooks3.c === --- libmudflap/mf-hooks3.c (revision 143074) +++ libmudflap/mf-hooks3.c (working copy) @@ -78,7 +78,7 @@ /* Multithreading support hooks. */ -#ifndef HAVE_TLS +#if !defined(HAVE_TLS) || defined(USE_EMUTLS) /* We don't have TLS. Ordinarily we could use pthread keys, but since we're commandeering malloc/free that presents a few problems. 
The first is that we'll recurse from __mf_get_state to pthread_setspecific to malloc back to @@ -217,7 +217,7 @@ if (__mf_opts.heur_std_data) __mf_unregister (& errno, sizeof (errno), __MF_TYPE_GUESS); -#ifndef HAVE_TLS +#if !defined(HAVE_TLS) || defined(USE_EMUTLS) struct mf_thread_data *data = __mf_find_threadinfo (0); if (data) data->used_p = 0; Index: libmudflap/configure.ac === --- libmudflap/configure.ac (revision 143074) +++ libmudflap/configure.ac (working copy) @@ -265,6 +265,7 @@ # See if we support thread-local storage. GCC_CHECK_TLS +GCC_CHECK_EMUTLS AC_CONFIG_FILES([Makefile testsuite/Makefile testsuite/mfconfig.exp]) AC_OUTPUT Index: libmudflap/mf-runtime.c === --- libmudflap/mf-runtime.c (revision 143074) +++ libmudflap/mf-runtime.c (working copy) @@ -178,7 +178,7 @@ int __mf_starting_p = 1; #ifdef LIBMUDFLAPTH -#ifdef HAVE_TLS +#if defined(HAVE_TLS) && !defined(USE_EMUTLS) __thread enum __mf_state_enum __mf_state_1 = reentrant; #endif #else Index: config/tls.m4 === --- config/tls.m4 (revision 143074) +++ config/tls.m4 (working copy) @@ -86,3 +86,20 @@ AC_DEFINE(HAVE_CC_TLS, 1, [Define to 1 if the target assembler supports thread-local storage.]) fi]) + +dnl Check whether TLS is emulated. +AC_DEFUN([GCC_CHECK_EMUTLS], [ + AC_CACHE_CHECK([whether the thread-local storage support is from emutls], + gcc_cv_use_emutls, [ +AC_COMPILE_IFELSE([__thread int a; int b; int main() { return a = b; }], + [if grep __emutls_get_address conftest.$ac_objext > /dev/null ; then + gcc_cv_use_emutls=yes + else + gcc_cv_use_emutls=no + fi + ], [gcc_cv_use_emutls=no])] +) + if test $gcc_cv_use_emutls = yes ; then +AC_DEFINE(USE_EMUTLS, 1, + [Define to 1 if the target use emutls for thread-local storage.]) + fi])
libmudflap and emutls question
Hi, I encountered a recursive call problem between libmudflap and emutls when testing libmudflap for Blackfin. But I think this issue affects all targets without TLS. One libmudflap test case in the testsuite calls __wrap_calloc. In __wrap_calloc, __mf_state_1 is looked up by __mf_get_state to see whether it is in_malloc, reentrant or active. With emutls, HAVE_TLS is now defined as 1, so __mf_state_1 is declared with __thread. When emutls tries to simulate TLS for __mf_state_1, it calls calloc from __emutls_get_address, which ends up in __wrap_calloc again, recursively. To break the recursive loop, one solution is to force emutls to call the real calloc. But I don't know how to do this. Could someone help me with this? Thanks! Jie
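The loop described above can be reproduced in a toy model. The sketch below is neither libmudflap nor libgcc code; all names are hypothetical, it is single-threaded only, and a plain static flag stands in for state that does not itself depend on TLS. It shows the general shape of the cycle and one way a wrapper can stop re-entering itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

int reentered;            /* how many times the wrapper was re-entered       */
static int in_wrapper;    /* plain static guard; no TLS needed to read it    */

void *wrap_calloc_sketch(size_t n, size_t size);

/* Stands in for the emulated-TLS lookup: the storage for the state is
   allocated lazily, and that allocation goes through the wrapper again.  */
static int *get_state_sketch(void)
{
    static int *slot;
    if (slot == NULL)
        slot = wrap_calloc_sketch(1, sizeof *slot);
    assert(slot != NULL);
    return slot;
}

void *wrap_calloc_sketch(size_t n, size_t size)
{
    if (in_wrapper) {
        /* Called from inside ourselves: skip the bookkeeping and go
           straight to the real allocator, which ends the recursion.  */
        reentered++;
        return calloc(n, size);
    }
    in_wrapper = 1;
    int *state = get_state_sketch();   /* may re-enter exactly once */
    (void) state;                      /* real code would inspect the state here */
    void *p = calloc(n, size);
    in_wrapper = 0;
    return p;
}
```

Without the `in_wrapper` check, the first call would recurse through `get_state_sketch` without end. In a threaded program a single static flag is not adequate, so this is only an illustration of the cycle, not a proposed fix.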
Re: Set environment variable on remote target
Andreas Schwab wrote: Jie Zhang [EMAIL PROTECTED] writes: So we have to use single quotes. The updated patch is attached. This will break if the value can contain single quotes. How about using double quotes but escaping ", \, $, and ` using backslash? The patch is attached. Jie diff --git a/lib/rsh.exp b/lib/rsh.exp index 1a207a8..d846887 100644 --- a/lib/rsh.exp +++ b/lib/rsh.exp @@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} { # proc rsh_exec { boardname program pargs inp outp } { global timeout +global remote_env verbose Executing $boardname:$program $pargs $inp @@ -261,7 +262,14 @@ proc rsh_exec { boardname program pargs inp outp } { set inp /dev/null } -set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] +set remote_envs +foreach envvar [array names remote_env] { + set tmp_env $remote_env($envvar) + # Escape ", \, $, and `, which cannot be protected by double quotes. + regsub -all (\[\$`]) $tmp_env \\1 tmp_env + set remote_envs $remote_envs $envvar=\$tmp_env\ +} +set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] set status [lindex $ret 0] set output [lindex $ret 1] diff --git a/lib/utils.exp b/lib/utils.exp index 6c9ff98..6325dd8 100644 --- a/lib/utils.exp +++ b/lib/utils.exp @@ -414,3 +414,12 @@ proc getenv { var } { } } +# +# Set an environment variable remotely +# +proc remote_setenv { var val } { +global remote_env + +set remote_env($var) $val +} +
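The quoting rule proposed above (wrap the value in double quotes and backslash-escape the four characters that stay special inside them) can be sketched in C. This is an illustration of the rule only, not the Tcl from the patch; `shell_dquote` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Write SRC into DST surrounded by double quotes, putting a backslash in
   front of each of  "  \  $  `  .  CAP is the size of DST in bytes.
   Returns the length written (excluding the NUL), or (size_t) -1 if DST
   is too small.  */
size_t shell_dquote(char *dst, size_t cap, const char *src)
{
    size_t o = 0;

    assert(dst != NULL && src != NULL);
    if (cap < 3)                       /* room for "" and the NUL */
        return (size_t) -1;
    dst[o++] = '"';
    for (; *src != '\0'; src++) {
        int special = (*src == '"' || *src == '\\'
                       || *src == '$' || *src == '`');
        size_t need = special ? 2 : 1;
        /* Keep room for this character, the closing quote and the NUL. */
        if (o + need + 2 > cap)
            return (size_t) -1;
        if (special)
            dst[o++] = '\\';
        dst[o++] = *src;
    }
    dst[o++] = '"';
    dst[o] = '\0';
    return o;
}
```

A value such as `-viol-segv` passes through unchanged apart from the surrounding quotes, and a single quote inside the value needs no escaping at this level, which is the case the single-quote version could not handle.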
Re: Set environment variable on remote target
Andreas Schwab wrote: Jie Zhang [EMAIL PROTECTED] writes: Andreas Schwab wrote: Jie Zhang [EMAIL PROTECTED] writes: @@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } { set inp /dev/null } -set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] +set remote_envs +foreach envvar [array names remote_env] { + set remote_envs $remote_envs $envvar=$remote_env($envvar) That needs to do proper quoting to protect shell meta characters. Thanks for pointing out this. A new patch is attached. Is the quoting right? That won't protect all meta characters. Inside double quotes the dollar sign, backslash and backquote are still special. So we have to use single quotes. The updated patch is attached. Thanks, Jie diff --git a/lib/rsh.exp b/lib/rsh.exp index 1a207a8..94122e8 100644 --- a/lib/rsh.exp +++ b/lib/rsh.exp @@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} { # proc rsh_exec { boardname program pargs inp outp } { global timeout +global remote_env verbose Executing $boardname:$program $pargs $inp @@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } { set inp /dev/null } -set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] +set remote_envs +foreach envvar [array names remote_env] { + set remote_envs $remote_envs $envvar='$remote_env($envvar)' +} +set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] set status [lindex $ret 0] set output [lindex $ret 1] diff --git a/lib/utils.exp b/lib/utils.exp index 6c9ff98..6325dd8 100644 --- a/lib/utils.exp +++ b/lib/utils.exp @@ -414,3 +414,12 @@ proc getenv { var } { } } +# +# Set an environment variable remotely +# +proc remote_setenv { var val } { +global remote_env + +set remote_env($var) $val +} +
Set environment variable on remote target
libmudflap tests set an environment variable, MUDFLAP_OPTIONS=-viol-segv, before testing so that violations are promoted to SIGSEGV signals. Otherwise, the exit value would be 0 even if the test has violations. The libmudflap testsuite depends on the exit value of each test to decide whether it PASSes or FAILs. Setting MUDFLAP_OPTIONS is done in DejaGNU by setenv MUDFLAP_OPTIONS -viol-segv which works fine for native testing. But when doing remote cross testing, setenv does not help. I cannot find an existing mechanism in DejaGNU. So I want to use a global array like remote_env. When doing remote cross testing, add the environment variable to this array. Then set the environment variables according to the array when remotely executing a test case. I wrote a draft patch to show what I mean, which is attached. In the mudflap testsuite, replace each setenv with if { ![is_remote target] } { setenv MUDFLAP_OPTIONS -viol-segv } else { remote_setenv MUDFLAP_OPTIONS -viol-segv } Is this the right way to do it, or is there an existing method that I missed? 
Thanks, Jie diff --git a/lib/rsh.exp b/lib/rsh.exp index 1a207a8..df3b3d1 100644 --- a/lib/rsh.exp +++ b/lib/rsh.exp @@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} { # proc rsh_exec { boardname program pargs inp outp } { global timeout +global remote_env verbose Executing $boardname:$program $pargs $inp @@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } { set inp /dev/null } -set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] +set remote_envs +foreach envvar [array names remote_env] { + set remote_envs $remote_envs $envvar=$remote_env($envvar) +} +set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout] set status [lindex $ret 0] set output [lindex $ret 1] diff --git a/lib/utils.exp b/lib/utils.exp index 6c9ff98..8523973 100644 --- a/lib/utils.exp +++ b/lib/utils.exp @@ -414,3 +414,33 @@ proc getenv { var } { } } +# +# Set an environment variable remotely +# +proc remote_setenv { var val } { +global remote_env + +set remote_env($var) $val +} + +# +# Unset an environment variable remotely +# +proc remote_unsetenv { var } { +global remote_env +unset remote_env($var) +} + +# +# Get a value from an environment variable remotely +# +proc remote_getenv { var } { +global remote_env + +if {[info exists remote_env($var)]} { + return $remote_env($var) +} else { + return +} +} +
Re: Set environment variable on remote target
Andreas Schwab wrote:
> Jie Zhang [EMAIL PROTECTED] writes:
>> @@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
>>      set inp /dev/null
>>      }
>> -    set ret [local_exec "$RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX'" $inp $outp $timeout]
>> +    set remote_envs ""
>> +    foreach envvar [array names remote_env] {
>> +        set remote_envs "$remote_envs $envvar=$remote_env($envvar)"
> That needs to do proper quoting to protect shell meta characters.

Thanks for pointing this out. A new patch is attached. Is the quoting right?

I also dropped remote_getenv from the new patch, since I just realized that it cannot actually get a remote environment variable. remote_unsetenv was dropped for the same reason. The patch for gcc is also attached for review.

Jie

diff --git a/lib/rsh.exp b/lib/rsh.exp
index 1a207a8..13be541 100644
--- a/lib/rsh.exp
+++ b/lib/rsh.exp
@@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} {
 #
 proc rsh_exec { boardname program pargs inp outp } {
     global timeout
+    global remote_env
     verbose "Executing $boardname:$program $pargs < $inp"
@@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
         set inp /dev/null
     }
-    set ret [local_exec "$RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX'" $inp $outp $timeout]
+    set remote_envs ""
+    foreach envvar [array names remote_env] {
+        set remote_envs "$remote_envs $envvar=\"$remote_env($envvar)\""
+    }
+    set ret [local_exec "$RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX'" $inp $outp $timeout]
     set status [lindex $ret 0]
     set output [lindex $ret 1]
diff --git a/lib/utils.exp b/lib/utils.exp
index 6c9ff98..6325dd8 100644
--- a/lib/utils.exp
+++ b/lib/utils.exp
@@ -414,3 +414,12 @@ proc getenv { var } {
     }
 }
+#
+# Set an environment variable remotely
+#
+proc remote_setenv { var val } {
+    global remote_env
+
+    set remote_env($var) $val
+}
+

Index: testsuite/libmudflap.c/cfrags.exp
===================================================================
--- testsuite/libmudflap.c/cfrags.exp (revision 136236)
+++ testsuite/libmudflap.c/cfrags.exp (working copy)
@@ -13,7 +13,11 @@
                    ${srcdir}/libmudflap.c/hook*.c \
                    ${srcdir}/libmudflap.c/pass*.c]] {
         set bsrc [file tail $srcfile]
-        setenv MUDFLAP_OPTIONS -viol-segv
+        if { ![is_remote target] } {
+            setenv MUDFLAP_OPTIONS -viol-segv
+        } else {
+            remote_setenv MUDFLAP_OPTIONS -viol-segv
+        }
         dg-runtest $srcfile $flags -fmudflap -lmudflap
     }
 }
Index: testsuite/libmudflap.c/externs.exp
===================================================================
--- testsuite/libmudflap.c/externs.exp (revision 136236)
+++ testsuite/libmudflap.c/externs.exp (working copy)
@@ -23,7 +23,11 @@
 set test externs-21 linkage ${flags}
 if [string match $l3] { pass $test } { fail $test }
-setenv MUDFLAP_OPTIONS -viol-segv
+if { ![is_remote target] } {
+    setenv MUDFLAP_OPTIONS -viol-segv
+} else {
+    remote_setenv MUDFLAP_OPTIONS -viol-segv
+}
 remote_spawn host ./externs-12.exe
 set l5 [remote_wait host 10]
Index: testsuite/libmudflap.c++/ctors.exp
===================================================================
--- testsuite/libmudflap.c++/ctors.exp (revision 136236)
+++ testsuite/libmudflap.c++/ctors.exp (working copy)
@@ -28,7 +28,11 @@
 set test ctors-21 linkage ${flags}
 if [string match $l3] { pass $test } { fail $test }
-setenv MUDFLAP_OPTIONS -viol-segv
+if { ![is_remote target] } {
+    setenv MUDFLAP_OPTIONS -viol-segv
+} else {
+    remote_setenv MUDFLAP_OPTIONS -viol-segv
+}
 remote_spawn host ./ctors-12.exe
 set l5 [remote_wait host 10]
Index: testsuite/libmudflap.c++/c++frags.exp
===================================================================
--- testsuite/libmudflap.c++/c++frags.exp (revision 136236)
+++ testsuite/libmudflap.c++/c++frags.exp (working copy)
@@ -14,7 +14,11 @@
 foreach flags $MUDFLAP_FLAGS {
     foreach srcfile [lsort [glob -nocomplain ${srcdir}/libmudflap.c++/*frag.cxx]] {
         set bsrc [file tail $srcfile]
-        setenv MUDFLAP_OPTIONS -viol-segv
+        if { ![is_remote target] } {
+            setenv MUDFLAP_OPTIONS -viol-segv
+        } else {
+            remote_setenv MUDFLAP_OPTIONS -viol-segv
+        }
         dg-runtest $srcfile $flags -fmudflap -lmudflap
     }
 }
Index: testsuite/libmudflap.cth/cthfrags.exp
===================================================================
--- testsuite/libmudflap.cth/cthfrags.exp (revision 136236)
+++ testsuite/libmudflap.cth/cthfrags.exp (working copy)
@@ -9,7 +9,11 @@
 foreach flags $MUDFLAP_FLAGS {
     foreach srcfile [lsort [glob -nocomplain ${srcdir}/libmudflap.cth/*.c]] {
         set bsrc [file tail $srcfile]
-        setenv MUDFLAP_OPTIONS -viol-segv
+        if { ![is_remote target] } {
+            setenv MUDFLAP_OPTIONS -viol-segv
+        } else {
+            remote_setenv MUDFLAP_OPTIONS -viol-segv
Link tests after GCC_NO_EXECUTABLES
libstdc++ tries to avoid link tests when configured with newlib. But I saw this when working on the bfin port of gcc:

checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... no
checking how to hardcode library paths into programs... immediate
checking for shl_load... configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
make[1]: *** [configure-target-libstdc++-v3] Error 1
make[1]: Leaving directory `/home/jie/blackfin-sources/build43/gcc_build-4.3'
make: *** [all] Error 2

I got this when building bfin-elf-gcc with patched gcc and newlib in the same tree. I found that LT_SYS_DLOPEN_SELF does link tests for shl_load after GCC_NO_EXECUTABLES. The call path is

libstdc++-v3/configure.ac
    AM_PROG_LIBTOOL
-> libtool.m4
    LT_INIT -> _LT_SETUP -> _LT_LANG_C_CONFIG -> LT_SYS_DLOPEN_SELF

How about the patch below, which uses LT_SYS_DLOPEN_SELF only when not cross compiling?

Jie

	* libtool.m4 (_LT_LANG_C_CONFIG): Only use LT_SYS_DLOPEN_SELF
	when not cross compiling.

Index: libtool.m4
===================================================================
--- libtool.m4 (revision 128569)
+++ libtool.m4 (working copy)
@@ -5117,7 +5117,9 @@
   _LT_LINKER_SHLIBS($1)
   _LT_SYS_DYNAMIC_LINKER($1)
   _LT_LINKER_HARDCODE_LIBPATH($1)
-  LT_SYS_DLOPEN_SELF
+  if test $cross_compiling = no; then
+    LT_SYS_DLOPEN_SELF
+  fi
   _LT_CMD_STRIPLIB
   # Report which library types will actually be built
Re: Link tests after GCC_NO_EXECUTABLES
Bernd Schmidt wrote:
> Jie Zhang wrote:
>> libstdc++ tries to avoid link tests when configured with newlib. But I saw this when working on the bfin port of gcc:
>>
>> checking whether -lc should be explicitly linked in... no
>> checking dynamic linker characteristics... no
>> checking how to hardcode library paths into programs... immediate
>> checking for shl_load... configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
>> make[1]: *** [configure-target-libstdc++-v3] Error 1
>> make[1]: Leaving directory `/home/jie/blackfin-sources/build43/gcc_build-4.3'
>> make: *** [all] Error 2
>>
>> I got this when building bfin-elf-gcc with patched gcc and newlib in the same tree. I found that LT_SYS_DLOPEN_SELF does link tests for shl_load after GCC_NO_EXECUTABLES. The call path is
>>
>> libstdc++-v3/configure.ac
>>     AM_PROG_LIBTOOL
>> -> libtool.m4
>>     LT_INIT -> _LT_SETUP -> _LT_LANG_C_CONFIG -> LT_SYS_DLOPEN_SELF
>
> I saw something similar, but managed to make it go away. I don't remember how exactly (this kind of issue seems to happen to me all the time, for different reasons each time), but I think the actual problem was that you need to ensure that gcc_no_link doesn't get set. That's a test somewhat earlier in configure.

But by design if gcc_no_link = no, link tests should be avoided.

Jie
Re: Link tests after GCC_NO_EXECUTABLES
Bernd Schmidt wrote:
> Jie Zhang wrote:
>> But by design if gcc_no_link = no, link tests should be avoided.
> ??? I would have thought gcc_no_link = yes means link tests are avoided.

Oops, I meant gcc_no_link = yes.

Jie
Re: Link tests after GCC_NO_EXECUTABLES
Rask Ingemann Lambertsen wrote:
> On Tue, Sep 18, 2007 at 07:55:45PM +0800, Jie Zhang wrote:
>> libstdc++ tries to avoid link tests when configured with newlib. But I saw this when working on the bfin port of gcc:
>
> From config.log:
>
> /home/rask/build/gcc-bfin-unknown-elf/gcc/../ld/ld-new: cannot open linker script file bf532.ld: No such file or directory
>
> $ grep -F -e bf532.ld gcc/config/bfin/*
> gcc/config/bfin/elf.h:%{!T*:%{!msim:%{mcpu=bf531:-Tbf531.ld}%{mcpu=bf532:-Tbf532.ld} \
> gcc/config/bfin/elf.h:%{!mcpu=*:-Tbf532.ld}}}
>
> The file bf532.ld is nowhere to be found in gcc or newlib/libgloss.

I have not pushed our recent newlib/libgloss changes upstream yet. For now you can get the latest Blackfin port of newlib/libgloss from

http://blackfin.uclinux.org/gf/project/toolchain/scmsvn

But if configure cannot find bf532.ld, it should avoid further link tests.

Jie
Re: Link tests after GCC_NO_EXECUTABLES
Daniel Jacobowitz wrote:
> On Tue, Sep 18, 2007 at 03:27:18PM +0200, Bernd Schmidt wrote:
>> Jie Zhang wrote:
>>> Bernd Schmidt wrote:
>>>> Jie Zhang wrote:
>>>>> But by design if gcc_no_link = no, link tests should be avoided.
>>>> ??? I would have thought gcc_no_link = yes means link tests are avoided.
>>> Oops, I meant gcc_no_link = yes.
>> Stupid double negatives. Okay, so then your problem is that gcc_no_link=yes. Find out why it's setting that.
> It always does for newlib. The libtool tests are relatively recent (from some recent autotools upgrade).

Yes, it was added by

http://sourceware.org/ml/binutils/2007-05/msg00247.html

Jie
Re: Link tests after GCC_NO_EXECUTABLES
Bernd Schmidt wrote:
> Jie Zhang wrote:
>> Bernd Schmidt wrote:
>>> Jie Zhang wrote:
>>>> But by design if gcc_no_link = no, link tests should be avoided.
>>> ??? I would have thought gcc_no_link = yes means link tests are avoided.
>> Oops, I meant gcc_no_link = yes.
> Stupid double negatives. Okay, so then your problem is that gcc_no_link=yes. Find out why it's setting that.

bfin-elf-gcc -mfdpic failed to link a simple test case because code is put into L1 instruction SRAM and data is put into L1 data SRAM, but the Blackfin immediate-offset load instruction cannot access the GOT, since the gap between instruction SRAM and data SRAM is too large.

Using -msim as the default would pass this test case and build gcc without problems, but I would like bfin-elf-gcc to target a hardware board by default. Using -fPIC as the default is not good either, since -fpic is enough for any real application.

So I would like to avoid the link test for shl_load when GCC_NO_EXECUTABLES is in effect.

Jie
Re: Link tests after GCC_NO_EXECUTABLES
Bernd Schmidt wrote:
> Jie Zhang wrote:
>> bfin-elf-gcc -mfdpic failed to link a simple test case because code is put into L1 instruction SRAM and data is put into L1 data SRAM, but the Blackfin immediate-offset load instruction cannot access the GOT, since the gap between instruction SRAM and data SRAM is too large. Using -msim as the default would pass this test case and build gcc without problems, but I would like bfin-elf-gcc to target a hardware board by default.
> Any chance we could target it in such a way as to not put everything in L1 by default? I think it's stupid to have configure tests failing for such a reason.

But then we would need to add SDRAM initialization code to the crt files, and that code is usually provided by applications when needed.

Jie
Re: Link tests after GCC_NO_EXECUTABLES
Rask Ingemann Lambertsen wrote:
> /home/rask/build/gcc-bfin-unknown-elf/gcc/../ld/ld-new: crt532.o: No such file: No such file or directory
>
> I sorted that out by using your config/bfin/elf.h, but there's something weird. The first time configure runs, it will complain about GCC_NO_EXECUTABLES but there's no (obvious) clue as to why in config.log. If I run make again, it begins to build libstdc++ but fails with this:
>
> Making all in libsupc++
> make[4]: Entering directory `/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/libsupc++'
> /bin/sh ../libtool --tag CXX --tag disable-shared --mode=compile /home/rask/build/gcc-bfin-unknown-elf/./gcc/xgcc -shared-libgcc -B/home/rask/build/gcc-bfin-unknown-elf/./gcc -nostdinc++ -L/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/src -L/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/src/.libs -nostdinc -B/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/newlib/ -isystem /home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/newlib/targ-include -isystem /n/12/rask/src/all/newlib/libc/include -B/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libgloss/bfin -L/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libgloss/libnosys -L/n/12/rask/src/all/libgloss/bfin -B/usr/local/bfin-unknown-elf/bin/ -B/usr/local/bfin-unknown-elf/lib/ -isystem /usr/local/bfin-unknown-elf/include -isystem /usr/local/bfin-unknown-elf/sys-include -L/home/rask/build/gcc-bfin-unknown-elf/./ld -I/n/12/rask/src/all/libstdc++-v3/../gcc -I/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/include/bfin-unknown-elf -I/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/include -I/n/12/rask/src/all/libstdc++-v3/libsupc++ -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -fdiagnostics-show-location=once -ffunction-sections -fdata-sections -g -O2 -c -o array_type_info.lo /n/12/rask/src/all/libstdc++-v3/libsupc++/array_type_info.cc
> /bin/sh: ../libtool: No such file or directory
> make[4]: *** [array_type_info.lo] Error 127
> make[4]: Leaving directory `/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/libsupc++'
> make[3]: *** [all-recursive] Error 1
> make[3]: Leaving directory `/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3'
> make[2]: *** [all] Error 2
> make[2]: Leaving directory `/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3'
> make[1]: *** [all-target-libstdc++-v3] Error 2
> make[1]: Leaving directory `/home/rask/build/gcc-bfin-unknown-elf'
> make: *** [all] Error 2
>
> I don't know why this happens to bfin and not to the other newlib targets.

I guess it might be caused by different multilib settings in our local (not FSF) newlib and FSF gcc. I have committed a patch to FSF gcc which makes it use the same multilib settings as our local gcc. Sorry about that. Please try again.

Jie
Re: GCC 4.3.0 Status Report (2007-08-09)
Jie Zhang wrote:
> On 8/10/07, Mark Mitchell [EMAIL PROTECTED] wrote:
>> Are there any folks out there who have projects for Stage 1 or Stage 2 that they are having trouble getting reviewed? Any comments re. timing for Stage 3?
> I have many bfin port patches which have not been merged upstream. I hope I can push them out by the end of next week.

I have sent out all my patches (11). Three of them have been reviewed and committed. The others are being reviewed. I will have no access to a computer this weekend. I'll be back next Monday or Tuesday.

Jie
Re: Division by zero
On 2/10/07, Robert Dewar [EMAIL PROTECTED] wrote:
> Ian Lance Taylor wrote:
>> Jie Zhang [EMAIL PROTECTED] writes:
>>> But now gcc seems to optimize it away. For the following function:
>>>
>>> $ cat t.c
>>> #include <limits.h>
>>> void foo (int rc)
>>> {
>>>   int x = rc / INT_MAX;
>>>   x = 4 / x;
>>> }
>> I believe we still keep division by zero in general. In your example it gets optimized away because it is dead code. Nothing uses x.
> And it is certainly reasonable to do this optimization given that the result of the division is undefined in C. In Ada, such a division has well defined semantics (raise an exception), but it is interesting to note that the optimization is valid in Ada as well, since there is a special rule that basically says you don't need to evaluate an expression if the only reason for doing so is to see if it raises a predefined exception. That rule is precisely to deal with cases like this.

The code I posted in my first email is from libgloss/libnosys/_exit.c. It is used to cause an exception deliberately. From your replies, it seems that _exit.c should find another way to do that.

Thanks,
Jie
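To illustrate the "dead code" point in C: the division in foo () disappears because nothing observes x. A minimal sketch, assuming one simply wants the quotient to be observable, is to store it in a volatile object. The names foo_kept and sink are hypothetical and not taken from libgloss. Division by zero is still undefined in C, so this only removes the dead-code reason for dropping the division; it does not guarantee a trap on any particular target.

```c
#include <limits.h>

/* Hypothetical sink.  A store to a volatile object is an observable
   side effect, so the value being stored has to be computed.  */
volatile int sink;

/* Same arithmetic as foo () in t.c, except that the result is used.  */
void foo_kept(int rc)
{
    int x = rc / INT_MAX;   /* 0 unless rc is INT_MAX, -INT_MAX or INT_MIN */
    sink = 4 / x;           /* no longer dead: the quotient is stored */
}
```

Calling foo_kept (INT_MAX) is well defined and stores 4; calling it with a small rc makes x zero, which is the case _exit.c relied on.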
Re: Division by zero
On 2/10/07, Robert Dewar [EMAIL PROTECTED] wrote:
> Jie Zhang wrote:
>> The code I posted in my first email is from libgloss/libnosys/_exit.c. It's used to cause an exception deliberately. From your replies, it seems it should find another way to do that.
> Any code that tries to raise an exception deliberately is certainly depending on undefined behavior, so it has to be very careful about how it is written!

I'm going to use an asm ().

Jie
Re: Division by zero
On 2/10/07, Steven Bosscher [EMAIL PROTECTED] wrote:
> On 2/10/07, Jie Zhang [EMAIL PROTECTED] wrote:
>> The code I posted in my first email is from libgloss/libnosys/_exit.c. It's used to cause an exception deliberately. From your replies, it seems it should find another way to do that.
> Maybe you can use __builtin_trap() ?

The exception generated by __builtin_trap () might already be used by stack limit checking. Reusing it in _exit () might confuse the exception handler, I think.

Jie
Re: Division by zero
On 2/11/07, Paolo Bonzini [EMAIL PROTECTED] wrote:
>> I'm going to use an asm ().
> Yeah, an asm volatile ("" : : "r" (x) : ) should please GCC and still be portable to different platforms.

I was thinking of using a port-specific asm () in each port to raise an exception specific to that port, so that a divide-by-zero exception can be distinguished from a termination exception. However, asm volatile ("" : : "r" (x)) is good for ports that do not provide their own asm. I'll mention it in the email to the newlib mailing list.

Thanks,
Jie
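A minimal sketch of the empty-asm idea, assuming the snippet in this message reads asm volatile ("" : : "r" (x)) and a compiler that supports GNU extended asm (GCC or Clang). The function name divide_kept is hypothetical. The asm emits no instructions, but it names the quotient as an input operand, so the compiler has to compute it even when the caller discards the return value.

```c
/* Hypothetical helper: divide and keep the quotient alive.  The empty
   asm statement takes Q as a register input, which counts as a use of
   the value, so the division cannot be removed as dead code.  */
int divide_kept(int num, int den)
{
    int q = num / den;
    __asm__ volatile ("" : : "r" (q));
    return q;
}
```

As with any division by zero in C, the behavior for den == 0 remains undefined; the asm only ensures that the division itself is emitted.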
Division by zero
Hi,

Division by zero is undefined. We chose to keep it:

http://gcc.gnu.org/ml/gcc-patches/2001-06/msg01068.html

But now gcc seems to optimize it away. For the following function:

$ cat t.c
#include <limits.h>
void foo (int rc)
{
  int x = rc / INT_MAX;
  x = 4 / x;
}
$ gcc -O2 -S t.c
$ cat t.s
	.file	"t.c"
	.text
	.p2align 4,,15
.globl foo
	.type	foo, @function
foo:
	pushl	%ebp
	movl	%esp, %ebp
	popl	%ebp
	ret
	.size	foo, .-foo
	.ident	"GCC: (GNU) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)"
	.section	.note.GNU-stack,"",@progbits

Do we now choose to optimize it away?

Jie
-fvtable-gc
It should have been removed from c.opt by this patch: http://gcc.gnu.org/ml/gcc-patches/2003-07/msg02660.html. But it is still in trunk and in the 3.4/4.0/4.1/4.2 branches.

Jie
Re: apply for the relevant forms
On 6/5/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> Our company has a new 32-bit embedded processor, and we have ported the gcc backend to it (with C/C++ support). Now we want to add its backend code to the gcc packages. I read on the Contributing to GCC pages that we must sign some forms. Can you kindly send the forms to me?

http://gcc.gnu.org/contribute.html says: "It's a good idea to send [EMAIL PROTECTED] a copy of your request." That is not the gcc mailing list. Also, [EMAIL PROTECTED] and gcc@gcc.gnu.org are the same thing, so there is no need to send to both.

Jie
Re: Object size checking builtin test case and uClibc
Jie Zhang wrote:
> Hi,
>
> In gcc.c-torture/execute/builtins/lib/chk.c, vsnprintf () is defined
> using vsprintf (). While vsnprintf () in uClibc is defined using
                          ^ Sorry, should be vsprintf
> vsnprintf (). When testing on uClinux with uClibc, pr23484-chk.c failed
> because these two functions will call each other recursively and
> finally overflow the stack.
>
> How can this problem be fixed? In the test case or in uClibc?

To restate with the correction applied: chk.c defines vsnprintf () using vsprintf (), while uClibc defines vsprintf () using vsnprintf ().

Jie
Re: Object size checking builtin test case and uClibc
On 3/18/06, Jie Zhang [EMAIL PROTECTED] wrote:
> Jie Zhang wrote:
>> Hi,
>>
>> In gcc.c-torture/execute/builtins/lib/chk.c, vsnprintf () is defined
>> using vsprintf (). While vsnprintf () in uClibc is defined using
                           ^ Sorry, should be vsprintf
>> vsnprintf (). When testing on uClinux with uClibc, pr23484-chk.c failed
>> because these two functions will call each other recursively and
>> finally overflow the stack.
>>
>> How can this problem be fixed? In the test case or in uClibc?

I removed snprintf () and vsnprintf () from gcc.c-torture/execute/builtins/lib/chk.c. All the test cases in builtins.exp then pass for the bfin port of gcc 4.1 on uClinux. Can we remove these two non-_chk versions from chk.c and use the ones from the C library?

Thanks,
Jie
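The mutual recursion described in this thread can be modelled with a small, self-contained C sketch. It does not use the real chk.c or uClibc sources: test_style_vsnprintf, libc_style_vsprintf, demo_cycle and DEPTH_LIMIT are all hypothetical names, and the depth guard merely stands in for the stack overflow so that the example terminates.

```c
#include <stdarg.h>
#include <stddef.h>

#define DEPTH_LIMIT 8   /* stand-in for running out of stack */

static int depth;       /* nested calls seen so far */

static int libc_style_vsprintf(char *s, const char *fmt, va_list ap);

/* chk.c style: vsnprintf ignores the size and forwards to vsprintf.  */
static int test_style_vsnprintf(char *s, size_t n, const char *fmt, va_list ap)
{
    (void) n;
    return libc_style_vsprintf(s, fmt, ap);
}

/* uClibc style: vsprintf forwards to vsnprintf with an unlimited size.  */
static int libc_style_vsprintf(char *s, const char *fmt, va_list ap)
{
    if (++depth >= DEPTH_LIMIT)
        return -1;      /* a real build would keep recursing here */
    return test_style_vsnprintf(s, (size_t) -1, fmt, ap);
}

/* Enter the cycle once.  Returns -1 when the guard trips and stores the
   depth that was reached in *REACHED.  Nothing is ever formatted.  */
int demo_cycle(char *s, int *reached, const char *fmt, ...)
{
    va_list ap;
    int r;

    depth = 0;
    va_start(ap, fmt);
    r = test_style_vsnprintf(s, 16, fmt, ap);
    va_end(ap);
    *reached = depth;
    return r;
}
```

Neither function ever formats anything: each only hands the call to the other. Dropping one side of the pair, as proposed for chk.c, breaks the cycle.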
Help needed on libgcc.a
Hi,

I'm adding some assembly floating point functions to the bfin port. These functions are much faster than those in fp-bit.c. However, they relax some IEEE floating point rules for checking inputs against NaN, so I think we had better call them only when -ffast-math or -ffinite-math-only is given.

What I want is to tell gcc to link a specific libgcc-fast-fp.a, which contains these assembly functions, when -ffast-math or -ffinite-math-only is given, and otherwise to link the ordinary libgcc.a. Is there any existing target doing this or something similar?

Thanks,
Jie
Which program can I use to see VCG dumping from GCC
Hi,

This page, http://gcc.gnu.org/news/egcs-vcg.html, says: "If you view these files using a suitable program, you'll get output similar to the following." However, when I use xvcg to view test.c.01.sibling.vcg, xvcg reports errors:

Wait.
Line 5: attribute T_Co_hidden currently not implemented !
...
Line 406: attribute T_Co_hidden currently not implemented !
Segmentation fault

I'm using the latest Ubuntu Dapper. The gcc version is gcc (GCC) 4.0.3 20060115 (prerelease) (Ubuntu 4.0.2-7ubuntu1). test.c is just the example used in the above HTML page. Which other program should I use to view the VCG dump?

Thanks,
Jie
Re: Which program can I use to see VCG dumping from GCC
On 1/26/06, Jie Zhang [EMAIL PROTECTED] wrote:
> Hi,
>
> This page, http://gcc.gnu.org/news/egcs-vcg.html, says: "If you view these files using a suitable program, you'll get output similar to the following." However, when I use xvcg to view test.c.01.sibling.vcg, xvcg reports errors:
>
> Wait.
> Line 5: attribute T_Co_hidden currently not implemented !
> ...
> Line 406: attribute T_Co_hidden currently not implemented !
> Segmentation fault
>
> I'm using the latest Ubuntu Dapper. The gcc version is gcc (GCC) 4.0.3 20060115 (prerelease) (Ubuntu 4.0.2-7ubuntu1). test.c is just the example used in the above HTML page. Which other program should I use to view the VCG dump?

Oops! It seems to be a bug in the xvcg shipped with Ubuntu. I built one from the source package and it works well.

Jie