[BFIN] PR target/49862

2012-03-08 Thread Jie Zhang
Hi,

I have committed this patch on trunk for PR target/49862.

Regards,
Jie

	PR target/49862
	* config/bfin/bfin.c (hwloop_optimize): Fix unused variable
	warnings.
	(hwloop_pattern_reg): Fix set but not used warning.
	(bfin_reorg_loops): Remove unused parameter.
	(bfin_reorg): Update use of bfin_reorg_loops.


Index: config/bfin/bfin.c
===
--- config/bfin/bfin.c	(revision 185124)
+++ config/bfin/bfin.c	(working copy)
@@ -3411,14 +3411,12 @@ static bool
 hwloop_optimize (hwloop_info loop)
 {
   basic_block bb;
-  hwloop_info inner;
   rtx insn, last_insn;
   rtx loop_init, start_label, end_label;
   rtx iter_reg, scratchreg, scratch_init, scratch_init_insn;
   rtx lc_reg, lt_reg, lb_reg;
   rtx seq, seq_end;
   int length;
-  unsigned ix;
   bool clobber0, clobber1;
 
   if (loop->depth > MAX_LOOP_DEPTH)
@@ -3840,12 +3838,11 @@ hwloop_fail (hwloop_info loop)
 static rtx
 hwloop_pattern_reg (rtx insn)
 {
-  rtx pat, reg;
+  rtx reg;
 
   if (!JUMP_P (insn) || recog_memoized (insn) != CODE_FOR_loop_end)
 return NULL_RTX;
 
-  pat = PATTERN (insn);
   reg = SET_DEST (XVECEXP (PATTERN (insn), 0, 1));
   if (!REG_P (reg))
 return NULL_RTX;
@@ -3864,7 +3861,7 @@ static struct hw_doloop_hooks bfin_doloo
hardware loops are generated.  */
 
 static void
-bfin_reorg_loops (FILE *dump_file)
+bfin_reorg_loops (void)
 {
   reorg_loops (true, bfin_doloop_hooks);
 }
@@ -4601,7 +4598,7 @@ bfin_reorg (void)
 
   /* Doloop optimization */
   if (cfun->machine->has_hardware_loops)
-    bfin_reorg_loops (dump_file);
+    bfin_reorg_loops ();
 
   workaround_speculation ();
 


Re: [BFIN] Hookize PREFERRED_RELOAD_CLASS

2012-01-06 Thread Jie Zhang

On 01/06/2012 12:07 PM, Anatoly Sokolov wrote:

Hi, Jie.

On Jan 6, 2012, Jie Zhang jzhang...@gmail.com wrote:


Hi Anatoly,



The patch looks OK.



But I cannot apply your patch by saving your email as a patch file. If
you take a look at this:



I attach the patch.


I can apply the attached patch. OK. Thank you.


Jie



Re: [BFIN] Hookize PREFERRED_RELOAD_CLASS

2012-01-05 Thread Jie Zhang
Hi Anatoly,

The patch looks OK.

But I cannot apply your patch by saving your email as a patch file. If
you take a look at this:

http://gcc.gnu.org/cgi-bin/get-raw-msg?listname=gcc-patches&date=2012-01&msgid=4F05F12F.607%40post.ru

you will find that there is an extra white space before each context
line. But these extra white spaces do not show up in

http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00262.html

while the starting white space of the last line of the patch is missing.

Regards,
Jie


On Thu, Jan 5, 2012 at 1:51 PM, Anatoly Sokolov ae...@post.ru wrote:
  Hi.

  This patch removes the obsolete PREFERRED_RELOAD_CLASS macro from the BFIN
 back end in GCC and introduces the equivalent TARGET_PREFERRED_RELOAD_CLASS
 target hook.

  Compiled. Untested.

  OK to install?

        * config/bfin/bfin.h (PREFERRED_RELOAD_CLASS): Remove.
        * config/bfin/bfin.c (TARGET_PREFERRED_RELOAD_CLASS): Define.
        (bfin_preferred_reload_class): New function.


 Index: gcc/config/bfin/bfin.c
 ===
 --- gcc/config/bfin/bfin.c      (revision 182912)
 +++ gcc/config/bfin/bfin.c      (working copy)
 @@ -2648,6 +2648,19 @@ split_load_immediate (rtx operands[])
   return 0;
  }

 +/* Worker function for TARGET_PREFERRED_RELOAD_CLASS.  */
 +
 +static reg_class_t
 +bfin_preferred_reload_class (rtx x, reg_class_t rclass)
 +{
 +  if (GET_CODE (x) == POST_INC
 +      || GET_CODE (x) == POST_DEC
 +      || GET_CODE (x) == PRE_DEC)
 +    return PREGS;
 +
 +  return rclass;
 +}
 +
  /* Return true if the legitimate memory address for a memory operand of mode
    MODE.  Return false if not.  */

 @@ -5771,6 +5784,9 @@ bfin_conditional_register_usage (void)
  #undef TARGET_RETURN_IN_MEMORY
  #define TARGET_RETURN_IN_MEMORY bfin_return_in_memory

 +#undef  TARGET_PREFERRED_RELOAD_CLASS
 +#define TARGET_PREFERRED_RELOAD_CLASS bfin_preferred_reload_class
 +
  #undef TARGET_LEGITIMATE_ADDRESS_P
  #define TARGET_LEGITIMATE_ADDRESS_P    bfin_legitimate_address_p

 Index: gcc/config/bfin/bfin.h
 ===
 --- gcc/config/bfin/bfin.h      (revision 182912)
 +++ gcc/config/bfin/bfin.h      (working copy)
 @@ -707,16 +707,6 @@ enum reg_class
       && GET_MODE_SIZE (MODE1) <= UNITS_PER_WORD       \
       && GET_MODE_SIZE (MODE2) <= UNITS_PER_WORD))

 -/* `PREFERRED_RELOAD_CLASS (X, CLASS)'
 -   A C expression that places additional restrictions on the register
 -   class to use when it is necessary to copy value X into a register
 -   in class CLASS.  The value is a register class; perhaps CLASS, or
 -   perhaps another, smaller class.  */
 -#define PREFERRED_RELOAD_CLASS(X, CLASS)               \
 -  (GET_CODE (X) == POST_INC                            \
 -   || GET_CODE (X) == POST_DEC                         \
 -   || GET_CODE (X) == PRE_DEC ? PREGS : (CLASS))
 -
  /* Function Calling Conventions. */

  /* The type of the current function; normal functions are of type


 --
 Anatoly.
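
For readers unfamiliar with "hookizing", the following is a minimal,
self-contained sketch of what the patch above does: the same decision is
moved from a preprocessor macro into an ordinary function that can be
installed as a target hook. Everything prefixed with mock_ or MOCK_ is a
hypothetical stand-in invented for this illustration so that the snippet
compiles on its own; none of it is GCC's real rtx or register-class
machinery.

```c
/* Illustration only: mock stand-ins for GCC internals (assumptions,
   not the real definitions from rtl.h or bfin.h).  */
enum mock_rtx_code { MOCK_REG, MOCK_POST_INC, MOCK_POST_DEC,
                     MOCK_PRE_DEC, MOCK_PRE_INC };
enum mock_reg_class { MOCK_NO_REGS, MOCK_DREGS, MOCK_PREGS, MOCK_ALL_REGS };
typedef int mock_reg_class_t;

struct mock_rtx { enum mock_rtx_code code; };
#define MOCK_GET_CODE(X) ((X)->code)

/* Old style: the decision lives in a macro in the target header.  */
#define MOCK_PREFERRED_RELOAD_CLASS(X, CLASS)           \
  (MOCK_GET_CODE (X) == MOCK_POST_INC                   \
   || MOCK_GET_CODE (X) == MOCK_POST_DEC                \
   || MOCK_GET_CODE (X) == MOCK_PRE_DEC ? MOCK_PREGS : (CLASS))

/* New style: the same decision as a function, which a target can
   register through a hook slot instead of a macro.  */
mock_reg_class_t
mock_preferred_reload_class (const struct mock_rtx *x,
                             mock_reg_class_t rclass)
{
  if (MOCK_GET_CODE (x) == MOCK_POST_INC
      || MOCK_GET_CODE (x) == MOCK_POST_DEC
      || MOCK_GET_CODE (x) == MOCK_PRE_DEC)
    return MOCK_PREGS;

  return rclass;
}
```

Both forms return the pointer-register class for the three
auto-modification codes and pass the requested class through otherwise,
so the conversion is intended to be behavior-preserving.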


Re: [BFIN] Hookize REGISTER_MOVE_COST and MEMORY_MOVE_COST

2011-12-23 Thread Jie Zhang
Hi Anatoly,

I cannot apply your patch to a clean tree. I tried to save your email
as a text file, copy from thunderbird, copy from gmail, copy from the
mailing list archive. But none of them works.

Regards,
Jie

2011/12/23 Anatoly Sokolov ae...@post.ru:
  Hi.

  This patch removes the obsolete REGISTER_MOVE_COST and MEMORY_MOVE_COST
 macros from the Blackfin back end in GCC and introduces the equivalent
 TARGET_REGISTER_MOVE_COST and TARGET_MEMORY_MOVE_COST target hooks.

  Untested.

  OK to install?

        * config/bfin/bfin.h (REGISTER_MOVE_COST, MEMORY_MOVE_COST): Remove.
        * config/bfin/bfin-protos.h (bfin_register_move_cost,
        bfin_memory_move_cost): Remove.
        * config/bfin/bfin.c (bfin_register_move_cost,
        bfin_memory_move_cost): Make static. Change arguments type from
        enum reg_class to reg_class_t and from int to bool.
        (TARGET_REGISTER_MOVE_COST, TARGET_MEMORY_MOVE_COST): Define.

 Index: gcc/config/bfin/bfin-protos.h
 ===
 --- gcc/config/bfin/bfin-protos.h       (revision 182658)
 +++ gcc/config/bfin/bfin-protos.h       (working copy)
 @@ -85,9 +85,6 @@ extern bool bfin_longcall_p (rtx, int);
  extern bool bfin_dsp_memref_p (rtx);
  extern bool bfin_expand_movmem (rtx, rtx, rtx, rtx);

 -extern int bfin_register_move_cost (enum machine_mode, enum reg_class,
 -                                   enum reg_class);
 -extern int bfin_memory_move_cost (enum machine_mode, enum reg_class, int in);
  extern enum reg_class secondary_input_reload_class (enum reg_class,
                                                    enum machine_mode,
                                                    rtx);
 Index: gcc/config/bfin/bfin.c
 ===
 --- gcc/config/bfin/bfin.c      (revision 182658)
 +++ gcc/config/bfin/bfin.c      (working copy)
 @@ -2149,12 +2149,11 @@ bfin_vector_mode_supported_p (enum machi
   return mode == V2HImode;
  }

 -/* Return the cost of moving data from a register in class CLASS1 to
 -   one in class CLASS2.  A cost of 2 is the default.  */
 +/* Worker function for TARGET_REGISTER_MOVE_COST.  */

 -int
 +static int
  bfin_register_move_cost (enum machine_mode mode,
 -                        enum reg_class class1, enum reg_class class2)
 +                        reg_class_t class1, reg_class_t class2)
  {
   /* These need secondary reloads, so they're more expensive.  */
   if ((class1 == CCREGS && !reg_class_subset_p (class2, DREGS))
 @@ -2177,18 +2176,16 @@ bfin_register_move_cost (enum machine_mo
   return 2;
  }

 -/* Return the cost of moving data of mode M between a
 -   register and memory.  A value of 2 is the default; this cost is
 -   relative to those in `REGISTER_MOVE_COST'.
 +/* Worker function for TARGET_MEMORY_MOVE_COST.

    ??? In theory L1 memory has single-cycle latency.  We should add a switch
    that tells the compiler whether we expect to use only L1 memory for the
    program; it'll make the costs more accurate.  */

 -int
 +static int
  bfin_memory_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED,
 -                      enum reg_class rclass,
 -                      int in ATTRIBUTE_UNUSED)
 +                      reg_class_t rclass,
 +                      bool in ATTRIBUTE_UNUSED)
  {
   /* Make memory accesses slightly more expensive than any register-register
      move.  Also, penalize non-DP registers, since they need secondary
 @@ -5703,6 +5700,12 @@ bfin_conditional_register_usage (void)
  #undef  TARGET_ADDRESS_COST
  #define TARGET_ADDRESS_COST bfin_address_cost

 +#undef TARGET_REGISTER_MOVE_COST
 +#define TARGET_REGISTER_MOVE_COST bfin_register_move_cost
 +
 +#undef TARGET_MEMORY_MOVE_COST
 +#define TARGET_MEMORY_MOVE_COST bfin_memory_move_cost
 +
  #undef  TARGET_ASM_INTEGER
  #define TARGET_ASM_INTEGER bfin_assemble_integer

 Index: gcc/config/bfin/bfin.h
 ===
 --- gcc/config/bfin/bfin.h      (revision 182658)
 +++ gcc/config/bfin/bfin.h      (working copy)
 @@ -975,29 +975,6 @@ typedef struct {
  /* Do not put function addr into constant pool */
  #define NO_FUNCTION_CSE 1

 -/* A C expression for the cost of moving data from a register in class FROM to
 -   one in class TO.  The classes are expressed using the enumeration values
 -   such as `GENERAL_REGS'.  A value of 2 is the default; other values are
 -   interpreted relative to that.
 -
 -   It is not required that the cost always equal 2 when FROM is the same as TO;
 -   on some machines it is expensive to move between registers if they are not
 -   general registers.  */
 -
 -#define REGISTER_MOVE_COST(MODE, CLASS1, CLASS2) \
 -   bfin_register_move_cost ((MODE), (CLASS1), (CLASS2))
 -
 -/* A C expression for the cost of moving data of mode M between a
 -   register and memory.  A value of 2 is the default; this cost is
 -   relative to those in `REGISTER_MOVE_COST'.
 -
 -  

Re: [RFC] Cleanup DW_CFA_GNU_args_size handling

2011-12-20 Thread Jie Zhang
Hi,

On Tue, Aug 2, 2011 at 6:32 PM, Richard Henderson r...@redhat.com wrote:
 I got Jeff Law to review the reload change on IRC
 and committed the composite patch.

 Tested on x86_64, i586, avr, and h8300.  Most other
 tier1 targets ought not be affected, as this patch
 only applies to ACCUMULATE_OUTGOING_ARGS == 0 targets.

This commit may have caused

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51552


Regards,
Jie


Re: Ping: viewvc: python: RuntimeError: maximum recursion limit exceeded

2011-09-04 Thread Jie Zhang
On Sun, Sep 4, 2011 at 3:07 PM, Georg-Johann Lay a...@gjlay.de wrote:
 Hi, I'm getting the following error in viewvc for several days now:

 http://gcc.gnu.org/viewcvs/trunk/gcc/dse.c?view=markup

 An Exception Has Occurred
 Python Traceback

 RuntimeError: maximum recursion limit exceeded

I reported a similar issue one year ago, but no one was interested in fixing it.

http://gcc.gnu.org/ml/gcc/2010-04/msg00943.html

So I just did an rsync of the GCC SVN repository and installed ViewVC on
my PC. It works fine.


Jie


Re: __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ or does it?

2011-08-06 Thread Jie Zhang
On Sat, Aug 6, 2011 at 5:40 PM, Christopher Huang-Leaver
zeong...@googlemail.com wrote:
 Output:

 small end first
 big end first

 gcc -v
 gcc version 4.4.5 (Gentoo 4.4.5 p1.2, pie-0.4.5)

I got the same result with g++-4.4 (4.4.6), g++-4.5 (4.5.3) on Debian
testing. But with g++-4.6, I got

small end first

on my x86_64-linux-gnu machine. I think it's a bug, but it has been
fixed in g++-4.6.

Regards,
Jie


Re: __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ or does it?

2011-08-06 Thread Jie Zhang
On Sat, Aug 6, 2011 at 9:35 PM, Jonathan Wakely jwakely@gmail.com wrote:
 On 6 August 2011 22:40, Christopher Huang-Leaver wrote:
 Hello,

 This isn't really a compiler bug, but it's something which the manual
 doesn't describe too well so I thought I would point this out.

 This page of the manual:
 http://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html#Common-Predefined-Macros

 That documentation refers to the latest sources in GCC trunk, not to GCC 4.4

Ha, so it's not a bug. It's a new feature, which didn't exist before 4.6.

Jie
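
The test program from the original report is not shown in this thread, so
the following is only a sketch of the likely mechanism. In a #if
expression, an identifier that is not defined as a macro evaluates to 0.
On a compiler that does not predefine __BYTE_ORDER__,
__ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, an unguarded comparison
therefore becomes 0 == 0 for both the little-endian and the big-endian
test, which would print both messages. Guarding with defined() avoids
that. The function names below are made up for this example.

```c
/* Returns 1 for little endian, 2 for big endian, and 0 when the
   compiler does not provide the byte-order macros (or reports some
   other order).  */
int
macro_byte_order (void)
{
#if defined (__BYTE_ORDER__) && defined (__ORDER_LITTLE_ENDIAN__) \
    && defined (__ORDER_BIG_ENDIAN__)
# if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return 1;
# elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  return 2;
# else
  return 0;
# endif
#else
  /* Macros missing: do not guess from an unguarded comparison.  */
  return 0;
#endif
}

/* Run-time cross-check that does not depend on any predefined macro:
   inspect the first byte of an integer whose value is 1.  */
int
runtime_is_little_endian (void)
{
  unsigned int probe = 1;
  return *(const unsigned char *) &probe == 1;
}
```

When macro_byte_order () returns 0, falling back to the run-time probe
(or to a build-system check) is one option on older compilers.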


Update my email address

2011-04-21 Thread Jie Zhang

Hi,

I have committed this patch to update my email address.

Jie
2011-04-21  Jie Zhang  jzhang...@gmail.com

	* MAINTAINERS: Update my email address.

Index: MAINTAINERS
===
--- MAINTAINERS	(revision 172853)
+++ MAINTAINERS	(working copy)
@@ -49,7 +49,7 @@
 avr port		Anatoly Sokolov		ae...@post.ru
 avr port		Eric Weddington		eric.wedding...@atmel.com
 bfin port		Bernd Schmidt		ber...@codesourcery.com
-bfin port		Jie Zhang		j...@codesourcery.com
+bfin port		Jie Zhang		jzhang...@gmail.com
 cris port		Hans-Peter Nilsson	h...@axis.com
 fr30 port		Nick Clifton		ni...@redhat.com
 frv port		Nick Clifton		ni...@redhat.com


Re: [PATCH] use build_function_type_list in the bfin backend

2011-04-21 Thread Jie Zhang

On 04/20/2011 03:24 PM, Nathan Froyd wrote:

As $SUBJECT suggests.  Tested with cross to bfin-elf.  OK to commit?


OK. Thanks!

Jie


-Nathan

* config/bfin/bfin.c (bfin_init_builtins): Call
build_function_type_list instead of build_function_type.

diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index 5d08437..03a833d 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -5967,7 +5967,7 @@ bfin_init_builtins (void)
  {
tree V2HI_type_node = build_vector_type_for_mode (intHI_type_node, 
V2HImode);
tree void_ftype_void
-= build_function_type (void_type_node, void_list_node);
+= build_function_type_list (void_type_node, NULL_TREE);
tree short_ftype_short
  = build_function_type_list (short_integer_type_node, 
short_integer_type_node,
NULL_TREE);




Re: [ARM] [3/3] Implement TARGET_BUILTIN_DECL

2011-04-21 Thread Jie Zhang

Thank you for reviewing, updating and committing this patch set!

Jie

On 04/18/2011 10:04 AM, Richard Earnshaw wrote:


On Mon, 2010-10-11 at 15:44 +0800, Jie Zhang wrote:

This patch implements TARGET_BUILTIN_DECL for ARM. With the changes of
the previous two patches, this one is straightforward.

Is it OK?



Sorry for the long time reviewing this set of patches.  I've just
tweaked it to bring it up to the current code base and committed it.
It's largely unchanged from your submission apart from:

1) Updates to incorporate latest changes made by Richard Sandiford.
2) Minor tweak to simplify the iWMMXT builtins initialization.

R.

2011-04-18  Jie Zhang  j...@codesourcery.com
Richard Earnshaw  rearn...@arm.com

* arm.c (neon_builtin_type_bits): Remove.
(typedef enum neon_builtin_mode): New.
(T_MAX): Don't define.
(typedef enum neon_builtin_datum): Remove bits, codes[],
num_vars and base_fcode.  Add mode, code and fcode.
(VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9,
VAR10): Change accordingly.
(neon_builtin_data[]): Change accordingly.
(arm_init_neon_builtins): Change accordingly.
(neon_builtin_compare): Remove.
(locate_neon_builtin_icode): Remove.
(arm_expand_neon_builtin): Change accordingly.

* arm.h (enum arm_builtins): Move to ...
* arm.c (enum arm_builtins): ... here; and rearrange builtin code.

* arm.c (arm_builtin_decl): Declare.
(TARGET_BUILTIN_DECL): Define.
(enum arm_builtins): Correct ARM_BUILTIN_MAX.
(arm_builtin_decls[]): New.
(arm_init_neon_builtins): Store builtin declarations in
arm_builtin_decls[].
(arm_init_tls_builtins): Likewise.
(arm_init_iwmmxt_builtins): Likewise.  Refactor initialization code.
(arm_builtin_decl): New.





Re: Find a new maintainer for option handling?

2011-02-21 Thread Jie Zhang
Hi,

Any news about this?

Regards,
Jie

On Tue, Jan 25, 2011 at 2:34 AM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Mon, 17 Jan 2011, Gerald Pfeifer wrote:

 On Wed, 12 Jan 2011, Jie Zhang wrote:
  I agree. I think Joseph is the best candidate for the maintainer of the
  option handling since he made the most changes of gcc/opts-common.c. He
  is already the maintainer of the driver. If we unify these two
  maintainerships, we save one line of MAINTAINERS. :-)

 I am not so much concerned about that one line in MAINTAINERS, more
 finding someone who is willing to take on the role.  I, too, think
 Joseph would be a great candidate, but it's his call whether he wants
 to. ;-)  (I'll be happy to raise it on the SC in case.)

 I am willing to be considered for option handling maintainership or
 reviewership.

 --
 Joseph S. Myers
 jos...@codesourcery.com



Re: Find a new maintainer for option handling?

2011-02-21 Thread Jie Zhang
Sorry, I just noticed that Joseph has been listed as the maintainer of 
option handling.


Jie

On 02/21/2011 11:56 PM, Jie Zhang wrote:

Hi,

Any news about this?

Regards,
Jie

On Tue, Jan 25, 2011 at 2:34 AM, Joseph S. Myers
jos...@codesourcery.com  wrote:

On Mon, 17 Jan 2011, Gerald Pfeifer wrote:


On Wed, 12 Jan 2011, Jie Zhang wrote:

I agree. I think Joseph is the best candidate for the maintainer of the
option handling since he made the most changes of gcc/opts-common.c. He
is already the maintainer of the driver. If we unify these two
maintainerships, we save one line of MAINTAINERS. :-)


I am not so much concerned about that one line in MAINTAINERS, more
finding someone who is willing to take on the role.  I, too, think
Joseph would be a great candidate, but it's his call whether he wants
to. ;-)  (I'll be happy to raise it on the SC in case.)


I am willing to be considered for option handling maintainership or
reviewership.

--
Joseph S. Myers
jos...@codesourcery.com




--
Jie Zhang



Re: Find a new maintainer for option handling?

2011-01-16 Thread Jie Zhang

Dear Steering Committee:

Is unifying driver and option handling maintainership a good idea?

On 01/12/2011 06:14 PM, Jie Zhang wrote:

On 01/12/2011 06:07 PM, Richard Guenther wrote:

On Wed, Jan 12, 2011 at 4:10 AM, Jie Zhang j...@codesourcery.com wrote:

Dear Steering Committee:

The current listed maintainer for option handling is:

option handling Neil Booth n...@daikokuya.co.uk

But I'm wondering if Neil is still active. There are no replies to my recent
pings from that email address. The last recorded commit from him in GCC was
on 2005-01-19, which was nearly 6 years ago. So I guess he might have not
worked on GCC. If this is true, how about assigning a new maintainer for
option handling?


Option handling maintainership should be unified with driver maintainership
IMNSHO, as they are closely related.


I agree. I think Joseph is the best candidate for the maintainer of the
option handling since he made the most changes of gcc/opts-common.c. He
is already the maintainer of the driver. If we unify these two
maintainerships, we save one line of MAINTAINERS. :-)



Regards,
--
Jie Zhang



Re: Find a new maintainer for option handling?

2011-01-16 Thread Jie Zhang

On 01/17/2011 10:35 AM, Gerald Pfeifer wrote:

On Wed, 12 Jan 2011, Jie Zhang wrote:

I agree. I think Joseph is the best candidate for the maintainer of the
option handling since he made the most changes of gcc/opts-common.c. He
is already the maintainer of the driver. If we unify these two
maintainerships, we save one line of MAINTAINERS. :-)


I am not so much concerned about that one line in MAINTAINERS, more


Saving one line is just my stupid joke. :-P


finding someone who is willing to take on the role.  I, too, think
Joseph would be a great candidate, but it's his call whether he wants
to. ;-)  (I'll be happy to raise it on the SC in case.)


Thanks!

--
Jie Zhang



Re: Find a new maintainer for option handling?

2011-01-12 Thread Jie Zhang

On 01/12/2011 06:07 PM, Richard Guenther wrote:

On Wed, Jan 12, 2011 at 4:10 AM, Jie Zhang j...@codesourcery.com wrote:

Dear Steering Committee:

The current listed maintainer for option handling is:

option handling Neil Booth  n...@daikokuya.co.uk

But I'm wondering if Neil is still active. There are no replies to my recent
pings from that email address. The last recorded commit from him in GCC was
on 2005-01-19, which was nearly 6 years ago. So I guess he might have not
worked on GCC. If this is true, how about assigning a new maintainer for
option handling?


Option handling maintainership should be unified with driver maintainership
IMNSHO, as they are closely related.

I agree. I think Joseph is the best candidate for the maintainer of the 
option handling since he made the most changes of gcc/opts-common.c. He 
is already the maintainer of the driver. If we unify these two 
maintainerships, we save one line of MAINTAINERS. :-)



Regards,
--
Jie Zhang



Find a new maintainer for option handling?

2011-01-11 Thread Jie Zhang

Dear Steering Committee:

The current listed maintainer for option handling is:

option handling Neil Booth  n...@daikokuya.co.uk

But I'm wondering if Neil is still active. There are no replies to my 
recent pings from that email address. The last recorded commit from him 
in GCC was on 2005-01-19, which was nearly 6 years ago. So I guess he 
might have not worked on GCC. If this is true, how about assigning a new 
maintainer for option handling?



Regards,
--
Jie Zhang



Re: Behavior change of driver on multiple input assembly files

2011-01-04 Thread Jie Zhang

On 01/04/2011 07:33 AM, Ian Lance Taylor wrote:

On Thu, Dec 30, 2010 at 9:07 PM, Jie Zhang j...@codesourcery.com wrote:


For a minimal fix, I propose to change combinable fields of assembly
languages in default_compilers[] to 0. See the attached patch
gcc-not-combine-assembly-inputs.diff. I don't know why the combinable
fields were set to 1 when --combine option was introduced. There is no
explanation about that in that patch email.[2] Does anyone still remember?


This patch is OK if it fixes PR 47137.  Please mention the PR in the
ChangeLog entry.

Thanks. I have committed it now. I also posted it to gcc-patches mailing 
list with an updated ChangeLog entry:


http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00122.html

--
Jie Zhang



Re: Behavior change of driver on multiple input assembly files

2010-12-31 Thread Jie Zhang

On 12/31/2010 01:07 PM, Jie Zhang wrote:

I just found a behavior change of driver on multiple input assembly
files. Previously (before r164357), for the command line

gcc -o t t1.s t2.s

, the driver called the assembler twice, once for t1.s and once for t2.s.
After r164357, the driver calls the assembler only once for both t1.s and
t2.s. Then if t1.s and t2.s define the same symbol, the assembler reports
an error, like:

t2.s: Assembler messages:
t2.s:1: Error: symbol `.L1' is already defined

I read the discussion on the mailing list starting with the patch email of
r164357.[1] It seems that this behavior change was not the intention of
that patch. And I think the previous behavior is more useful than the
current behavior. So it would be good to restore the previous behavior,
wouldn't it?

For a minimal fix, I propose to change combinable fields of assembly
languages in default_compilers[] to 0. See the attached patch
gcc-not-combine-assembly-inputs.diff. I don't know why the combinable
fields were set to 1 when --combine option was introduced. There is no
explanation about that in that patch email.[2] Does anyone still remember?

For an aggressive fix, how about removing the combinable field from
struct compiler? If we change combinable fields of assembly languages
in default_compilers[] to 0, only .go and @cpp-output set combinable
to 1. I don't see any reason for difference between @cpp-output and
.i. So if we can set combinable to 0 for .go, we have 0 for all
compilers in default_compilers[], thus we can remove that field. Is
there a reason to set 1 for .go?

I also attached the aggressive patch gcc-remove-combinable-field.diff.
Neither patch is tested. Which way should we go?

The minimal fix has no regressions. But the aggressive one has a lot of 
regressions.



[1] http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01322.html
[2] http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01880.html


Regards,



--
Jie Zhang



Behavior change of driver on multiple input assembly files

2010-12-30 Thread Jie Zhang
I just found a behavior change of driver on multiple input assembly 
files. Previously (before r164357), for the command line


gcc -o t t1.s t2.s

, the driver called the assembler twice, once for t1.s and once for t2.s.
After r164357, the driver calls the assembler only once for both t1.s and
t2.s. Then if t1.s and t2.s define the same symbol, the assembler reports
an error, like:


t2.s: Assembler messages:
t2.s:1: Error: symbol `.L1' is already defined

I read the discussion on the mailing list starting with the patch email of
r164357.[1] It seems that this behavior change was not the intention of
that patch. And I think the previous behavior is more useful than the
current behavior. So it would be good to restore the previous behavior,
wouldn't it?


For a minimal fix, I propose to change combinable fields of assembly 
languages in default_compilers[] to 0. See the attached patch 
gcc-not-combine-assembly-inputs.diff. I don't know why the combinable 
fields were set to 1 when --combine option was introduced. There is no 
explanation about that in that patch email.[2] Does anyone still remember?


For an aggressive fix, how about removing the combinable field from 
struct compiler? If we change combinable fields of assembly languages 
in default_compilers[] to 0, only .go and @cpp-output set combinable 
to 1. I don't see any reason for difference between @cpp-output and 
.i. So if we can set combinable to 0 for .go, we have 0 for all 
compilers in default_compilers[], thus we can remove that field. Is 
there a reason to set 1 for .go?


I also attached the aggressive patch gcc-remove-combinable-field.diff.
Neither patch is tested. Which way should we go?


[1] http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01322.html
[2] http://gcc.gnu.org/ml/gcc-patches/2004-03/msg01880.html


Regards,
--
Jie Zhang



	* gcc.c (default_compilers[]): Set combinable field to 0
	for all assembly languages.

Index: gcc.c
===
--- gcc.c	(revision 168362)
+++ gcc.c	(working copy)
@@ -935,11 +935,11 @@ static const struct compiler default_com
   {".i", "@cpp-output", 0, 0, 0},
   {"@cpp-output",
    "%{!M:%{!MM:%{!E:cc1 -fpreprocessed %i %(cc1_options) %{!fsyntax-only:%(invoke_as)}}}}", 0, 1, 0},
-  {".s", "@assembler", 0, 1, 0},
+  {".s", "@assembler", 0, 0, 0},
   {"@assembler",
-   "%{!M:%{!MM:%{!E:%{!S:as %(asm_debug) %(asm_options) %i %A }}}}", 0, 1, 0},
-  {".sx", "@assembler-with-cpp", 0, 1, 0},
-  {".S", "@assembler-with-cpp", 0, 1, 0},
+   "%{!M:%{!MM:%{!E:%{!S:as %(asm_debug) %(asm_options) %i %A }}}}", 0, 0, 0},
+  {".sx", "@assembler-with-cpp", 0, 0, 0},
+  {".S", "@assembler-with-cpp", 0, 0, 0},
   {"@assembler-with-cpp",
 #ifdef AS_NEEDS_DASH_FOR_PIPED_INPUT
    "%(trad_capable_cpp) -lang-asm %(cpp_options) -fno-directives-only\
@@ -952,7 +952,7 @@ static const struct compiler default_com
   %{!M:%{!MM:%{!E:%{!S:-o %|.s |\n\
       as %(asm_debug) %(asm_options) %m.s %A }}}}"
 #endif
-   , 0, 1, 0},
+   , 0, 0, 0},
 
 #include "specs.h"
   /* Mark end of table.  */
Index: gcc.c
===
--- gcc.c	(revision 168362)
+++ gcc.c	(working copy)
@@ -847,8 +847,6 @@ struct compiler
   const char *cpp_spec; /* If non-NULL, substitute this spec
    for `%C', rather than the usual
    cpp_spec.  */
-  const int combinable;  /* If nonzero, compiler can deal with
-multiple source files at once (IMA).  */
   const int needs_preprocessing; /* If nonzero, source files need to
 be run through a preprocessor.  */
 };
@@ -876,29 +874,29 @@ static const struct compiler default_com
  were not present when we built the driver, we will hit these copies
  and be given a more meaningful error than file not used since
  linking is not done.  */
-  {".m",  "#Objective-C", 0, 0, 0}, {".mi",  "#Objective-C", 0, 0, 0},
-  {".mm", "#Objective-C++", 0, 0, 0}, {".M", "#Objective-C++", 0, 0, 0},
-  {".mii", "#Objective-C++", 0, 0, 0},
-  {".cc", "#C++", 0, 0, 0}, {".cxx", "#C++", 0, 0, 0},
-  {".cpp", "#C++", 0, 0, 0}, {".cp", "#C++", 0, 0, 0},
-  {".c++", "#C++", 0, 0, 0}, {".C", "#C++", 0, 0, 0},
-  {".CPP", "#C++", 0, 0, 0}, {".ii", "#C++", 0, 0, 0},
-  {".ads", "#Ada", 0, 0, 0}, {".adb", "#Ada", 0, 0, 0},
-  {".f", "#Fortran", 0, 0, 0}, {".F", "#Fortran", 0, 0, 0},
-  {".for", "#Fortran", 0, 0, 0}, {".FOR", "#Fortran", 0, 0, 0},
-  {".ftn", "#Fortran", 0, 0, 0}, {".FTN", "#Fortran", 0, 0, 0},
-  {".fpp", "#Fortran", 0, 0, 0}, {".FPP", "#Fortran", 0, 0, 0},
-  {".f90", "#Fortran", 0, 0, 0}, {".F90", "#Fortran", 0, 0, 0},
-  {".f95", "#Fortran", 0, 0, 0}, {".F95", "#Fortran", 0, 0, 0},
-  {".f03", "#Fortran", 0, 0, 0}, {".F03", "#Fortran", 0, 0, 0},
-  {".f08", "#Fortran", 0, 0, 0}, {".F08", "#Fortran", 0, 0, 0},
-  {".r", "#Ratfor", 0, 0, 0},
-  {".p", "#Pascal", 0, 0, 0}, {".pas", "#Pascal", 0, 0, 0},
-  {".java", "#Java", 0, 0, 0}, {".class", "#Java", 0, 0, 0},
-  {".zip", "#Java", 0, 0, 0}, {".jar", "#Java", 0, 0, 0},
-  {".go", "#Go", 0, 1, 0},
+  {".m",  "#Objective-C", 0, 0}, {".mi",  "#Objective-C", 0, 0},
+  {".mm", "#Objective-C++", 0, 0}, {".M", "#Objective-C++", 0, 0},
+  {".mii

Re: Question on ARM legitimate address for DImode

2010-12-21 Thread Jie Zhang

On 12/21/2010 06:12 PM, Richard Earnshaw wrote:


On Tue, 2010-12-21 at 12:12 +0800, Jie Zhang wrote:

Hi,

While working on a bug, I found some code in ARM port that I don't
understand.

In ARM_LEGITIMIZE_RELOAD_ADDRESS and arm_legitimize_address, we allow a
very small offset for DImode addressing.

In ARM_LEGITIMIZE_RELOAD_ADDRESS:

if (MODE == DImode || (MODE == DFmode && TARGET_SOFT_FLOAT)) \
  low = ((val & 0xf) ^ 0x8) - 0x8;   \

In arm_legitimize_address

/* VFP addressing modes actually allow greater offsets, but for
   now we just stick with the lowest common denominator.  */
if (mode == DImode
    || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode))
  {
    low_n = n & 0x0f;
    n &= ~0x0f;
    if (low_n > 4)
      {
        n += 16;
        low_n -= 16;
      }
  }

AFAIK, we could use two LDRs, or one LDRD, or one VLDR to access DImode
in memory when the address is in the form of (REG + CONST_INT). The
offset ranges for these three cases are:

LDR  -4095,4091
LDRD -255,255
VLDR -1020,1020  (ADDR & 3) == 0


The original code was designed to exploit LDM(IA,IB,DB,DA) which would
have the offset ranges described.  On earlier ARM chips (certainly up to
and including ARM7TDMI) it was a significant win to do it that way (add
a constant to the address register and then use LDM was faster than two
LDR instructions).

That's no longer true on modern chips; LDM is often slower than
individual LDR insns now.

Thanks! Now I see. So I think the original code is still needed but 
should be used only for such earlier ARM chips. I will send the updated 
patch to gcc-patches mailing list.



--
Jie Zhang


Question on ARM legitimate address for DImode

2010-12-20 Thread Jie Zhang

Hi,

While working on a bug, I found some code in ARM port that I don't 
understand.


In ARM_LEGITIMIZE_RELOAD_ADDRESS and arm_legitimize_address, we allow a 
very small offset for DImode addressing.


In ARM_LEGITIMIZE_RELOAD_ADDRESS:

if (MODE == DImode || (MODE == DFmode && TARGET_SOFT_FLOAT)) \
  low = ((val & 0xf) ^ 0x8) - 0x8;   \

In arm_legitimize_address

  /* VFP addressing modes actually allow greater offsets, but for
     now we just stick with the lowest common denominator.  */
  if (mode == DImode
      || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode))
    {
      low_n = n & 0x0f;
      n &= ~0x0f;
      if (low_n > 4)
        {
          n += 16;
          low_n -= 16;
        }
    }

AFAIK, we could use two LDRs, or one LDRD, or one VLDR to access DImode 
in memory when the address is in the form of (REG + CONST_INT). The 
offset ranges for these three cases are:


LDR  -4095,4091
LDRD -255,255
VLDR -1020,1020  (ADDR & 3) == 0

so the lowest common denominator is

-1020,1020  (ADDR & 3) == 0 if ! TARGET_LDRD
 -255,255   (ADDR & 3) == 0 if TARGET_LDRD

Both are much larger than what we have now in the ARM port.

Did I miss some other cases? Those two pieces of code are rather old 
(more than 15 years). The main code was added by


svn: revision 7536 by erich, Thu Jun 23 16:02:41 1994 UTC in arm.h
git: fac435147512513c1b8fa55bee061c8e3a767ba9
log: (LEGITIMIZE_ADDRESS): Push constants that will never be legitimate
-- symbols and labels -- into registers.  Handle DImode better.

I checked out that revision to take a look but didn't find an obvious 
reason for such a small index range. Did I miss something tricky?


If there is nothing I missed, I'd like to propose the attached patch.


Regards,
--
Jie Zhang

Index: config/arm/arm.c
===
--- config/arm/arm.c	(revision 168085)
+++ config/arm/arm.c	(working copy)
@@ -6221,13 +6221,9 @@ arm_legitimize_address (rtx x, rtx orig_
 	  if (mode == DImode
 	  || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode))
 	{
-	  low_n = n & 0x0f;
-	  n &= ~0x0f;
-	  if (low_n > 4)
-		{
-		  n += 16;
-		  low_n -= 16;
-		}
+	  HOST_WIDE_INT mask = (TARGET_LDRD ? 0xfc : 0x3fc);
+	  low_n = (n >= 0 ? (n & mask) : -((-n) & mask));
+	  n -= low_n;
 	}
 	  else
 	{
Index: config/arm/arm.h
===
--- config/arm/arm.h	(revision 168085)
+++ config/arm/arm.h	(working copy)
@@ -1283,7 +1283,12 @@ enum reg_class
 	  HOST_WIDE_INT low, high;	   \
 	   \
 	  if (MODE == DImode || (MODE == DFmode && TARGET_SOFT_FLOAT))	   \
-	low = ((val & 0xf) ^ 0x8) - 0x8;   \
+	{   \
+	  /* VFP addressing modes actually allow greater offsets, but for \
+		 now we just stick with the lowest common denominator.  */ \
+	  HOST_WIDE_INT mask = (TARGET_LDRD ? 0xfc : 0x3fc);	   \
+	  low = (val >= 0 ? (val & mask) : -((-val) & mask));  \
+	}   \
 	  else if (TARGET_MAVERICK && TARGET_HARD_FLOAT)		   \
 	/* Need to be careful, -256 is not a valid offset.  */	   \
 	low = val >= 0 ? (val & 0xff) : -((-val) & 0xff);		   \


Re: Questions about selective scheduler and PowerPC

2010-10-22 Thread Jie Zhang

On 10/23/2010 01:50 AM, Pat Haugen wrote:

On 10/20/2010 7:48 PM, Jie Zhang wrote:

Running CPU2006, with the hack removed I see about a 1% improvement in
specint (10% in 456.hmmer, a couple others in the 3% range, -3%
401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
degradation in 435.gromacs). But 454.calculix also fails for me (output
miscompare), so assume we're generating incorrect code for some reason
with the hack removed.


Thanks for benchmarking! Since there is a bug in max_issue, issue_rate
is not really honored. Could you try this patch

http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html

with and without the hack?



With your patch applied I see pretty similar results as before, except
for a couple additional specint benchmarks that degraded a couple
percent with the hack removed.


Thanks for testing! Seems rs6000 port still has to keep that hack for now.

--
Jie Zhang
CodeSourcery


Re: Questions about selective scheduler and PowerPC

2010-10-20 Thread Jie Zhang

On 10/21/2010 04:08 AM, Pat Haugen wrote:

On 10/18/2010 10:33 AM, Jeff Law wrote:

On 10/18/10 09:22, David Edelsohn wrote:

On Mon, Oct 18, 2010 at 8:27 AM, Nathan
Froydfroy...@codesourcery.com wrote:

On Mon, Oct 18, 2010 at 02:49:21PM +0800, Jie Zhang wrote:

3. The aforementioned rs6000 hack rs6000_issue_rate was added by

2003-03-03 David Edelsohnedels...@gnu.org

* config/rs6000/rs6000.c (rs6000_multipass_dfa_lookahead): Delete.
(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Delete.
(rs6000_variable_issue): Do not return negative value.
(rs6000_issue_rate): Uniformly set issue rate to 1 for first
scheduling pass.

, which was more than 7 years ago. Is this still needed now?

I asked David about this on IRC several days ago. He indicated that it
was necessary to prevent the first scheduling pass from unnecessarily
increasing register pressure. I don't know whether anybody has actually
tested it with recent GCC, though presumably it did help when it was
installed.

I am not sure when it last was re-checked, but it was checked after
sched_pressure was added. When that option is not enabled, the
issue_rate change still helped.

Did anyone check this after Bernd's work to better handle allocation
of double-word pseudos in IRA? That code should be handling the false
conflicts created by movement of clobbers.


Running CPU2006, with the hack removed I see about a 1% improvement in
specint (10% in 456.hmmer, a couple others in the 3% range, -3%
401.bzip2) and a 1% degradation in specfp (mainly due to a 13%
degradation in 435.gromacs). But 454.calculix also fails for me (output
miscompare), so assume we're generating incorrect code for some reason
with the hack removed.

Thanks for benchmarking! Since there is a bug in max_issue, issue_rate 
is not really honored. Could you try this patch


http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01719.html

with and without the hack?


Regards,
--
Jie Zhang
CodeSourcery


Re: Questions about selective scheduler and PowerPC

2010-10-19 Thread Jie Zhang

On 10/18/2010 03:41 PM, Andrey Belevantsev wrote:

On 18.10.2010 11:31, Jie Zhang wrote:

Hi Andrey,

On 10/18/2010 03:13 PM, Andrey Belevantsev wrote:

Hi Jie,

On 18.10.2010 10:49, Jie Zhang wrote:


When this error happens, FENCE_ISSUED_INSNS (fence) is 2 and
issue_rate is
1. PowerPC 8540 is capable to issue 2 instructions in one cycle, but
rs6000_issue_rate lies to scheduler that it can only issue 1
instruction
before register relocation is done. See the following code:


See PR 45352. I've tried to fix this in the selective scheduler by
modeling the lying behavior in line with the haifa scheduler. Let me
know if the last patch from the PR audit trail doesn't work for you.

In addition, after the above patch goes in, I can make the selective
scheduler not try to jump through the hoops with putting correct sched
cycles on insns for targets which don't need it in their target_finish
hook. I guess powerpc needs this though, but x86-64 (for which PR 45342
was opened) almost surely does not.


Thanks for your reply. I just tried. That patch does not help for this
issue.

I see, I didn't touch the failing assert with the patch. Can you just
remove the assert and see if that helps for you? I cannot think of how
it can be relaxed and still be useful.

Removing the failing assert fixes the test case. But I wonder why not 
just get max_issue correct. I'm testing the attached patch. IMHO, 
max_issue looks confusing.


 * The concept of ISSUE POINT has never been used since the code landed 
in the repository.


 * The comment just before the function says that MAX_POINTS is the sum 
of points of all instructions in READY. But that does not match the 
code, which only sums the points of the first MORE_ISSUE instructions. 
If ISSUE_POINTS later becomes non-uniform, that piece of code should be 
redesigned.


So I think it's good to remove it now. And top - choice_stack is a 
good replacement for top->n. So we can remove field n from struct 
choice_entry, too.
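
As a rough illustration of why the separate counter is redundant when every instruction is worth one point, the following C sketch derives the number of issued insns from the stack depth alone. It is a simplified stand-in, not the haifa-sched.c code: the struct is cut down to two fields, and push_choice and issued_count are hypothetical helpers.

```c
/* Cut-down version of the choice stack entry, for illustration only.  */
struct choice_entry
{
  int index;   /* Index of the insn chosen at this level.  */
  int rest;    /* Number of remaining insns still to try.  */
};

/* Entry 0 is the bottom of the stack and represents "nothing issued".  */
static struct choice_entry choice_stack[16];

/* Record one more issued insn and return the new top of the stack.  */
static struct choice_entry *
push_choice (struct choice_entry *top, int index, int rest)
{
  top++;
  top->index = index;
  top->rest = rest;
  return top;
}

/* Number of insns issued on the current path: just the stack depth.  */
static int
issued_count (const struct choice_entry *top)
{
  return (int) (top - choice_stack);
}
```

Each push moves top one entry up and each backtrack moves it one entry down, so the pointer difference always equals the count that field n would have carried.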


Now I'm looking at the MIPS target to find out why this change would 
cause PR37360.


   /* ??? We used to assert here that we never issue more insns than issue_rate.
      However, some targets (e.g. MIPS/SB1) claim lower issue rate than can be
      achieved to get better performance.  Until these targets are fixed to use
      scheduler hooks to manipulate insns priority instead, the assert should
-     be disabled.
-
-     gcc_assert (more_issue >= 0);  */
+     be disabled.  */


--
Jie Zhang
CodeSourcery

	* haifa-sched.c (ISSUE_POINTS): Remove.
	(struct choice_entry): Remove field n.
	(max_issue): Don't issue more than issue_rate instructions.

Index: haifa-sched.c
===
--- haifa-sched.c	(revision 165642)
+++ haifa-sched.c	(working copy)
@@ -199,10 +199,6 @@ struct common_sched_info_def *common_sch
 /* The minimal value of the INSN_TICK of an instruction.  */
 #define MIN_TICK (-max_insn_queue_index)
 
-/* Issue points are used to distinguish between instructions in max_issue ().
-   For now, all instructions are equally good.  */
-#define ISSUE_POINTS(INSN) 1
-
 /* List of important notes we must keep around.  This is a pointer to the
last element in the list.  */
 rtx note_list;
@@ -2401,8 +2397,6 @@ struct choice_entry
   int index;
   /* The number of the rest insns whose issues we should try.  */
   int rest;
-  /* The number of issued essential insns.  */
-  int n;
   /* State after issuing the insn.  */
   state_t state;
 };
@@ -2444,8 +2438,7 @@ static int cached_issue_rate = 0;
insns is insns with the best rank (the first insn in READY).  To
make this function tries different samples of ready insns.  READY
is current queue `ready'.  Global array READY_TRY reflects what
-   insns are already issued in this try.  MAX_POINTS is the sum of points
-   of all instructions in READY.  The function stops immediately,
+   insns are already issued in this try.  The function stops immediately,
if it reached the such a solution, that all instruction can be issued.
INDEX will contain index of the best insn in READY.  The following
function is used only for first cycle multipass scheduling.
@@ -2458,7 +2451,7 @@ int
 max_issue (struct ready_list *ready, int privileged_n, state_t state,
 	   int *index)
 {
-  int n, i, all, n_ready, best, delay, tries_num, max_points;
+  int i, all, n_ready, best, delay, tries_num;
   int more_issue;
   struct choice_entry *top;
   rtx insn;
@@ -2477,25 +2470,15 @@ max_issue (struct ready_list *ready, int
 }
 
   /* Init max_points.  */
-  max_points = 0;
   more_issue = issue_rate - cycle_issued_insns;
 
   /* ??? We used to assert here that we never issue more insns than issue_rate.
  However, some targets (e.g. MIPS/SB1) claim lower issue rate than can be
  achieved to get better performance.  Until these targets are fixed to use
  scheduler hooks to manipulate insns priority instead

Re: Questions about selective scheduler and PowerPC

2010-10-19 Thread Jie Zhang

On 10/19/2010 10:16 PM, Andrey Belevantsev wrote:

On 19.10.2010 17:57, Jie Zhang wrote:

Removing the failing assert fixes the test case. But I wonder why not
just
get max_issue correct. I'm testing the attached patch. IMHO, max_issue
looks confusing.

* The concept of ISSUE POINT has never been used since the code landed in
repository.

* In the comment just before the function, it's mentioned that MAX_POINTS
is the sum of points of all instructions in READY. But it does not match
the code. The code only summarizes the points of the first MORE_ISSUE
instructions. If later ISSUE_POINTS become not uniform, that piece of
code
should be redesigned.

So I think it's good to remove it now. And top - choice_stack is a good
replacement for top->n. So we can remove field n from struct
choice_entry,
too.

Now I'm looking at MIPS target to find out why this change in the would
cause PR37360.

I agree that ISSUE_POINTS can be removed, as it was not used (maybe
Maxim can comment more on this). However, the assert is not about the
points but exactly about the situation when a target is lying to the
compiler about its issue rate.

The ideal situation is that we agree on that this should never happen,
but then you need to fix all targets that use this trick, and it seems
that there is at least mips, ppc, and x86-64 (which is why I pointed you
to 45352). The fix would be to find out why claiming the true issue rate
degrades performance and to implement the proper scheduling hooks for
changing priority of some insns, or to enable -fsched-pressure for the
offending targets.

I agree. But I still have a question about TARGET_SCHED_ISSUE_RATE. 
According to my understanding of gccint:


[quote]
Target Hook: int TARGET_SCHED_ISSUE_RATE (void)
[snip]
Although the insn scheduler can define itself the possibility of issue 
an insn on the same cycle, the value can serve as an additional 
constraint to issue insns on the same simulated processor cycle

[snip]
[/quote]

it should be allowed to be defined smaller than the issue rate defined 
by the scheduler DFA. So even if the backend defines a DFA which is 
capable of issuing 4 instructions in one cycle but also defines 
TARGET_SCHED_ISSUE_RATE to 3, the scheduler should restrict the number 
of instructions issued in one cycle to 3 instead of 4.


So I think this assert should hold even if the backend lies to the 
scheduler about the issue rate. Fixing the lies is another problem.
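
The reading of the hook described above can be written down as a small C sketch. This only models the intended behaviour and is not scheduler code: insns_issued_in_cycle and its three parameters are illustrative names.

```c
/* Return how many insns may issue in one simulated cycle when the DFA
   would accept up to DFA_WIDTH insns, the target hook reports
   ISSUE_RATE, and N_READY insns are ready.  The hook value acts as an
   additional cap on top of what the DFA allows.  */
static int
insns_issued_in_cycle (int n_ready, int dfa_width, int issue_rate)
{
  int limit = dfa_width < issue_rate ? dfa_width : issue_rate;

  return n_ready < limit ? n_ready : limit;
}
```

With a DFA width of 4 and a hook value of 3, ten ready insns yield three issues per cycle, so an assert that the count never exceeds issue_rate would hold under this model.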


With the attached draft patch, we can enable the assert in max_issue 
without regression on PR37360.



This is a lot of work, which is why this assert was installed in
max_issue for relatively short amount of time. Maybe it's time to try
again, but let's have a consensus first that this assert should never
trigger by design and we have enough flexibility in the scheduler to
provide legal means to achieve the same performance effect.


Agree.


Regards,
--
Jie Zhang
CodeSourcery
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index b13d648..7653941 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -589,6 +589,10 @@ static const char *mips_hi_relocs[NUM_SYMBOL_TYPES];
 /* Target state for MIPS16.  */
 struct target_globals *mips16_globals;
 
+/* Cached value of can_issue_more. This is cached in mips_variable_issue hook
+   and returned from mips_sched_reorder2.  */
+static int cached_can_issue_more;
+
 /* Index R is the smallest register class that contains register R.  */
 const enum reg_class mips_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   LEA_REGS,	LEA_REGS,	M16_REGS,	V1_REG,
@@ -12439,8 +12443,8 @@ mips_sched_init (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
 /* Implement TARGET_SCHED_REORDER and TARGET_SCHED_REORDER2.  */
 
 static int
-mips_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
-		rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+mips_sched_reorder_1 (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+		  rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
 {
   if (!reload_completed
       && TUNE_MACC_CHAINS
@@ -12455,10 +12459,25 @@ mips_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
 
   if (TUNE_74K)
 mips_74k_agen_reorder (ready, *nreadyp);
+}
 
+
+static int
+mips_sched_reorder (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+		rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+{
+  mips_sched_reorder_1 (file, verbose, ready, nreadyp, cycle);
   return mips_issue_rate ();
 }
 
+static int
+mips_sched_reorder2 (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
+		 rtx *ready, int *nreadyp, int cycle ATTRIBUTE_UNUSED)
+{
+  mips_sched_reorder_1 (file, verbose, ready, nreadyp, cycle);
+  return cached_can_issue_more;
+}
+
 /* Update round-robin counters for ALU1/2 and FALU1/2.  */
 
 static void
@@ -12516,6 +12535,7 @@ mips_variable_issue (FILE *file ATTRIBUTE_UNUSED, int verbose ATTRIBUTE_UNUSED,
 	  || recog_memoized (insn

Questions about selective scheduler and PowerPC

2010-10-18 Thread Jie Zhang

Hi,

I'm investigating a GCC testsuite FAIL of PowerPC with e500 multilib. 
The test is pr42245.c, which sets options to -O2 -fselective-scheduling 
-fsel-sched-pipelining.


$ ./cc1 -quiet pr42245.c -mcpu=8540 -mfloat-gprs=single -O2 -fselective-scheduling

pr42245.c: In function ‘build_DIS_CON_tree’:
pr42245.c:29:1: internal compiler error: in advance_state_on_fence, at sel-sched.c:5288


The code around sel-sched.c:5288 looks like:

5265 static bool
5266 advance_state_on_fence (fence_t fence, insn_t insn)
5267 {
5268   bool asm_p;
5269
5270   if (recog_memoized (insn) >= 0)
5271     {
5272       int res;
5273       state_t temp_state = alloca (dfa_state_size);
5274
5275       gcc_assert (!INSN_ASM_P (insn));
5276       asm_p = false;
5277
5278       memcpy (temp_state, FENCE_STATE (fence), dfa_state_size);
5279       res = state_transition (FENCE_STATE (fence), insn);
5280       gcc_assert (res < 0);
5281
5282       if (memcmp (temp_state, FENCE_STATE (fence), dfa_state_size))
5283         {
5284           FENCE_ISSUED_INSNS (fence)++;
5285
5286           /* We should never issue more than issue_rate insns.  */
5287           if (FENCE_ISSUED_INSNS (fence) > issue_rate)
5288             gcc_unreachable ();
5289         }
5290     }
5291   else

When this error happens, FENCE_ISSUED_INSNS (fence) is 2 and issue_rate 
is 1. PowerPC 8540 is capable of issuing 2 instructions in one cycle, but 
rs6000_issue_rate lies to the scheduler and says it can only issue 1 
instruction until reload has completed. See the following code:


23205 static int
23206 rs6000_issue_rate (void)
23207 {
23208   /* Unless scheduling for register pressure, use issue rate of 1 for
23209      first scheduling pass to decrease degradation.  */
23210   if (!reload_completed && !flag_sched_pressure)
23211     return 1;
23212
23213   switch (rs6000_cpu_attr) {
[snip]
23223   case CPU_PPC8540:
[snip]
23230     return 2;

This issue could be traced down to haifa-sched.c:max_issue (), which 
returns 2 even though issue_rate is 1. So my questions and possible ways 
to fix it are:


1. Should we restrict max_issue to only return value less than or equal 
to issue_rate?


2. Should we do the same as what SMS does? See

static void
sms_schedule (void)
{
[snip]
  /* Initialize issue_rate.  */
  if (targetm.sched.issue_rate)
    {
      int temp = reload_completed;

      reload_completed = 1;
      issue_rate = targetm.sched.issue_rate ();
      reload_completed = temp;
    }
  else
    issue_rate = 1;
[snip]
}

I suspect this piece of code in sms_schedule was written for rs6000, but 
it came in with the first commit of the SMS merge and there is no patch 
email explaining it.


3. The aforementioned rs6000 hack rs6000_issue_rate was added by

2003-03-03  David Edelsohn  edels...@gnu.org

* config/rs6000/rs6000.c (rs6000_multipass_dfa_lookahead): Delete.
(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Delete.
(rs6000_variable_issue): Do not return negative value.
(rs6000_issue_rate): Uniformly set issue rate to 1 for first
scheduling pass.

, which was more than 7 years ago. Is this still needed now?


Any one of the above three ways can fix the FAIL. But I'm not sure which 
way is best, or maybe we should do 1 and 3 and remove the hack in 2?
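
For reference, the trick behind option 2 can be reduced to a few lines of self-contained C. This is only a sketch under simplifying assumptions: reload_completed here is a local stand-in for GCC's global, target_issue_rate is a hypothetical hook modeled on rs6000_issue_rate for a dual-issue CPU (the flag_sched_pressure check is left out), and query_true_issue_rate is an invented helper name.

```c
/* Stand-in for GCC's global flag; nonzero once reload has finished.  */
static int reload_completed;

/* Hypothetical target hook: report 1 before reload and the real issue
   width afterwards.  */
static int
target_issue_rate (void)
{
  if (!reload_completed)
    return 1;
  return 2;
}

/* Ask the hook for its post-reload answer without changing the caller's
   view of reload_completed, in the same way sms_schedule does.  */
static int
query_true_issue_rate (void)
{
  int saved = reload_completed;
  int rate;

  reload_completed = 1;
  rate = target_issue_rate ();
  reload_completed = saved;
  return rate;
}
```

The save and restore keeps the workaround local to the query, but it also means every scheduler that wants the real rate has to know about the lie.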


Thoughts?


Regards,
--
Jie Zhang
CodeSourcery


Re: Questions about selective scheduler and PowerPC

2010-10-18 Thread Jie Zhang

Hi Andrey,

On 10/18/2010 03:13 PM, Andrey Belevantsev wrote:

Hi Jie,

On 18.10.2010 10:49, Jie Zhang wrote:


When this error happens, FENCE_ISSUED_INSNS (fence) is 2 and
issue_rate is
1. PowerPC 8540 is capable to issue 2 instructions in one cycle, but
rs6000_issue_rate lies to scheduler that it can only issue 1 instruction
before register relocation is done. See the following code:


See PR 45352. I've tried to fix this in the selective scheduler by
modeling the lying behavior in line with the haifa scheduler. Let me
know if the last patch from the PR audit trail doesn't work for you.

In addition, after the above patch goes in, I can make the selective
scheduler not try to jump through the hoops with putting correct sched
cycles on insns for targets which don't need it in their target_finish
hook. I guess powerpc needs this though, but x86-64 (for which PR 45342
was opened) almost surely does not.

Thanks for your reply. I just tried. That patch does not help for this 
issue.



--
Jie Zhang
CodeSourcery


gcc.dg/graphite/interchange-9.c and small memory target

2010-08-11 Thread Jie Zhang

Hi Sebastian,

I recently encountered an issue when testing 
gcc.dg/graphite/interchange-9.c on an ARM bare-metal board which has only 
4MB of memory.


Apparently, with

#define N 
#define M 

int A[N*M] in main is too large to fit on the stack.

There are several ways to solve this issue:

1. Make this test a compile test instead of a run test.

2. Define both M and N to 111. I checked and the test is still valid, i.e. 
it still tests what is intended.


3. Use the STACK_SIZE macro to calculate M and N. But I don't know how to 
do that, and I'm not sure the test will still be valid if we end up with 
a very small M and N.
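
One possible shape for option 3 is sketched below, purely as an illustration. It assumes STACK_SIZE, when defined, is the stack size in bytes supplied by the test harness for small targets, and that int is 4 bytes; ELEM_BYTES, the fallback value 11 and array_bytes are made up for this sketch, and whether the test still exercises the interchange with such a small N has not been checked.

```c
/* Assumed size of int in bytes; sizeof cannot be used in #if.  */
#define ELEM_BYTES 4

/* Shrink the array only when the reported stack cannot hold the
   111 x 111 version with a factor of two to spare.  */
#if defined (STACK_SIZE) && STACK_SIZE < 2 * ELEM_BYTES * 111 * 111
# define N 11   /* hypothetical fallback, possibly too small to be useful */
#else
# define N 111
#endif
#define M N

/* Bytes needed by "int A[N*M]" under the assumptions above.  */
static unsigned long
array_bytes (void)
{
  return (unsigned long) ELEM_BYTES * N * M;
}
```

Because the preprocessor cannot take a square root, a scheme like this has to pick from a few fixed sizes rather than compute M and N from STACK_SIZE directly.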


Which way do you like most?


Regards,
--
Jie Zhang
CodeSourcery


Re: gcc.dg/graphite/interchange-9.c and small memory target

2010-08-11 Thread Jie Zhang

On 08/11/2010 11:47 PM, Sebastian Pop wrote:

On Wed, Aug 11, 2010 at 10:29, Jie Zhangj...@codesourcery.com  wrote:

Hi Sebastian,

I currently encountered an issue when testing
gcc.dg/graphite/interchange-9.c on a ARM bare-metal board which has only 4MB
memory.

Apparently, with

#define N 
#define M 

int A[N*M] in main is too large to fit in stack.

There are several ways to solve this issue:

1. Make this test a compile test instead of a run test.

2. Define both M and N to 111. I checked and the test is still valid, ie it
still tests what is intended.

3. Use STACK_SIZE macro to calculate M and N. But I don't know how to do
that. And I'm not sure if we got a very small M and N, the test will be
still valid.

Which way do you like most?


I would say, let's go for solution 2.

I don't like the first solution as you want to also validate
that the transform is correct.   As for solution 3, I do not know
either how to do that.

I will keep in mind these limitations for the future testcases.


Thanks. I will submit a patch for solution 2.

--
Jie Zhang
CodeSourcery


Re: GCC Bugzilla is broken now

2010-07-13 Thread Jie Zhang

On 07/13/2010 11:13 AM, Jie Zhang wrote:

I got this when trying to access
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44921


Software error:

Can't rename data/versioncache.Xg5KN to versioncache at globals.pl
line 306.

For help, please send mail to the webmaster
(sourcemas...@sourceware.org), giving this error message and the time
and date of the error.



It has recovered now. Thanks!


--
Jie Zhang
CodeSourcery


GCC Bugzilla is broken now

2010-07-12 Thread Jie Zhang
I got this when trying to access 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44921



Software error:

Can't rename data/versioncache.Xg5KN to versioncache at globals.pl line 306.

For help, please send mail to the webmaster (sourcemas...@sourceware.org), 
giving this error message and the time and date of the error.


It was OK just two or three hours ago.


--
Jie Zhang
CodeSourcery


Re: A question about patch submission

2010-07-12 Thread Jie Zhang

On 07/13/2010 11:56 AM, Mingming Sun wrote:

Hi,

I want to submit a patch about loongson 3A, a new architecture
different from loongson 2E  2F.
My patch is based on Gcc4.4.0.
If I want to submit my patch, which branch shoud I submit to, gcc4.4.0
branch or I should change the patch to
suite with the main branch.


Yes. You need to port your patch to SVN trunk before submitting it.


--
Jie Zhang
CodeSourcery


Re: A question about patch submission

2010-07-12 Thread Jie Zhang

On 07/13/2010 12:20 PM, Mingming Sun wrote:

On Tue, Jul 13, 2010 at 12:17 PM, Jie Zhangj...@codesourcery.com  wrote:

On 07/13/2010 11:56 AM, Mingming Sun wrote:

I want to submit a patch about loongson 3A, a new architecture
different from loongson 2E2F.
My patch is based on Gcc4.4.0.
If I want to submit my patch, which branch shoud I submit to, gcc4.4.0
branch or I should change the patch to
suite with the main branch.


Yes. You need port your patch to SVN trunk before submit it.


 Do you mean I must submit it to the main brach?

Please don't top reply. (I have moved your reply down.)

I'm not sure if I understand you. GCC source code is maintained in an SVN 
repository, which has a trunk and many branches. The trunk is where the 
main development is going on. gcc4.4.0 is not a branch. It is a release, 
which was created from the branch gcc-4_4-branch. When you submit a patch 
like this, you need to create your patch against SVN trunk instead of 
any branch, tag or release tarball.



--
Jie Zhang
CodeSourcery


Re: complete list of emulated TLS targets.

2010-07-08 Thread Jie Zhang
On Thu, Jul 8, 2010 at 9:28 PM, Bernd Schmidt ber...@codesourcery.com wrote:
 On 07/06/2010 10:39 PM, IainS wrote:
 I'd like to compile a complete list of targets affected by changes in
 emulated TLS.

 *-*-darwin*
 hppa64-hp-hpux11.11
 cris-*-elf

 I think also;

 *-*-mingw
 *-*-cygwin

 could people please add to the list/confirm as appropriate?

 I'm pretty sure bfin* is on the list.

Yes. bfin-uclinux and bfin-linux-uclibc.


Jie


Question on REG_EQUAL documentation

2010-05-31 Thread Jie Zhang

The GCC internal document says [1]:

[quote]
In the early stages of register allocation, a REG_EQUAL note is changed 
into a REG_EQUIV note if op is a constant and the insn represents the 
only set of its destination register.


Thus, compiler passes prior to register allocation need only check for 
REG_EQUAL notes and passes subsequent to register allocation need only 
check for REG_EQUIV notes.

[/quote]

But I still find REG_EQUAL notes in RTL dumps for those passes after 
IRA. My understanding is: a REG_EQUAL note is changed into a REG_EQUIV 
note in IRA when possible, but the remaining REG_EQUAL notes are still 
kept around. So the compiler passes after register allocation need to 
check for both REG_EQUIV notes and REG_EQUAL notes. Is my understanding 
correct?



[1] http://gcc.gnu.org/onlinedocs/gccint/Insns.html#index-REG_005fEQUIV-2258


--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


GCC viewcvs issue

2010-04-29 Thread Jie Zhang

This URL

http://gcc.gnu.org/viewcvs/branches/gcc-4_4-branch/gcc/tree-ssa-alias.c?annotate=155646

which tries to annotate the latest revision of tree-ssa-alias.c on 4.4 
branch gives


An Exception Has Occurred
Python Traceback

Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 4317, in main
    request.run_viewvc()
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 397, in run_viewvc
    self.view_func(self)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1769, in view_annotate
    markup_or_annotate(request, 1)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1696, in markup_or_annotate
    path[-1], mime_type)
  File "/usr/lib/python2.3/site-packages/viewvc/lib/viewvc.py", line 1589, in markup_stream_pygments
    encoding='utf-8'), ps)
  File "/usr/lib/python2.3/site-packages/pygments/__init__.py", line 85, in highlight
    return format(lex(code, lexer), formatter, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/__init__.py", line 68, in format
    formatter.format(tokens, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/formatter.py", line 92, in format
    return self.format_unencoded(tokensource, outfile)
  File "/usr/lib/python2.3/site-packages/pygments/formatters/html.py", line 704, in format_unencoded
    for t, piece in source:
  File "/usr/lib/python2.3/site-packages/pygments/formatters/html.py", line 611, in _format_lines
    for ttype, value in tokensource:
  File "/usr/lib/python2.3/site-packages/pygments/lexer.py", line 162, in streamer
    for i, t, v in self.get_tokens_unprocessed(text):
  File "/usr/lib/python2.3/site-packages/pygments/lexers/compiled.py", line 155, in get_tokens_unprocessed
    for index, token, value in \
  File "/usr/lib/python2.3/site-packages/pygments/lexer.py", line 479, in get_tokens_unprocessed
    m = rexmatch(text, pos)
RuntimeError: maximum recursion limit exceeded


Similar issue for 4.3 branch. trunk, 4.2 and 4.1 are OK.


Regards,
--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: gcc- 4.6.0 20100416 rtmutex.c:1138:1: internal compiler error

2010-04-19 Thread Jie Zhang

On 04/19/2010 02:43 PM, Justin P. Mattock wrote:

I couldn't resist..(had to play),
anyways I looked through the reports
but didn't see anything that was
familiar. so I went and created an entry:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43791

Thanks. Please add a preprocessed source file so people can reproduce 
your issue.



--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: gcc- 4.6.0 20100416 rtmutex.c:1138:1: internal compiler error

2010-04-18 Thread Jie Zhang

On 04/19/2010 12:19 PM, Justin Mattock wrote:

so far I've compiled most of the system
(glibc,Xserver,etc..)
and not really anything has crashed and burned
except for the kernel:

kernel/rtmutex.c: At top level:
kernel/rtmutex.c:1138:1: internal compiler error: in
cgraph_decide_inlining_of_small_functions, at ipa-inline.c:1009
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
make[1]: *** [kernel/rtmutex.o] Error 1
make: *** [kernel] Error 2

any reports of this/ideas?


Please report a bug with a preprocessed source file in GCC's bugzilla.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: Ask for suggestions on init_caller_save

2010-03-29 Thread Jie Zhang

On 03/30/2010 12:11 AM, Jeff Law wrote:

On 03/23/10 21:30, Jie Zhang wrote:

I'm fixing a bug. It's caused by uninitialized caller save pass data.
One function in the test case uses the optimize attribute with O2
option. So even with -O0 in command line, GCC calls caller save pass
for that function. The problem is init_caller_save is called in
backend_init_target if flag_caller_saves is set. Apparently, in this
case, flag_caller_saves is not set when we get to backend_init_target. I
think there are several ways to fix this bug, but I don't know which
way I should/can go:

1. Always call init_caller_save in backend_init_target. But it seems a
waste for most cases if -O0.

2. Call init_caller_save in IRA main function. But by this way it will
be called multiple times unless we create a flag to remember if it has
been called or not. Maybe we can reuse test_reg or test_mem. If they
are NULL_TREE, just call init_caller_save.

3. Call init_caller_save in handle_optimize_attribute. If
flag_caller_saves is not set before parse_optimize_options but set
after, call init_caller_save. Considering there might be multiple
functions using optimize attribute, we also need a flag to remember if
init_caller_save has been called or not.

4. There are only three global function in caller-save.c:
init_save_areas, setup_save_areas, and save_call_clobbered_regs. We
can just add a check in the beginning of those functions. If the data
has not been initialized, just init_caller_save first.


Any suggestions?

I'd suggest #2 with a status flag indicating whether or not caller-saves
has been initialized. That should be low enough overhead to not be a
problem.


Thanks. I will send a patch to gcc-patches and CC you.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Question about gen_rtx_VAR_LOCATION

2010-03-26 Thread Jie Zhang
There are two calls of gen_rtx_VAR_LOCATION in cfgexpand.c. Both calls 
cast a tree to rtx as the third argument. Why is a tree used in an RTL 
expression? Will it be transformed into RTL later, or should all RTL 
passes recognize that it's a tree and just ignore it? Thanks.



--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: Question about gen_rtx_VAR_LOCATION

2010-03-26 Thread Jie Zhang

On 03/26/2010 11:36 PM, Jakub Jelinek wrote:

On Fri, Mar 26, 2010 at 11:27:24PM +0800, Jie Zhang wrote:

There are two calls of gen_rtx_VAR_LOCATION in cfgexpand.c. Both calls
cast a tree to rtx as the third argument. Why a tree is used in RTL
expression? Will it be transformed into RTL later or all RTL passes
should recognize it's a tree and just ignore it? Thanks.


Yes, it is just temporary.  The tree survives there just from
the calls from within expand_gimple_basic_block until the immediately
following expand_debug_locations call.

Hmmm, I found a case where gen_rtx_VAR_LOCATION is called but the tree in 
VAR_LOCATION does not get expanded. It's related to the handling of the 
optimize attribute. I will start a new thread for that. Thanks.



--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Question about RTL code hoisting

2010-03-25 Thread Jie Zhang

I just found that the current RTL code hoisting cannot optimize

REG = ...
if (cond)
  {
    r0 = REG;

  }
else
  {
    r0 = REG;
    ...
  }

to


REG = ...
r0 = REG;
if (cond)
  {

  }
else
  {
    ...
  }

where REG is a pseudo register and r0 is a physical register. I have 
looked at the code of the RTL hoisting pass, but I cannot find a simple 
way to extend it to deal with this case. Also, the hoisting pass is only 
enabled at -Os. So I'm going to implement another hoisting pass to do 
this optimization. Is that a good idea? Does anyone know of an existing 
pass which should have handled this case, or could easily be adapted to 
handle it? Thanks!



--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: Question about RTL code hoisting

2010-03-25 Thread Jie Zhang

On 03/25/2010 11:22 PM, Jeff Law wrote:

On 03/25/10 09:14, Bernd Schmidt wrote:

On 03/25/2010 04:03 PM, Jie Zhang wrote:

I just found that the current RTL code hoisting cannot optimize

REG = ...
if (cond)
{
r0 = REG;

}
else
{
r0 = REG;
...
}

to


REG = ...
r0 = REG;
if (cond)
{

}
else
{
...
}

where REG is a pseudo register and r0 is a physical register. I have
looked at the code of the RTL hoisting pass, but I cannot find a simple way
to extend it to deal with this case. Also, the hoisting pass is only
enabled with -Os. So I'm going to implement another hoisting pass to do
this optimization. Is it a good idea? Does anyone know of an
existing pass which should have handled this case, or which could easily
be adapted to handle it? Thanks!

Isn't this similar to crossjumping, except done in the other direction?

Yes, though the implementation is completely different. Hoisting
computes which blocks compute specific expressions, then determines if
all paths from a point compute the same expression and if so, moves the
multiple computations to a single location.

cross jumping works by matching RTL bits at the end of blocks.

I never bothered to implement hoisting which touched hard regs -- I
never thought the cost/benefit analysis made much sense. It's quite a
bit more work to implement and code motion of hard regs is much more
restricted than code motion involving just pseudos.


Thanks Bernd and Jeff.

This case is not common. I'm wondering how likely it is that this kind of 
optimization pass would be accepted into GCC.


Another way to fix it is to teach the register allocator to allocate r0 to 
REG. But I guess that's more difficult.



Jie


Re: Question about RTL code hoisting

2010-03-25 Thread Jie Zhang

On 03/25/2010 11:24 PM, Steven Bosscher wrote:

On Thu, Mar 25, 2010 at 4:03 PM, Jie Zhang <j...@codesourcery.com> wrote:

I just found that the current RTL code hoisting cannot optimize

REG = ...
if (cond)
  {
r0 = REG;

  }
else
  {
r0 = REG;
...
  }

to


REG = ...
r0 = REG;
if (cond)
  {

  }
else
  {
...
  }

where REG is a pseudo register and r0 is a physical register. I have looked
at the code of the RTL hoisting pass, but I cannot find a simple way to extend
it to deal with this case.


Right, there are two issues:
* HOIST doesn't handle hard registers
* HOIST doesn't hoist reg-reg moves

There is no easy way to add cost metrics to hoist reg-reg moves, and
handling hard regs is an even bigger problem.

What is the original code? I (well, by now: we) have patches in the
works for GCC 4.6 that add code hoisting to GIMPLE (see PR23286),
perhaps that solves this case for you.

I'm not sure if I can make the test case public. I need to ask. I'm 
afraid GIMPLE cannot help here. r0 appears because it's used for passing 
an argument to callees. The issue is only exposed after expansion to RTL.
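
For illustration only, here is a hypothetical C function with the shape described above. It is not the withheld test case, and the names compute, use_a, use_b and pick are invented. Both branches pass the same value as the first argument of a call, so after expansion to RTL each branch would be expected to begin with a copy from the pseudo into the first argument register (r0 on the target discussed here).

```c
/* Hypothetical callees; in a real reproducer these would live in
   another translation unit so that they cannot be inlined.  */
int compute (int x) { return x * 3 + 1; }
int use_a (int v) { return v + 10; }
int use_b (int v, int w) { return v - w; }

int
pick (int cond, int x)
{
  int v = compute (x);      /* REG = ...  */

  if (cond)
    return use_a (v);       /* r0 = REG; call use_a  */
  else
    return use_b (v, 2);    /* r0 = REG; second arg = 2; call use_b  */
}
```

At the source level there is nothing to hoist, since the duplicated move only appears once the calling convention is applied, which is consistent with the remark that GIMPLE hoisting would probably not help.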





Also, the hoisting pass is only enabled with -Os.
So I'm going to implement another hoisting pass to do this optimization. Is
it a good idea?


To add duplicate functionality? No.

I'm not going to add duplicate functionality. What I'm going to do is 
handle only the hard-reg = pseudo-reg case.



Does anyone know of an existing pass which should
have handled this case, or which could easily be adapted to handle it?


Hoisting should handle it, bui


Can you open a new PR and make it block PR33828, please?


If I can publish the test case, yes. Otherwise I'll have to write a new test case.


--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Ask for suggestions on init_caller_save

2010-03-23 Thread Jie Zhang
I'm fixing a bug caused by uninitialized caller-save pass data. 
One function in the test case uses the optimize attribute with the O2 
option, so even with -O0 on the command line, GCC runs the caller-save pass for 
that function. The problem is that init_caller_save is called in 
backend_init_target if flag_caller_saves is set. Apparently, in this 
case, flag_caller_saves is not yet set when backend_init_target runs. I 
think there are several ways to fix this bug, but I don't know which one 
I should take:


1. Always call init_caller_save in backend_init_target. But that seems a 
waste in most cases at -O0.


2. Call init_caller_save in the IRA main function. But this way it will 
be called multiple times unless we create a flag to remember if it has 
been called or not. Maybe we can reuse test_reg or test_mem. If they are 
NULL_TREE, just call init_caller_save.


3. Call init_caller_save in handle_optimize_attribute. If 
flag_caller_saves is not set before parse_optimize_options but set 
after, call init_caller_save. Considering there might be multiple 
functions using optimize attribute, we also need a flag to remember if 
init_caller_save has been called or not.


4. There are only three global functions in caller-save.c: 
init_save_areas, setup_save_areas, and save_call_clobbered_regs. We can 
just add a check at the beginning of those functions. If the data has 
not been initialized, just call init_caller_save first.
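
A minimal sketch of option 4, assuming a single static flag is enough to guard the one-time setup. Every name ending in _sketch, the flag and the counter are hypothetical stand-ins and not the real caller-save.c code; the same guard would also cover the "remember if it has been called" flag mentioned in options 2 and 3.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the data set up by init_caller_save.  */
static bool caller_save_initialized_p;
static int init_count;          /* only here to make the effect visible */

static void
init_caller_save_sketch (void)
{
  init_count++;                 /* the real function fills in its tables */
  caller_save_initialized_p = true;
}

/* Guard to call at the top of each public entry point.  */
static void
ensure_caller_save_initialized (void)
{
  if (!caller_save_initialized_p)
    init_caller_save_sketch ();
}

void
setup_save_areas_sketch (void)
{
  ensure_caller_save_initialized ();
  /* ... the actual work would follow here ...  */
}

void
save_call_clobbered_regs_sketch (void)
{
  ensure_caller_save_initialized ();
  /* ... the actual work would follow here ...  */
}
```

With this shape the initialization runs at most once no matter which entry point is reached first, and functions compiled at -O0 never pay for it.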



Any suggestions?

Thanks in advance.

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Question about removing multiple elements from VEC

2010-03-16 Thread Jie Zhang

Hi,

I'm looking at this FIXME in cp/typeck2.c.

  /* FIXME: Ordered removal is O(1) so the whole function is
 worst-case quadratic. This could be fixed using an aside
 bitmap to record which elements must be removed and remove
 them all at the same time. Or by merging
 split_non_constant_init into process_init_constructor_array,
 that is separating constants from non-constants while building
 the vector.  */
  VEC_ordered_remove (constructor_elt, CONSTRUCTOR_ELTS (init),
  idx);

It seems there is no VEC function which can use a bitmap to do an ordered 
multiple remove. Did I miss something, or do I have to write one?
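
As a sketch of what such a helper could look like, here is a single-pass ordered removal over a plain C array. It is an assumption-laden illustration: it uses an array of bool in place of a GCC bitmap and int elements in place of constructor_elt, and the name ordered_remove_marked is invented.

```c
#include <stddef.h>
#include <stdbool.h>

/* Remove every element whose entry in REMOVE is true, keeping the
   relative order of the survivors.  Returns the new length.  Each
   element is visited once, so the cost is linear in LEN.  */
size_t
ordered_remove_marked (int *elts, size_t len, const bool *remove)
{
  size_t dst = 0;

  for (size_t src = 0; src < len; src++)
    if (!remove[src])
      elts[dst++] = elts[src];

  return dst;
}
```

The caller would mark elements during its scan and compact once at the end, instead of shifting the tail of the vector on every removal.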



Regards,
--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: Question about removing multiple elements from VEC

2010-03-16 Thread Jie Zhang

On 03/17/2010 12:08 AM, Richard Guenther wrote:

On Tue, Mar 16, 2010 at 5:02 PM, Jie Zhang <j...@codesourcery.com> wrote:

Hi,

I'm looking at this FIXME in cp/typeck2.c.

  /* FIXME: Ordered removal is O(1) so the whole function is
 worst-case quadratic. This could be fixed using an aside
 bitmap to record which elements must be removed and remove
 them all at the same time. Or by merging
 split_non_constant_init into process_init_constructor_array,
 that is separating constants from non-constants while building
 the vector.  */
  VEC_ordered_remove (constructor_elt, CONSTRUCTOR_ELTS (init),
  idx);

It seems there is no VEC function which can use a bitmap to do an ordered
multiple remove. Did I miss something, or do I have to write one?


You have to write one.


Thanks!

--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


IRA conflict graph, multiple alternatives and commutative operands

2010-03-08 Thread Jie Zhang
I'm looking at PR 42258. I have a question on IRA conflict graph and 
multiple alternatives.


Below is an RTL insn just before register allocation pass:

(insn 7 6 12 2 pr42258.c:2 (set (reg:SI 136)
(mult:SI (reg:SI 137)
(reg/v:SI 135 [ x ]))) 33 {*thumb_mulsi3})

IRA generates the following conflict graph for r135, r136 and r137:

;; a0(r136,l0) conflicts: a2(r137,l0) a1(r135,l0)
;; total conflict hard regs:
;; conflict hard regs:
;; a1(r135,l0) conflicts: a0(r136,l0) a2(r137,l0)
;; total conflict hard regs:
;; conflict hard regs:
;; a2(r137,l0) conflicts: a0(r136,l0) a1(r135,l0)
;; total conflict hard regs:
;; conflict hard regs:

  regions=1, blocks=3, points=5
allocnos=3, copies=0, conflicts=0, ranges=3

Apparently this conflict graph is not an optimized one for any of the 
three alternatives in the following instruction pattern:


(define_insn "*thumb_mulsi3"
  [(set (match_operand:SI          0 "register_operand" "=&l,&l,&l")
        (mult:SI (match_operand:SI 1 "register_operand" "%l,*h,0")
                 (match_operand:SI 2 "register_operand" "l,l,l")))]
  ...)

This conflict graph seems like a merge of conflict graphs of the three 
alternatives. Ideally for the first and second alternatives, we should have


;; a0(r136,l0) conflicts: a2(r137,l0) a1(r135,l0)
;; a1(r135,l0) conflicts: a0(r136,l0)
;; a2(r137,l0) conflicts: a0(r136,l0)

For the third alternative, we'd better have

;; a0(r136,l0) conflicts: a1(r135,l0)
;; a1(r135,l0) conflicts: a0(r136,l0)
  cp0:a0(r136)-a2(r137)@1000:constraint

The register allocator would then use one of these more specific conflict 
graphs for coloring. If we take the commutative operands into account, we 
have to add the following conflict graph as another candidate.


;; a0(r136,l0) conflicts: a2(r137,l0)
;; a2(r137,l0) conflicts: a0(r136,l0)
  cp0:a0(r136)-a1(r135)@1000:constraint

(Actually, this conflict graph will result in an optimal result for the 
test case in PR 42258.)


Now the problem is when and how to choose the alternative for register 
allocator to calculate the conflict graph?


Yes, I have read the thread:

http://gcc.gnu.org/ml/gcc/2009-02/msg00215.html

This question does not seem easy. So is there any practical method to make 
the register allocator pick the third alternative and do the commutation 
before or during register allocation?


Thanks,
--
Jie Zhang
CodeSourcery
(650) 331-3385 x735


Re: No integral promotions when calling library function?

2010-02-18 Thread Jie Zhang
On Fri, Feb 19, 2010 at 12:03 AM, Dave Korn
<dave.korn.cyg...@googlemail.com> wrote:
> On 18/02/2010 07:17, Jie Zhang wrote:
>> We are trying to add a 16bit integer division library function for
>> bfin port. I just found GCC didn't do integral promotions when calling
>> library function.
>>
>> Is this expected?
>
>   I wasn't aware of this myself, but it kind-of makes sense given the way that
> macros such as FUNCTION_ARG and INIT_CUMULATIVE_ARGS don't get passed any type
> info in the case of libcalls; I'm guessing here, but that would imply to me
> that all libcalls are effectively using unnamed stdargs-style arg passing.
> Not sure how to check this theory without extensively reading the source
> though.  I imagine it's done for efficiency, and there should always be
> libcall functions existing for the precise types you're passing?

I'm trying to use an existing division function as a library function.
This existing function does unsigned 16bit integral division and
expects both arguments to have been zero-extended to 32bit. It's a little
surprising to me that GCC does not do the promotion when calling a
library function even when TARGET_PROMOTE_PROTOTYPES is defined. I can
adjust the division function accordingly. But before I do that, I'd
like to know whether library functions should follow the same convention as
the normal ones or not.
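
To make the difference concrete, here is a small sketch in C. Both function names are hypothetical and neither is the real bfin library routine. The first helper trusts its caller to have zero-extended the 16-bit arguments; the second masks them itself, which is one way the division function could be adjusted.

```c
/* Assumes the caller already zero-extended both 16-bit arguments
   to 32 bits, as a normal prototyped call would.  */
unsigned int
udiv16_trusting (unsigned int a, unsigned int b)
{
  return a / b;
}

/* Makes no such assumption: garbage in the high halves of the
   argument registers is masked off before dividing.  */
unsigned int
udiv16_masked (unsigned int a, unsigned int b)
{
  return (a & 0xffffu) / (b & 0xffffu);
}
```

With clean arguments both return the same quotient. If the high halves hold leftover bits, only the masked version still computes the 16-bit result.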


Jie


No integral promotions when calling library function?

2010-02-17 Thread Jie Zhang
We are trying to add a 16bit integer division library function for
bfin port. I just found GCC didn't do integral promotions when calling
library function.

For example, in function foo, I can assume both arguments are zero
extended from unsigned short to unsigned int.

extern unsigned short foo (unsigned short, unsigned short);
unsigned int a;
unsigned int b;
unsigned short bar ()
{
return foo ((unsigned short) a, (unsigned short) b);
}

But with the following code, I can't assume that the high halves of R0
and R1, which are the first two registers for argument passing, are
all zeros.

unsigned int a;
unsigned int b;
unsigned short bar ()
{
return (unsigned short) a / (unsigned short) b;
}

Is this expected?

Thanks,
Jie


Re: Jie Zhang appointed bfin maintainer

2010-02-07 Thread Jie Zhang

On 02/08/2010 08:53 AM, Gerald Pfeifer wrote:

It is my pleasure to announce that, also based on the recommendation of
Bernd as an existing maintainer, the steering committee has appointed
Jie Zhang maintainer of the bfin port.

Thanks for your contributions over the last five(?) years, Jie.


Yes. I have worked on Blackfin port for nearly 5 years with Bernd.


Please adjust the MAINTAINERS file accordingly, and Happy Hacking!


Thanks! I have updated the MAINTAINERS file with the attached patch.


Jie

	* MAINTAINERS: Add myself as a maintainer for the bfin port.

Index: MAINTAINERS
===
--- MAINTAINERS	(revision 156592)
+++ MAINTAINERS	(working copy)
@@ -48,6 +48,7 @@
 avr port		Anatoly Sokolov		ae...@post.ru
 avr port		Eric Weddington		eric.wedding...@atmel.com
 bfin port		Bernd Schmidt		bernd.schm...@analog.com
+bfin port		Jie Zhang		jie.zh...@analog.com
 cris port		Hans-Peter Nilsson	h...@axis.com
 crx port		Pompapathi V Gadad 	pompapathi.v.ga...@nsc.com
 fr30 port		Nick Clifton		ni...@redhat.com
@@ -493,7 +494,6 @@
 Canqun Yang	can...@nudt.edu.cn
 Joey Ye		joey...@intel.com
 Kenneth Zadeck	zad...@naturalbridge.com
-Jie Zhang	jie.zh...@analog.com
 Shujing Zhao	pearly.z...@oracle.com
 Jon Ziegler	j...@apple.com
 Roman Zippel	zip...@linux-m68k.org


Re: where can find source snapshots of first GCC 4.5.0 ?

2010-01-04 Thread Jie Zhang
On Mon, Jan 4, 2010 at 8:04 PM, Bernd Roesch <nospamn...@gmx.de> wrote:
> Hi,
>
> Because of this regression,
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41311
>
> The problem is in m68k-elf too, but it does not happen with any GCC older
> than 4.5.0.
>
> I want to try out whether the first GCC 4.5.0 snapshot
> has this problem or not.
>
> The first GCC 4.5.0 I compiled was from month 08; it has the bug.
> But on the mirror sites I now find
> only snapshots from month 10 onward.
>
> So maybe somebody can post me a link to older versions of GCC 4.5.0

I would recommend using the GCC git mirror and git bisect to locate the source
of the regression. It's very fast to switch between different revisions.


Jie


Re: df_changeable_flags use in combine.c

2010-01-04 Thread Jie Zhang

On 01/05/2010 07:12 AM, Matt wrote:

Hi,

I'm fixing some compiler errors when configuring with
--enable-build-with-cxx, and ran into a curious line of code that may
indicate a bug:

static unsigned int
rest_of_handle_combine (void)
{
int rebuild_jump_labels_after_combine;

df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN);
// ...
}

The DF_* values are from the df_changeable_flags enum, whose values are
typically used in logical and/or operations for masking purposes. As
such, I'm guessing the author may have meant to do:
df_set_flags (DF_LR_RUN_DCE & DF_DEFER_INSN_RESCAN);


I think you meant |. I think + is the same as | here.

And I didn't see this error when configuring with --enable-build-with-cxx on 
the current trunk head. But I do see other errors.
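
A short sketch of why + and | agree here, assuming each flag occupies its own bit. The enum values below are illustrative and are not copied from GCC's df.h.

```c
/* Illustrative single-bit flags (assumed values).  */
enum sketch_flags
{
  SKETCH_RUN_DCE = 1 << 0,
  SKETCH_DEFER_RESCAN = 1 << 5
};

/* Disjoint bits: addition never carries, so it equals bitwise OR.  */
int combine_with_plus (void) { return SKETCH_RUN_DCE + SKETCH_DEFER_RESCAN; }
int combine_with_or (void)   { return SKETCH_RUN_DCE | SKETCH_DEFER_RESCAN; }

/* Bitwise AND of two different single-bit flags is always zero,
   so it would set no flag at all.  */
int combine_with_and (void)  { return SKETCH_RUN_DCE & SKETCH_DEFER_RESCAN; }
```

The two forms only diverge when the same bit appears twice (for example 1 + 1 is 2 while 1 | 1 is 1), which is why | is the more defensive spelling.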



Jie


Re: MPC required in one week.

2009-12-27 Thread Jie Zhang

On 12/27/2009 03:05 PM, Silvius Rus wrote:

On the flip side, it's not necessarily easy to get it to work.  On my
build system, apt-get doesn't find it.  Downloading and installing the
.deb manually triggers 3 missing deps.


apt-get install libmpc-dev

libmpc-dev is already in squeeze and sid if you are using Debian.


Jie


Re: MPC required in one week.

2009-12-27 Thread Jie Zhang

On 12/01/2009 06:25 PM, Paolo Bonzini wrote:

On 11/30/2009 09:47 PM, Michael Witten wrote:

On Mon, Nov 30, 2009 at 12:04 AM, Kaveh R.
GHAZI <gh...@caip.rutgers.edu> wrote:

The patch which makes the MPC library a hard requirement for GCC
bootstrapping has been approved today.


Out of curiosity and ignorance: Why, specifically, is MPC going to be
a hard requirement?

On the prerequisites page, MPC is currently described with: "Having
this library will enable additional optimizations on complex numbers."

Does that mean that such optimizations are now an important
requirement? or is MPC being used for something else?


They are a requirement for Fortran, but it's (much) simpler to do them
for all front-ends.


Actually, the bug mentioned on the 4.5 release page under MPC is a C bug.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30789

So it seems it's not just for Fortran.


Jie


Re: Problem while configuring gcc3.2

2009-12-27 Thread Jie Zhang

Hi,

Please don't top reply.

On 12/28/2009 12:59 PM, Pardis Beikzadeh wrote:

Hi,

Also 'make bootstrap' doesn't work without running configure, so I'm
not sure what the recommended way mentioned in the email below
means.

The bootstrap in Jim's reply means, I think, building a minimal (only C 
front-end) gcc-3.2 first using gcc-3.4. Then you can use the minimal 
gcc-3.2 to build a full gcc-3.2.




If you build gcc-3.2 the recommended way, e.g. via bootstrap, then you won't 
run into this problem. The fortran front end will be built by the bootstrapped 
gcc-3.2 instead of gcc-3.4, and you won't get this error.

If you are building a cross, then you bootstrap a native gcc-3.2 first, and 
then use the native gcc-3.2 to build the cross gcc-3.2.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com




Jie


Re: Question on PR36873

2009-12-23 Thread Jie Zhang

On 12/23/2009 02:43 PM, Jie Zhang wrote:

Hi,

We just got a similar problem on Blackfin GCC recently. Let me take the
test code from the bug as an example:


I reduced the test case to a simpler one:

$ cat foo.c
unsigned int
foo (volatile unsigned short *p)
{
  return *p;
}

In the tree dump foo.c.126t.optimized, GCC refuses to eliminate D.1256 
because the first statement contains a volatile operand:


  D.1256 ={v} *p;
  return (unsigned int) D.1256;

I'm not familiar with the trees. Is it possible to replace D.1256 and 
have something like below?


  return (unsigned int) {v} *p;

I experimented a little. It seems {v} will be lost in SSA name replacement 
during the out-of-SSA transform. Can anyone tell me if it's possible to 
do the replacement but still keep {v}? Or should I find another way to do 
that? Or is it wrong to do this optimization?


Thanks,

Jie


Re: Question on PR36873

2009-12-23 Thread Jie Zhang

On 12/23/2009 06:12 PM, Dave Korn wrote:

Jie Zhang wrote:


typedef unsigned short u16;
typedef unsigned int u32;

u32 a(volatile u16* off) {
 return *off;
}



mingw32-gcc-4.3.0.exe -c -O2 -fomit-frame-pointer -mtune=core2 test.c

it produces:
_a:
   0:   8b 44 24 04             mov    0x4(%esp),%eax
   4:   0f b7 00                movzwl (%eax),%eax
   7:   0f b7 c0                movzwl %ax,%eax   <== The redundant insn
   a:   c3                      ret


   How does it look at the RTL level?  I wonder if this situation is similar to
the one being discussed in the other current thread "Which optimizer should
remove redundant subreg of sign_extension?"


With my native GCC on Debian AMD64 unstable, in t.c.128r.expand:

(insn 6 5 7 3 t.c:5 (set (reg:HI 58 [ D.1595 ])
(mem/v:HI (reg/v/f:DI 60 [ off ]) [2 S2 A16])) -1 (nil))

(insn 7 6 8 3 t.c:5 (set (reg:SI 61)
(zero_extend:SI (reg:HI 58 [ D.1595 ]))) -1 (nil))

In t.c.201r.shorten:

(insn:TI 6 3 7 t.c:5 (set (reg:HI 0 ax [orig:58 D.1595 ] [58])
(mem/v:HI (reg/v/f:DI 5 di [orig:60 off ] [60]) [2 S2 A16])) 53 
{*movhi_1} (expr_list:REG_DEAD (reg/v/f:DI 5 di [orig:60 off ] [60])

(nil)))

(insn:TI 7 6 18 t.c:5 (set (reg:SI 0 ax [orig:61 D.1595 ] [61])
(zero_extend:SI (reg:HI 0 ax [orig:58 D.1595 ] [58]))) 114 
{*zero_extendhisi2_movzwl} (nil))


There is a volatile flag on the mem operand. Without that flag, I 
think one of the RTL passes might combine them. It looks similar to the 
issue in the thread you mentioned, but the cause is different.



Regards,
Jie


Question on PR36873

2009-12-22 Thread Jie Zhang

Hi,

We just got a similar problem on Blackfin GCC recently. Let me take the 
test code from the bug as an example:


typedef unsigned short u16;
typedef unsigned int u32;

u32 a(volatile u16* off) {
return *off;
}

u32 b(u16* off) {
return *off;
}

compiled with
mingw32-gcc-4.3.0.exe -c -O2 -fomit-frame-pointer -mtune=core2 test.c

it produces:
_a:
   0:   8b 44 24 04             mov    0x4(%esp),%eax
   4:   0f b7 00                movzwl (%eax),%eax
   7:   0f b7 c0                movzwl %ax,%eax   <== The redundant insn
   a:   c3                      ret

0010 _b:
  10:   8b 44 24 04             mov    0x4(%esp),%eax
  14:   0f b7 00                movzwl (%eax),%eax
  17:   c3                      ret

I don't understand Richard's comment. Why do we not optimize volatile 
accesses in this test case? I know we cannot do many optimizations on 
volatile accesses, but I think it's OK to remove the redundant insn in 
this case. Could someone provide me a case in which we cannot remove it?



Thanks,
Jie


Re: GMP and GCC 4.3.2

2009-12-17 Thread Jie Zhang

On 12/18/2009 06:27 AM, Jean Christophe Beyler wrote:

Actually, I just finished updating my 4.3.2 to 4.3.3 and tested it and
I still have the same issue.

This seems to be a problem more than just 4.3.2.

Here is the test program:
#include <stdio.h>
#include <gmp.h>

int main() {
 mpz_t a,b;
 mpz_init_set_str(a, "100", 10); // program works with 10^9, but not
 // with 10^10 (10^20 > 2^64)
 mpz_init_set(b, a);
 mpz_mul(a, a, a);
 gmp_printf("first,  in GMP mpz_mul(a,a,a) with a=%Zd gives %Zd \n", b, a);
 mpz_set(b, a);
 mpz_mul(a, a, a);
 gmp_printf("second, in GMP mpz_mul(a,a,a) with a=%Zd gives %Zd \n", b, a);
 return 0;
}

We obtain:
first,  in GMP mpz_mul(a,a,a) with a=100 gives 1
second, in GMP mpz_mul(a,a,a) with a=1 gives
2254536013160540992915637663717291581095979492475463214517286840718852096

Which clearly is wrong for the second output.

This was tested with a 64 bit architecture. I know that with a 4.1.1
port of the compiler, I do not see this issue.

I will see if I can port it forward to see if I still see the problem
but it might be difficult to port from 4.3.2 to 4.4.2, I'm not sure
how many things have changed but I'm sure quite a bit !

This is the -v of my GCC, version 4.3.3:
Using built-in specs.
Target: myarch64-linux-elf
Configured with:
/home/beyler/myarch64/src/myarch64-gcc-4.3.2/configure
--target=myarch64-linux-elf
--with-headers=/home/beyler/myarch64/src/newlib-1.16.0/newlib/libc/include
--prefix=/home/beyler/myarch64/local --disable-nls
--enable-languages=c --with-newlib --disable-libssp
--with-mpfr=/home/beyler/myarch64/local
Thread model: single
gcc version 4.3.3 (GCC)

What's myarch64? I got the correct result for your test with vanilla 
gcc-4.3.2 and gmp-4.3.1 on Debian unstable AMD64.



Jie


Re: No .got section in ELF

2009-11-26 Thread Jie Zhang

On 11/26/2009 02:04 PM, yunfeng zhang wrote:

The result is the same

#include <stdio.h>

extern int g __attribute__((visibility("hidden")));
int g;

int foo(int a, int b)
{
 g = a + b;
 printf("%x, %x", &g, foo);
 return g;
}

load and call `foo' in the library, an outputting (with vdso) is
 cc15bc, cc03fc
and open f.map
 0x15bc, 0x3fc

It shows Linux simply maps the library to memory *using* library segment layout.

Using e.cc to call it

#include <exception>
#include <typeinfo>
#include <cstddef>
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
 void* handle = dlopen("./f.so", RTLD_NOW);
 typedef int (*gso)(int, int);
 gso f;
 *(void**) (&f) = dlsym(handle, "foo");
 f(1, 2);
 return 0;
}


Your test case is flawed. Please try the following:

$ cat f.c
#include <stdio.h>
int g;

int foo(int a, int b)
{
  g = a + b;
  printf("g = 0x%x, foo = 0x%x\n", &g, foo);
  return g;
}

$ cat e.c
int g;
extern int foo(int a, int b);

int main(void)
{
  foo(1, 2);
  return 0;
}

$ gcc -shared -fPIC -Wl,-soname,./libf.so,-Map,f.map -o libf.so f.c
$ gcc -o e e.c -ldl -L. -lf
$ ./e
g = 0x600a30, foo = 0x294a2614

Then comment out the int g; in e.c. and do the same steps as above:

$ ./e
g = 0x58294948, foo = 0x58094614

You can see that C-A is *not* a constant. Your premise is wrong.


Jie


Re: Truncated history in viewvc

2009-09-16 Thread Jie Zhang

Dave Korn wrote:

Andrew Pinski wrote:

On Wed, Sep 16, 2009 at 12:19 AM, Dave Korn
<dave.korn.cyg...@googlemail.com> wrote:

   Good morning all!

 Is there some reason that I don't know about (e.g. limiting the load on the
server) why the revision log views of files in our viewvc setup would be
heavily truncated?

The issue comes down to the trunk having been accidentally deleted, and viewvc
does not handle that.

-- Pinski


  Ohhh, yes, I remember that happening.  Oh well, don't think there's much we
can do about that, except perhaps hope a future version can deal with it.
Might be worth our while filing an enhancement request at the viewvc project,
but that's it really.  Thanks for the explanation.

I have a gcc svn repository mirror on my hard disk. The svn history of 
varasm.c in viewvc looks good. The oldest entry is


Revision 281 - (view) (download) (as text) (annotate) - [select for diffs]
Added Thu Feb 6 00:04:16 1992 UTC (17 years, 7 months ago) by rms
File length: 77149 byte(s)

Initial revision

I'm using ViewVC 1.0.5-0.2 from Debian AMD64 unstable. I noticed that 
the ViewVC version is 1.1.2 on gcc.gnu.org. Maybe it's a regression of 
ViewVC.



Jie



Re: Stuck master branch in git mirror

2009-09-15 Thread Jie Zhang

Andreas Schwab wrote:

It looks like the master branch of git://gcc.gnu.org/git/gcc hasn't been
updated for 3 weeks (trunk is still ok).


Same here. I now use trunk instead.


Jie


Re: libmudflap and emutls question

2009-01-08 Thread Jie Zhang

Jakub Jelinek wrote:

On Wed, Jan 07, 2009 at 11:38:55AM +0100, Paolo Bonzini wrote:

Which version of gcc did you use? gcc 4.1 (and maybe 4.2) will report an
error. But gcc 4.3 compiles OK. I tested using x86_64 native gcc from
Debian unstable.  __emutls_get_address is defined in libgcc even if the
target has real TLS.

Uff... not my day.  I used 4.2 (emutls was posted in 4.2 time but
committed in 4.3 only).  But I didn't think of the simplest solution:
use greps together with strings(1): strings ./conftest | grep
__emutls_get_address.


I'd say much better would be just to grep assembly.
See e.g. libffi/configure.ac's libffi_cv_hidden_visibility_attribute test.

This is what I was looking for. Thanks! The updated patch is attached. 
Is it OK now?



Jie


	libmudflap/
	* mf-impl.h (__mf_get_state, __mf_set_state): Don't use
	__thread when TLS support is emulated.
	* mf-hooks3.c (__mf_get_state, __mf_set_state): Likewise.
	* mf-runtime.c (__mf_state_1): Likewise.
	* configure.ac: Use GCC_CHECK_EMUTLS.
	* configure: Regenerate.
	* config.h.in: Regenerate.

	config/
	* tls.m4 (GCC_CHECK_EMUTLS): Define.


Index: libmudflap/mf-impl.h
===
--- libmudflap/mf-impl.h	(revision 143076)
+++ libmudflap/mf-impl.h	(working copy)
@@ -244,7 +244,7 @@ extern pthread_mutex_t __mf_biglock;
 #define UNLOCKTH() do {} while (0)
 #endif
 
-#if defined(LIBMUDFLAPTH) && !defined(HAVE_TLS)
+#if defined(LIBMUDFLAPTH) && (!defined(HAVE_TLS) || defined(USE_EMUTLS))
 extern enum __mf_state_enum __mf_get_state (void);
 extern void __mf_set_state (enum __mf_state_enum);
 #else
Index: libmudflap/mf-hooks3.c
===
--- libmudflap/mf-hooks3.c	(revision 143076)
+++ libmudflap/mf-hooks3.c	(working copy)
@@ -78,7 +78,7 @@ DECLARE(int, pthread_create, pthread_t *
 /* Multithreading support hooks.  */
 
 
-#ifndef HAVE_TLS
+#if !defined(HAVE_TLS) || defined(USE_EMUTLS)
 /* We don't have TLS.  Ordinarily we could use pthread keys, but since we're
commandeering malloc/free that presents a few problems.  The first is that
we'll recurse from __mf_get_state to pthread_setspecific to malloc back to
@@ -217,7 +217,7 @@ __mf_pthread_cleanup (void *arg)
   if (__mf_opts.heur_std_data)
 __mf_unregister (&errno, sizeof (errno), __MF_TYPE_GUESS);
 
-#ifndef HAVE_TLS
+#if !defined(HAVE_TLS) || defined(USE_EMUTLS)
   struct mf_thread_data *data = __mf_find_threadinfo (0);
   if (data)
 data->used_p = 0;
Index: libmudflap/configure.ac
===
--- libmudflap/configure.ac	(revision 143076)
+++ libmudflap/configure.ac	(working copy)
@@ -265,6 +265,7 @@ fi
 
 # See if we support thread-local storage.
 GCC_CHECK_TLS
+GCC_CHECK_EMUTLS
 
 AC_CONFIG_FILES([Makefile testsuite/Makefile testsuite/mfconfig.exp])
 AC_OUTPUT
Index: libmudflap/mf-runtime.c
===
--- libmudflap/mf-runtime.c	(revision 143076)
+++ libmudflap/mf-runtime.c	(working copy)
@@ -178,7 +178,7 @@ struct __mf_options __mf_opts;
 int __mf_starting_p = 1;
 
 #ifdef LIBMUDFLAPTH
-#ifdef HAVE_TLS
+#if defined(HAVE_TLS) && !defined(USE_EMUTLS)
 __thread enum __mf_state_enum __mf_state_1 = reentrant;
 #endif
 #else
Index: config/tls.m4
===
--- config/tls.m4	(revision 143076)
+++ config/tls.m4	(working copy)
@@ -86,3 +86,21 @@ AC_DEFUN([GCC_CHECK_CC_TLS], [
 AC_DEFINE(HAVE_CC_TLS, 1,
 	  [Define to 1 if the target assembler supports thread-local storage.])
   fi])
+
+dnl Check whether TLS is emulated.
+AC_DEFUN([GCC_CHECK_EMUTLS], [
+  AC_CACHE_CHECK([whether the thread-local storage support is from emutls],
+  		 gcc_cv_use_emutls, [
+gcc_cv_use_emutls=no
+echo '__thread int a; int b; int main() { return a = b; }' > conftest.c
+if AC_TRY_COMMAND(${CC-cc} -Werror -S -o conftest.s conftest.c 1>&AS_MESSAGE_LOG_FD); then
+  if grep __emutls_get_address conftest.s > /dev/null; then
+	gcc_cv_use_emutls=yes
+  fi
+fi
+rm -f conftest.*
+])
+  if test $gcc_cv_use_emutls = yes ; then
+AC_DEFINE(USE_EMUTLS, 1,
+  	  [Define to 1 if the target use emutls for thread-local storage.])
+  fi])


Re: libmudflap and emutls question

2009-01-07 Thread Jie Zhang

Hi Paolo,

Thanks for your review!

Paolo Bonzini wrote:

+AC_COMPILE_IFELSE([__thread int a; int b; int main() { return a = b; }],
+ [if grep __emutls_get_address conftest.$ac_objext > /dev/null ; then


grepping in a binary file is not portable.  If this works it would be
better:

AC_COMPILE_IFELSE([[__thread int a; int b;
extern void __emutls_get_address();
int main() {
  __emutls_get_address();
  return a = b;
}]],
[gcc_cv_use_emutls=yes],
[gcc_cv_use_emutls=no])


This does not work. For x86_64 native gcc, the compiler output is

$ gcc -c test.c
test.c:2: warning: conflicting types for built-in function 
‘__emutls_get_address’


For Blackfin gcc, the compiler output is

$ bfin-uclinux-gcc -c test.c
test.c:2: warning: conflicting types for built-in function 
‘__emutls_get_address’


Both are the same.

I thought about using int __emutls_v.a; to trigger duplicate 
definitions, but C doesn't allow a dot in a symbol name. If there is something 
like AC_COMPILE_IFELSE that outputs an assembly file instead of an object file, 
it would be the best choice. But I don't know if it exists. There is an 
existing practice in autoconf (c.m4), which greps an object file to find 
out endianness. So I think grepping the object file might be acceptable.



Otherwise, the configury parts look fine to me.




Regards,
Jie


Re: libmudflap and emutls question

2009-01-05 Thread Jie Zhang

Hi Frank,

Frank Ch. Eigler wrote:

Jie Zhang <jie.zh...@analog.com> writes:


To break the recursive loop, one solution is to force emutls to call
the real calloc. [...]


If it were acceptable to change emutls on account of mudflap, this
sort of thing could work.  Other alternatives would include having
emutls define something in addition to HAVE_TLS that activates the
!HAVE_TLS implementation in libmudflap/mf-hooks3.c.

Thanks for your help! How about the attached patch, which follows your 
advice?



Regards,
Jie



	libmudflap/
	* mf-impl.h (__mf_get_state, __mf_set_state): Don't use
	__thread when TLS support is emulated.
	* mf-hooks3.c (__mf_get_state, __mf_set_state): Likewise.
	* mf-runtime.c (__mf_state_1): Likewise.
	* configure.ac: Use GCC_CHECK_EMUTLS.
	* configure: Regenerate.
	* config.h.in: Regenerate.

	config/
	* tls.m4 (GCC_CHECK_EMUTLS): Define.


Index: libmudflap/mf-impl.h
===
--- libmudflap/mf-impl.h	(revision 143074)
+++ libmudflap/mf-impl.h	(working copy)
@@ -244,7 +244,7 @@
 #define UNLOCKTH() do {} while (0)
 #endif
 
-#if defined(LIBMUDFLAPTH) && !defined(HAVE_TLS)
+#if defined(LIBMUDFLAPTH) && (!defined(HAVE_TLS) || defined(USE_EMUTLS))
 extern enum __mf_state_enum __mf_get_state (void);
 extern void __mf_set_state (enum __mf_state_enum);
 #else
Index: libmudflap/mf-hooks3.c
===
--- libmudflap/mf-hooks3.c	(revision 143074)
+++ libmudflap/mf-hooks3.c	(working copy)
@@ -78,7 +78,7 @@
 /* Multithreading support hooks.  */
 
 
-#ifndef HAVE_TLS
+#if !defined(HAVE_TLS) || defined(USE_EMUTLS)
 /* We don't have TLS.  Ordinarily we could use pthread keys, but since we're
commandeering malloc/free that presents a few problems.  The first is that
we'll recurse from __mf_get_state to pthread_setspecific to malloc back to
@@ -217,7 +217,7 @@
   if (__mf_opts.heur_std_data)
 __mf_unregister (&errno, sizeof (errno), __MF_TYPE_GUESS);
 
-#ifndef HAVE_TLS
+#if !defined(HAVE_TLS) || defined(USE_EMUTLS)
   struct mf_thread_data *data = __mf_find_threadinfo (0);
   if (data)
 data->used_p = 0;
Index: libmudflap/configure.ac
===
--- libmudflap/configure.ac	(revision 143074)
+++ libmudflap/configure.ac	(working copy)
@@ -265,6 +265,7 @@
 
 # See if we support thread-local storage.
 GCC_CHECK_TLS
+GCC_CHECK_EMUTLS
 
 AC_CONFIG_FILES([Makefile testsuite/Makefile testsuite/mfconfig.exp])
 AC_OUTPUT
Index: libmudflap/mf-runtime.c
===
--- libmudflap/mf-runtime.c	(revision 143074)
+++ libmudflap/mf-runtime.c	(working copy)
@@ -178,7 +178,7 @@
 int __mf_starting_p = 1;
 
 #ifdef LIBMUDFLAPTH
-#ifdef HAVE_TLS
+#if defined(HAVE_TLS) && !defined(USE_EMUTLS)
 __thread enum __mf_state_enum __mf_state_1 = reentrant;
 #endif
 #else
Index: config/tls.m4
===
--- config/tls.m4	(revision 143074)
+++ config/tls.m4	(working copy)
@@ -86,3 +86,20 @@
 AC_DEFINE(HAVE_CC_TLS, 1,
 	  [Define to 1 if the target assembler supports thread-local storage.])
   fi])
+
+dnl Check whether TLS is emulated.
+AC_DEFUN([GCC_CHECK_EMUTLS], [
+  AC_CACHE_CHECK([whether the thread-local storage support is from emutls],
+  		 gcc_cv_use_emutls, [
+AC_COMPILE_IFELSE([__thread int a; int b; int main() { return a = b; }],
+		  [if grep __emutls_get_address conftest.$ac_objext >/dev/null ; then
+			 gcc_cv_use_emutls=yes
+		   else
+			 gcc_cv_use_emutls=no
+		   fi
+		  ], [gcc_cv_use_emutls=no])]
+)
+  if test $gcc_cv_use_emutls = yes ; then
+AC_DEFINE(USE_EMUTLS, 1,
+  	  [Define to 1 if the target use emutls for thread-local storage.])
+  fi])
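
The preprocessor conditions changed by this patch amount to one decision: use a real __thread variable only when TLS is native, and go through accessor functions otherwise. The following is a minimal sketch of that selection, assuming GCC-style __thread. The demo_ names are hypothetical, and the fallback storage is a single static rather than libmudflap's per-thread table, so it only illustrates the conditional structure.

```c
/* Hypothetical stand-in for libmudflap's state enum. */
enum demo_state { demo_active, demo_reentrant, demo_in_malloc };

#if defined(HAVE_TLS) && !defined(USE_EMUTLS)
/* Native TLS: reading a __thread variable never calls the allocator. */
static __thread enum demo_state demo_state_1 = demo_reentrant;
static enum demo_state demo_get_state(void) { return demo_state_1; }
static void demo_set_state(enum demo_state s) { demo_state_1 = s; }
#else
/* No TLS, or emulated TLS: use accessor functions so that reading the
   state does not go through __emutls_get_address.  A single static is
   used here only to keep the sketch short; it is NOT per-thread. */
static enum demo_state demo_state_fallback = demo_reentrant;
static enum demo_state demo_get_state(void) { return demo_state_fallback; }
static void demo_set_state(enum demo_state s) { demo_state_fallback = s; }
#endif
```

Callers use demo_get_state and demo_set_state in both configurations, which is why only the #if lines need to change when USE_EMUTLS is introduced.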


libmudflap and emutls question

2009-01-04 Thread Jie Zhang

Hi,

I encountered a recursive call problem between libmudflap and emutls 
when testing libmudflap for Blackfin. I think this issue affects all 
targets without native TLS.


One libmudflap test case in the testsuite calls __wrap_calloc. In 
__wrap_calloc, __mf_get_state reads __mf_state_1 to see whether it is 
in_malloc, reentrant or active. With emutls, HAVE_TLS is now defined as 1, 
so __mf_state_1 is declared __thread. When emutls emulates TLS for 
__mf_state_1, __emutls_get_address calls __wrap_calloc again, and the 
calls recurse. One way to break the loop is to force emutls to call the 
real calloc, but I don't know how to do this. 
Could someone help me with this?


Thanks!


Jie
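
The recursion described in this message can be reproduced in miniature. In the sketch below, every name is hypothetical and none of it is libmudflap or libgcc code: a wrapped allocator consults a state accessor, and the accessor allocates its storage on first use, much as __emutls_get_address does. A plain static guard flag, which can be read without allocating, is one possible way to break the cycle. It is not thread-safe and is meant only to show the control flow.

```c
#include <stdlib.h>

static int demo_wrap_calls;      /* how many times the wrapper was entered */
static int demo_in_state_lookup; /* guard: set while the state lookup runs */
static int *demo_state_slot;     /* lazily allocated, like an emutls slot */

static void *demo_wrap_calloc(size_t n, size_t size);

/* Like emulated TLS: the first access allocates storage, and that
   allocation goes through the wrapped allocator. */
static int *demo_get_state_slot(void)
{
    if (demo_state_slot == NULL) {
        demo_in_state_lookup = 1;
        demo_state_slot = demo_wrap_calloc(1, sizeof *demo_state_slot);
        demo_in_state_lookup = 0;
    }
    return demo_state_slot;
}

static void *demo_wrap_calloc(size_t n, size_t size)
{
    demo_wrap_calls++;
    /* Without this check the lookup below would re-enter this function
       before demo_state_slot is set, and recurse until the stack is gone. */
    if (demo_in_state_lookup)
        return calloc(n, size);   /* stands in for the "real" calloc */
    int *state = demo_get_state_slot();
    (void) state;                 /* a real wrapper would branch on it */
    return calloc(n, size);
}
```

The first call enters the wrapper twice (once for the caller, once for the slot allocation); later calls enter it once.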


Re: Set environment variable on remote target

2008-07-18 Thread Jie Zhang

Andreas Schwab wrote:

Jie Zhang [EMAIL PROTECTED] writes:


So we have to use single quotes. The updated patch is attached.


This will break if the value can contain single quotes.

How about using double quotes but escaping ", \, $, and ` with a 
backslash? The patch is attached.



Jie
diff --git a/lib/rsh.exp b/lib/rsh.exp
index 1a207a8..d846887 100644
--- a/lib/rsh.exp
+++ b/lib/rsh.exp
@@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} {
 #
 proc rsh_exec { boardname program pargs inp outp } {
 global timeout
+global remote_env
 
 verbose Executing $boardname:$program $pargs  $inp
 
@@ -261,7 +262,14 @@ proc rsh_exec { boardname program pargs inp outp } {
 	set inp /dev/null
 }
 
-set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
+set remote_envs 
+foreach envvar [array names remote_env] {
+	set tmp_env $remote_env($envvar)
+	# Escape ", \, $, and `, which cannot be protected by double quotes.
+	regsub -all (\[\$`]) $tmp_env \\1 tmp_env
+	set remote_envs $remote_envs $envvar=\$tmp_env\
+}
+set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
 set status [lindex $ret 0]
 set output [lindex $ret 1]
 
diff --git a/lib/utils.exp b/lib/utils.exp
index 6c9ff98..6325dd8 100644
--- a/lib/utils.exp
+++ b/lib/utils.exp
@@ -414,3 +414,12 @@ proc getenv { var } {
 }
 }
 
+#
+# Set an environment variable remotely
+#
+proc remote_setenv { var val } {
+global remote_env
+
+set remote_env($var) $val
+}
+
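
The escaping rule this patch aims for can be stated independently of Tcl: inside shell double quotes, only the double quote, backslash, dollar sign and backquote need a preceding backslash. The C function below is a minimal sketch of that rule, not a translation of the regsub call in the patch; the name demo_sh_dq_escape and the conservative buffer-size check are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Copy VALUE to OUT, putting a backslash before each of the four
   characters that stay special inside shell double quotes: " \ $ `
   Returns the length written, or -1 if OUT might be too small.
   The size check assumes the worst case (every character escaped). */
static int demo_sh_dq_escape(const char *value, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size < strlen(value) * 2 + 1)
        return -1;
    for (; *value != '\0'; value++) {
        if (strchr("\"\\$`", *value) != NULL)
            out[n++] = '\\';
        out[n++] = *value;
    }
    out[n] = '\0';
    return (int) n;
}
```

The caller is still responsible for wrapping the result in double quotes, and for any further quoting layer, such as the single-quoted sh -c argument used in rsh_exec.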


Re: Set environment variable on remote target

2008-07-17 Thread Jie Zhang

Andreas Schwab wrote:

Jie Zhang [EMAIL PROTECTED] writes:


Andreas Schwab wrote:

Jie Zhang [EMAIL PROTECTED] writes:


@@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
set inp /dev/null
 }
 -set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program
$pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
+set remote_envs 
+foreach envvar [array names remote_env] {
+   set remote_envs $remote_envs $envvar=$remote_env($envvar)

That needs to do proper quoting to protect shell meta characters.


Thanks for pointing this out. A new patch is attached. Is the quoting right?


That won't protect all meta characters.  Inside double quotes the dollar
sign, backslash and backquote are still special.


So we have to use single quotes. The updated patch is attached.

Thanks,
Jie
diff --git a/lib/rsh.exp b/lib/rsh.exp
index 1a207a8..94122e8 100644
--- a/lib/rsh.exp
+++ b/lib/rsh.exp
@@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} {
 #
 proc rsh_exec { boardname program pargs inp outp } {
 global timeout
+global remote_env
 
 verbose Executing $boardname:$program $pargs  $inp
 
@@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
 	set inp /dev/null
 }
 
-set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
+set remote_envs 
+foreach envvar [array names remote_env] {
+	set remote_envs $remote_envs $envvar='$remote_env($envvar)'
+}
+set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
 set status [lindex $ret 0]
 set output [lindex $ret 1]
 
diff --git a/lib/utils.exp b/lib/utils.exp
index 6c9ff98..6325dd8 100644
--- a/lib/utils.exp
+++ b/lib/utils.exp
@@ -414,3 +414,12 @@ proc getenv { var } {
 }
 }
 
+#
+# Set an environment variable remotely
+#
+proc remote_setenv { var val } {
+global remote_env
+
+set remote_env($var) $val
+}
+
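
Plain single quoting, as in this version of the patch, protects every character except the single quote itself. A common shell idiom for that remaining case, which this patch does not use, is to close the quote, emit an escaped quote, and reopen: '\''. The sketch below shows the idiom in C for one level of quoting only; the name demo_sh_single_quote and the conservative size check are assumptions, and nesting inside another single-quoted sh -c argument would need further handling.

```c
#include <stddef.h>
#include <string.h>

/* Write VALUE to OUT wrapped in single quotes for a POSIX shell.
   Each embedded ' becomes '\'' (close quote, escaped quote, reopen).
   Returns the length written, or -1 if OUT might be too small.
   The size check assumes the worst case (every character is a quote). */
static int demo_sh_single_quote(const char *value, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size < strlen(value) * 4 + 3)
        return -1;
    out[n++] = '\'';
    for (; *value != '\0'; value++) {
        if (*value == '\'') {
            memcpy(out + n, "'\\''", 4);
            n += 4;
        } else {
            out[n++] = *value;
        }
    }
    out[n++] = '\'';
    out[n] = '\0';
    return (int) n;
}
```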


Set environment variable on remote target

2008-07-16 Thread Jie Zhang
The libmudflap tests set the environment variable MUDFLAP_OPTIONS=-viol-segv 
before testing so that violations are promoted to SIGSEGV signals. 
Otherwise the exit value would be 0 even if the test has violations. 
The libmudflap testsuite depends on the exit value of each test to decide 
whether it PASSes or FAILs. MUDFLAP_OPTIONS is set in DejaGNU by


  setenv MUDFLAP_OPTIONS -viol-segv

which works fine for native testing. But for remote cross testing, 
setenv does not help, and I cannot find an existing mechanism in DejaGNU. 
So I want to use a global array, remote_env. For remote cross testing, the 
environment variable is added to this array, and the variables in the array 
are then set when the test case is executed remotely. I wrote a draft patch 
to show what I mean, which is attached. In the mudflap testsuite, replace 
each setenv with


if { ![is_remote target] } {
setenv MUDFLAP_OPTIONS -viol-segv
} else {
remote_setenv MUDFLAP_OPTIONS -viol-segv
}

Is this the right way to do it, or is there an existing method that I 
missed?



Thanks,

Jie
diff --git a/lib/rsh.exp b/lib/rsh.exp
index 1a207a8..df3b3d1 100644
--- a/lib/rsh.exp
+++ b/lib/rsh.exp
@@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} {
 #
 proc rsh_exec { boardname program pargs inp outp } {
 global timeout
+global remote_env
 
 verbose Executing $boardname:$program $pargs  $inp
 
@@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
 	set inp /dev/null
 }
 
-set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
+set remote_envs 
+foreach envvar [array names remote_env] {
+	set remote_envs $remote_envs $envvar=$remote_env($envvar)
+}
+set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
 set status [lindex $ret 0]
 set output [lindex $ret 1]
 
diff --git a/lib/utils.exp b/lib/utils.exp
index 6c9ff98..8523973 100644
--- a/lib/utils.exp
+++ b/lib/utils.exp
@@ -414,3 +414,33 @@ proc getenv { var } {
 }
 }
 
+#
+# Set an environment variable remotely
+#
+proc remote_setenv { var val } {
+global remote_env
+
+set remote_env($var) $val
+}
+
+#
+# Unset an environment variable remotely
+#
+proc remote_unsetenv { var } {
+global remote_env
+unset remote_env($var)
+}
+
+#
+# Get a value from an environment variable remotely
+#
+proc remote_getenv { var } {
+global remote_env
+
+if {[info exists remote_env($var)]} {
+	return $remote_env($var)
+} else {
+	return 
+}
+}
+


Re: Set environment variable on remote target

2008-07-16 Thread Jie Zhang

Andreas Schwab wrote:

Jie Zhang [EMAIL PROTECTED] writes:


@@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
set inp /dev/null
 }
 
-set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]

+set remote_envs 
+foreach envvar [array names remote_env] {
+   set remote_envs $remote_envs $envvar=$remote_env($envvar)


That needs to do proper quoting to protect shell meta characters.


Thanks for pointing this out. A new patch is attached. Is the quoting right?

I also dropped remote_getenv from the new patch, since I realized that 
it cannot actually read an environment variable on the remote target. 
remote_unsetenv was dropped for the same reason.


The patch for gcc is also attached for review.


Jie


diff --git a/lib/rsh.exp b/lib/rsh.exp
index 1a207a8..13be541 100644
--- a/lib/rsh.exp
+++ b/lib/rsh.exp
@@ -225,6 +225,7 @@ proc rsh_upload {desthost srcfile destfile} {
 #
 proc rsh_exec { boardname program pargs inp outp } {
 global timeout
+global remote_env
 
 verbose Executing $boardname:$program $pargs  $inp
 
@@ -261,7 +262,11 @@ proc rsh_exec { boardname program pargs inp outp } {
 	set inp /dev/null
 }
 
-set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
+set remote_envs 
+foreach envvar [array names remote_env] {
+	set remote_envs $remote_envs $envvar=\$remote_env($envvar)\
+}
+set ret [local_exec $RSH $rsh_useropts $hostname sh -c '$remote_envs $program $pargs \\; echo XYZ\\\${?}ZYX' $inp $outp $timeout]
 set status [lindex $ret 0]
 set output [lindex $ret 1]
 
diff --git a/lib/utils.exp b/lib/utils.exp
index 6c9ff98..6325dd8 100644
--- a/lib/utils.exp
+++ b/lib/utils.exp
@@ -414,3 +414,12 @@ proc getenv { var } {
 }
 }
 
+#
+# Set an environment variable remotely
+#
+proc remote_setenv { var val } {
+global remote_env
+
+set remote_env($var) $val
+}
+
Index: testsuite/libmudflap.c/cfrags.exp
===
--- testsuite/libmudflap.c/cfrags.exp	(revision 136236)
+++ testsuite/libmudflap.c/cfrags.exp	(working copy)
@@ -13,7 +13,11 @@
 ${srcdir}/libmudflap.c/hook*.c \
 ${srcdir}/libmudflap.c/pass*.c]] {
 	set bsrc [file tail $srcfile]
-	setenv MUDFLAP_OPTIONS -viol-segv
+	if { ![is_remote target] } {
+	setenv MUDFLAP_OPTIONS -viol-segv
+	} else {
+	remote_setenv MUDFLAP_OPTIONS -viol-segv
+	}
 	dg-runtest $srcfile $flags -fmudflap -lmudflap
 }
 }
Index: testsuite/libmudflap.c/externs.exp
===
--- testsuite/libmudflap.c/externs.exp	(revision 136236)
+++ testsuite/libmudflap.c/externs.exp	(working copy)
@@ -23,7 +23,11 @@
 set test externs-21 linkage ${flags}
 if [string match  $l3]  { pass $test } { fail $test }
 
-setenv MUDFLAP_OPTIONS -viol-segv
+if { ![is_remote target] } {
+	setenv MUDFLAP_OPTIONS -viol-segv
+} else {
+	remote_setenv MUDFLAP_OPTIONS -viol-segv
+}
 
 remote_spawn host ./externs-12.exe
 set l5 [remote_wait host 10]
Index: testsuite/libmudflap.c++/ctors.exp
===
--- testsuite/libmudflap.c++/ctors.exp	(revision 136236)
+++ testsuite/libmudflap.c++/ctors.exp	(working copy)
@@ -28,7 +28,11 @@
 set test ctors-21 linkage ${flags}
 if [string match  $l3]  { pass $test } { fail $test }
 
-setenv MUDFLAP_OPTIONS -viol-segv
+if { ![is_remote target] } {
+	setenv MUDFLAP_OPTIONS -viol-segv
+} else {
+	remote_setenv MUDFLAP_OPTIONS -viol-segv
+}
 
 remote_spawn host ./ctors-12.exe
 set l5 [remote_wait host 10]
Index: testsuite/libmudflap.c++/c++frags.exp
===
--- testsuite/libmudflap.c++/c++frags.exp	(revision 136236)
+++ testsuite/libmudflap.c++/c++frags.exp	(working copy)
@@ -14,7 +14,11 @@
 foreach flags $MUDFLAP_FLAGS {
 foreach srcfile [lsort [glob -nocomplain ${srcdir}/libmudflap.c++/*frag.cxx]] {
 	set bsrc [file tail $srcfile]
-	setenv MUDFLAP_OPTIONS -viol-segv
+	if { ![is_remote target] } {
+	setenv MUDFLAP_OPTIONS -viol-segv
+	} else {
+	remote_setenv MUDFLAP_OPTIONS -viol-segv
+	}
 	dg-runtest $srcfile $flags -fmudflap -lmudflap
 }
 }
Index: testsuite/libmudflap.cth/cthfrags.exp
===
--- testsuite/libmudflap.cth/cthfrags.exp	(revision 136236)
+++ testsuite/libmudflap.cth/cthfrags.exp	(working copy)
@@ -9,7 +9,11 @@
 foreach flags $MUDFLAP_FLAGS {
 foreach srcfile [lsort [glob -nocomplain ${srcdir}/libmudflap.cth/*.c]] {
 	set bsrc [file tail $srcfile]
-	setenv MUDFLAP_OPTIONS -viol-segv
+	if { ![is_remote target] } {
+	setenv MUDFLAP_OPTIONS -viol-segv
+	} else {
+	remote_setenv MUDFLAP_OPTIONS -viol-segv

Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

libstdc++ tries to avoid link tests when configured with newlib. But I
saw this when working on bfin port gcc:

checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... no
checking how to hardcode library paths into programs... immediate
checking for shl_load... configure: error: Link tests are not allowed 
after GCC_NO_EXECUTABLES.

make[1]: *** [configure-target-libstdc++-v3] Error 1
make[1]: Leaving directory 
`/home/jie/blackfin-sources/build43/gcc_build-4.3'

make: *** [all] Error 2

I got this when building bfin-elf-gcc with patched gcc and newlib in the
same tree. I found LT_SYS_DLOPEN_SELF does link tests for shl_load after
GCC_NO_EXECUTABLES. The call path is

libstdc++-v3/configure.ac AM_PROG_LIBTOOL -> libtool.m4 LT_INIT ->
_LT_SETUP -> _LT_LANG_C_CONFIG -> LT_SYS_DLOPEN_SELF

How about the patch below, which uses LT_SYS_DLOPEN_SELF only when not
cross compiling?


Jie


* libtool.m4 (_LT_LANG_C_CONFIG): Only use LT_SYS_DLOPEN_SELF
when not cross compiling.

Index: libtool.m4
===
--- libtool.m4  (revision 128569)
+++ libtool.m4  (working copy)
@@ -5117,7 +5117,9 @@
   _LT_LINKER_SHLIBS($1)
   _LT_SYS_DYNAMIC_LINKER($1)
   _LT_LINKER_HARDCODE_LIBPATH($1)
-  LT_SYS_DLOPEN_SELF
+  if test $cross_compiling = no; then
+LT_SYS_DLOPEN_SELF
+  fi
   _LT_CMD_STRIPLIB

   # Report which library types will actually be built




Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Bernd Schmidt wrote:

Jie Zhang wrote:

libstdc++ tries to avoid link tests when configured with newlib. But I
saw this when working on bfin port gcc:

checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... no
checking how to hardcode library paths into programs... immediate
checking for shl_load... configure: error: Link tests are not allowed 
after GCC_NO_EXECUTABLES.

make[1]: *** [configure-target-libstdc++-v3] Error 1
make[1]: Leaving directory 
`/home/jie/blackfin-sources/build43/gcc_build-4.3'

make: *** [all] Error 2

I got this when building bfin-elf-gcc with patched gcc and newlib in the
same tree. I found LT_SYS_DLOPEN_SELF does link tests for shl_load after
GCC_NO_EXECUTABLES. The call path is

libstdc++-v3/configure.ac AM_PROG_LIBTOOL -> libtool.m4 LT_INIT ->
_LT_SETUP -> _LT_LANG_C_CONFIG -> LT_SYS_DLOPEN_SELF


I saw something similar, but managed to make it go away.  I don't 
remember how exactly (this kind of issue seems to happen to me all the 
time, for different reasons each time), but I think the actual problem 
was that you need to ensure that gcc_no_link doesn't get set.  That's a 
test somewhat earlier in configure.



But by design if gcc_no_link = no, link tests should be avoided.

Jie


Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Bernd Schmidt wrote:

Jie Zhang wrote:

But by design if gcc_no_link = no, link tests should be avoided.


??? I would have thought gcc_no_link = yes means link tests are avoided.


Oops, I meant gcc_no_link = yes.

Jie


Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Rask Ingemann Lambertsen wrote:

On Tue, Sep 18, 2007 at 07:55:45PM +0800, Jie Zhang wrote:

libstdc++ tries to avoid link tests when configured with newlib. But I
saw this when working on bfin port gcc:



From config.log:

/home/rask/build/gcc-bfin-unknown-elf/gcc/../ld/ld-new:
cannot open linker script file bf532.ld: No such file or directory

$ grep -F -e bf532.ld gcc/config/bfin/*
gcc/config/bfin/elf.h:%{!T*:%{!msim:%{mcpu=bf531:-Tbf531.ld}%{mcpu=bf532:-Tbf532.ld}
 \
gcc/config/bfin/elf.h:%{!mcpu=*:-Tbf532.ld}}}

The file bf532.ld is nowhere to be found in gcc or newlib/libgloss.

I have not pushed our recent newlib/libgloss changes upstream yet. 
For now you can get the latest Blackfin port of newlib/libgloss from


http://blackfin.uclinux.org/gf/project/toolchain/scmsvn

But if it cannot find bf532.ld, it should avoid further link tests.


Jie


Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Daniel Jacobowitz wrote:

On Tue, Sep 18, 2007 at 03:27:18PM +0200, Bernd Schmidt wrote:

Jie Zhang wrote:

Bernd Schmidt wrote:

Jie Zhang wrote:

But by design if gcc_no_link = no, link tests should be avoided.

??? I would have thought gcc_no_link = yes means link tests are avoided.


Oops, I meant gcc_no_link = yes.
Stupid double negatives.  Okay, so then your problem is that gcc_no_link=yes.  
Find out why it's setting that.


It always does for newlib.  The libtool tests are relatively recent
(from some recent autotools upgrade).


Yes, it was added by

http://sourceware.org/ml/binutils/2007-05/msg00247.html


Jie



Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Bernd Schmidt wrote:

Jie Zhang wrote:

Bernd Schmidt wrote:

Jie Zhang wrote:

But by design if gcc_no_link = no, link tests should be avoided.


??? I would have thought gcc_no_link = yes means link tests are avoided.


Oops, I meant gcc_no_link = yes.


Stupid double negatives.  Okay, so then your problem is that 
gcc_no_link=yes.  Find out why it's setting that.


bfin-elf-gcc -mfdpic failed to link a simple test case because code is 
put into L1 instruction SRAM and data is put into L1 data SRAM, but the 
Blackfin immediate-offset load instruction cannot reach the GOT since the 
gap between instruction SRAM and data SRAM is too large. Using -msim as 
the default would pass this test case and build gcc without problems, but I 
would like bfin-elf-gcc to target a hardware board by default. Using -fPIC 
as the default is not good either, since -fpic is enough for any real 
application. So I would like to avoid the link test for shl_load when 
GCC_NO_EXECUTABLES is in effect.



Jie


Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Bernd Schmidt wrote:

Jie Zhang wrote:
bfin-elf-gcc -mfdpic failed to link a simple test case because code is 
put into L1 instruction sram and data is put into L1 data sram, but 
Blackfin immediate offset load instruction cannot access GOT since the 
gap between instruction sram and data sram is too large. Using -msim 
as default will pass this test case and build gcc without problem but 
I would like bfin-elf-gcc target hardware board by default.


Any chance we could target it in such a way as to not put everything in 
L1 by default?  I think it's stupid to have configure tests failing for 
such a reason.


But then we would need to add SDRAM initialization code to the crt files, 
which is usually provided by applications when needed.



Jie



Re: Link tests after GCC_NO_EXECUTABLES

2007-09-18 Thread Jie Zhang

Rask Ingemann Lambertsen wrote:

/home/rask/build/gcc-bfin-unknown-elf/gcc/../ld/ld-new:
crt532.o: No such file: No such file or directory


   I sorted that out by using your config/bfin/elf.h, but there's something
weird. The first time configure runs, it will complain about
GCC_NO_EXECUTABLES but there's no (obvious) clue as to why in config.log. If
I run make again, it begins to build libstdc++ but fails with this:

Making all in libsupc++
make[4]: Entering directory 
`/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/libsupc++'
/bin/sh ../libtool --tag CXX --tag disable-shared --mode=compile 
/home/rask/build/gcc-bfin-unknown-elf/./gcc/xgcc -shared-libgcc 
-B/home/rask/build/gcc-bfin-unknown-elf/./gcc -nostdinc++ 
-L/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/src 
-L/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/src/.libs 
-nostdinc -B/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/newlib/ 
-isystem 
/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/newlib/targ-include 
-isystem /n/12/rask/src/all/newlib/libc/include 
-B/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libgloss/bfin 
-L/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libgloss/libnosys 
-L/n/12/rask/src/all/libgloss/bfin -B/usr/local/bfin-unknown-elf/bin/ 
-B/usr/local/bfin-unknown-elf/lib/ -isystem /usr/local/bfin-unknown-elf/include 
-isystem /usr/local/bfin-unknown-elf/sys-include 
-L/home/rask/build/gcc-bfin-unknown-elf/./ld 
-I/n/12/rask/src/all/libstdc++-v3/../gcc 
-I/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/include/bfin-unknown-elf
 -I/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/include 
-I/n/12/rask/src/all/libstdc++-v3/libsupc++  -fno-implicit-templates  -Wall 
-Wextra -Wwrite-strings -Wcast-qual  -fdiagnostics-show-location=once  
-ffunction-sections -fdata-sections  -g -O2-c -o array_type_info.lo 
/n/12/rask/src/all/libstdc++-v3/libsupc++/array_type_info.cc

/bin/sh: ../libtool: No such file or directory
make[4]: *** [array_type_info.lo] Error 127
make[4]: Leaving directory 
`/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3/libsupc++'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory 
`/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3'
make[2]: *** [all] Error 2
make[2]: Leaving directory 
`/home/rask/build/gcc-bfin-unknown-elf/bfin-unknown-elf/libstdc++-v3'
make[1]: *** [all-target-libstdc++-v3] Error 2
make[1]: Leaving directory `/home/rask/build/gcc-bfin-unknown-elf'
make: *** [all] Error 2

I don't know why this happens to bfin and not to the other newlib targets.

I guess it might be caused by different multilib settings in our local 
(not FSF) newlib and FSF gcc. I have committed a patch in FSF gcc which 
makes FSF gcc use the same multilib setting as our local gcc. Sorry 
about that. Please try again.



Jie



Re: GCC 4.3.0 Status Report (2007-08-09)

2007-08-24 Thread Jie Zhang

Jie Zhang wrote:

On 8/10/07, Mark Mitchell [EMAIL PROTECTED] wrote:

Are there any folks out there who have projects for Stage 1 or Stage 2
that they are having trouble getting reviewed?  Any comments
re. timing for Stage 3?


I have many bfin port patches which have not been merged
upstream. I hope I can push them out by the end of next week.

I have sent out all my patches (11). Three of them have been reviewed and 
committed; the others are being reviewed. I will have no access to a 
computer this weekend. I'll be back next Monday or Tuesday.



Jie


Re: Division by zero

2007-02-10 Thread Jie Zhang

On 2/10/07, Robert Dewar [EMAIL PROTECTED] wrote:

Ian Lance Taylor wrote:
 Jie Zhang [EMAIL PROTECTED] writes:

 But now gcc seems to optimize it away. For the following function:

 $ cat t.c
 #include <limits.h>
 void foo (int rc)
 {
   int x = rc / INT_MAX;
   x = 4 / x;
 }

 I believe we still keep division by zero in general.  In your example
 it gets optimized away because it is dead code.  Nothing uses x.

And it is certainly reasonable to do this optimization given that
the result of the division is undefined in C. In Ada, such a
division has well defined semantics (raise an exception), but
it is interesting to note that the optimization is valid in
Ada as well, since there is a special rule that basically says
you don't need to evaluate an expression if the only reason for
doing so is to see if it raises a predefined exception. That
rule is precisely to deal with cases like this.


The code I posted in my first email is from libgloss/libnosys/_exit.c.
It's used to cause an exception deliberately. From your replies, it
seems it should find another way to do that.

Thanks,
Jie


Re: Division by zero

2007-02-10 Thread Jie Zhang

On 2/10/07, Robert Dewar [EMAIL PROTECTED] wrote:

Jie Zhang wrote:

 The code I posted in my first email is from libgloss/libnosys/_exit.c.
 It's used to cause an exception deliberately. From your replies, it
 seems it should find another way to do that.

Any code that tries to raise an exception deliberately is certainly
depending on undefined behavior, so it has to be very careful about
how it is written!


I'm going to use an asm ().

Jie


Re: Division by zero

2007-02-10 Thread Jie Zhang

On 2/10/07, Steven Bosscher [EMAIL PROTECTED] wrote:

On 2/10/07, Jie Zhang [EMAIL PROTECTED] wrote:
 The code I posted in my first email is from libgloss/libnosys/_exit.c.
 It's used to cause an exception deliberately. From your replies, it
 seems it should find another way to do that.

Maybe you can use __builtin_trap() ?


The exception generated by __builtin_trap () might already be used for
stack limit checking. Reusing it in _exit () might confuse the
exception handler, I think.

Jie


Re: Division by zero

2007-02-10 Thread Jie Zhang

On 2/11/07, Paolo Bonzini [EMAIL PROTECTED] wrote:


 I'm going to use an asm ().

Yeah, an asm volatile ("" : : "r" (x) : ) should please GCC and still be
portable to different platforms.


I was thinking of using an asm () in each port to cause an exception
specific to that port, so that a divide-by-zero exception can be
distinguished from a termination exception. However,
asm volatile ("" : : "r" (x)) is good for ports that do not provide their
own asm. I'll mention it in the email to the newlib mailing list.

Thanks,
Jie
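
The suggestion in this exchange can be sketched as follows. The function mirrors foo from the original report, with an empty GCC extended asm that takes the quotient as an input operand, so the optimizer has to treat the value as used and should not delete the division as dead code. This assumes GCC-compatible extended asm syntax; demo_foo is a hypothetical name, and the return value exists only so the non-zero path can be checked. Dividing by zero remains undefined behavior in C, so whether the zero path actually raises an exception is still target-dependent.

```c
#include <limits.h>

/* Mirror of foo() from the report, with the result kept alive. */
static int demo_foo(int rc)
{
    int x = rc / INT_MAX;   /* 0 for most rc; 1 for INT_MAX, -1 near INT_MIN */
    x = 4 / x;              /* divides by zero whenever the line above gave 0 */

    /* Empty asm: emits no instructions, but makes x an input to something
       the compiler cannot see through, so the division feeding it is not
       treated as dead even if the caller ignores the return value. */
    __asm__ volatile ("" : : "r" (x));
    return x;
}
```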


Division by zero

2007-02-09 Thread Jie Zhang

Hi,

Division by zero is undefined. We chose to keep it:

http://gcc.gnu.org/ml/gcc-patches/2001-06/msg01068.html

But now gcc seems to optimize it away. For the following function:

$ cat t.c
#include <limits.h>
void foo (int rc)
{
 int x = rc / INT_MAX;
 x = 4 / x;
}

$ gcc -O2 -S t.c
$ cat t.s
   .file   t.c
   .text
   .p2align 4,,15
.globl foo
   .type   foo, @function
foo:
   pushl   %ebp
   movl%esp, %ebp
   popl%ebp
   ret
   .size   foo, .-foo
   .ident  GCC: (GNU) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
   .section.note.GNU-stack,,@progbits

Do we now choose to optimize it away?


Jie


-fvtable-gc

2006-09-27 Thread Jie Zhang

It should have been removed from c.opt in the patch:

http://gcc.gnu.org/ml/gcc-patches/2003-07/msg02660.html.

But it's still in trunk and branches 3.4/4.0/4.1/4.2.


Jie


Re: apply for the relevant forms

2006-06-07 Thread Jie Zhang

On 6/5/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Our Co. have a new 32b embedded processor, and we have ported the gcc
backend for it(support c/c++), now we want add its backend code into gcc
packages. i read the Contributing to GCC  pages that we must sign some
forms, can you kindly send the forms to me?


http://gcc.gnu.org/contribute.html says: "It's a good idea to
send [EMAIL PROTECTED] a copy of your request.", not the gcc mailing
list.

And [EMAIL PROTECTED] and gcc@gcc.gnu.org are the same thing, so there is 
no need to send to both.

Jie


Re: Object size checking builtin test case and uClibc

2006-03-17 Thread Jie Zhang

Jie Zhang wrote:

Hi,

In gcc.c-torture/execute/builtins/lib/chk.c, vsnprintf () is defined
using vsprintf (). While vsnprintf () in uClibc is defined using

   ^ Sorry, should be vsprintf

vsnprintf (). When testing on uClinux with uClibc, pr23484-chk.c
failed because these two functions will call each other recursively
and finally overflow the stack. How can this problem be fixed, In the
test case or in uClibc?



Jie
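
Reading this message with its inline correction applied, the test suite's vsnprintf is built on vsprintf while uClibc's vsprintf is built on vsnprintf, so each call bounces between the two until the stack overflows. The sketch below models only that call pattern. All names are hypothetical, no formatting is done, and a depth limit stands in for the stack running out.

```c
#define DEMO_DEPTH_LIMIT 64   /* stand-in for "the stack ran out" */

static int demo_depth;        /* how deep the mutual recursion got */

static int demo_libc_vsprintf(void);

/* Test-suite side: the bounded function is implemented via the
   unbounded one. */
static int demo_test_vsnprintf(void)
{
    if (++demo_depth >= DEMO_DEPTH_LIMIT)
        return -1;            /* a real program would overflow the stack */
    return demo_libc_vsprintf();
}

/* C-library side: the unbounded function is implemented via the bounded
   one, which resolves to the test suite's override. */
static int demo_libc_vsprintf(void)
{
    if (++demo_depth >= DEMO_DEPTH_LIMIT)
        return -1;
    return demo_test_vsnprintf();
}
```

Neither function ever reaches code that does the formatting, so the only way out of the cycle is for one side to stop overriding, or stop calling, the other.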


Re: Object size checking builtin test case and uClibc

2006-03-17 Thread Jie Zhang
On 3/18/06, Jie Zhang [EMAIL PROTECTED] wrote:
 Jie Zhang wrote:
  Hi,
 
  In gcc.c-torture/execute/builtins/lib/chk.c, vsnprintf () is defined
  using vsprintf (). While vsnprintf () in uClibc is defined using
 ^ Sorry, should be vsprintf
  vsnprintf (). When testing on uClinux with uClibc, pr23484-chk.c
  failed because these two functions will call each other recursively
  and finally overflow the stack. How can this problem be fixed, In the
  test case or in uClibc?
 
I removed snprintf () and vsnprintf () from
gcc.c-torture/execute/builtins/lib/chk.c. All the test cases in
builtins.exp pass for bfin port gcc 4.1 on uClinux. Can we remove
these two non-_chk versions from chk.c and use the ones from C
libraries?

Thanks,
Jie


Help needed on libgcc.a

2006-03-07 Thread Jie Zhang
Hi,

I'm adding some assembly floating point functions to the bfin port. These
functions are much faster than those in fp-bit.c. However, they relax
some IEEE floating point rules for checking inputs against
NaN. So I think we had better call them only when -ffast-math or
-ffinite-math-only is given.

What I want is to tell gcc to link a specific libgcc-fast-fp.a, which
contains these assembly functions, if -ffast-math or
-ffinite-math-only is given, and otherwise to link the ordinary libgcc.a.
Is there any existing target doing this or something similar?


Thanks,
Jie


Which program can I use to see VCG dumping from GCC

2006-01-26 Thread Jie Zhang
Hi,

This page, http://gcc.gnu.org/news/egcs-vcg.html, says that
"If you view these files using a suitable program, you'll get output
similar to the following". However, when I use xvcg to view
test.c.01.sibling.vcg, xvcg fails with:

Wait.aLine 5: attribute T_Co_hidden currently not implemented !
...aLine 406: attribute T_Co_hidden currently not implemented !
.eSegmentation fault

I'm using the latest Ubuntu Dapper. The gcc version is "gcc (GCC) 4.0.3
20060115 (prerelease) (Ubuntu 4.0.2-7ubuntu1)". test.c is just the
example used on the above HTML page. Which other program should I use
to view the VCG dump?

Thanks,
Jie


Re: Which program can I use to see VCG dumping from GCC

2006-01-26 Thread Jie Zhang
On 1/26/06, Jie Zhang [EMAIL PROTECTED] wrote:
 Hi,

 In this page http://gcc.gnu.org/news/egcs-vcg.html, it's said that
 If you view these files using a suitable program, you'll get output
 similar to the following. However, when I use xvcg to view
 test.c.01.sibling.vcg, xvcg errors:

 Wait.aLine 5: attribute T_Co_hidden currently not implemented !
 ...aLine 406: attribute T_Co_hidden currently not implemented !
 .eSegmentation fault

 I'm using latest Ubuntu Dapper. gcc version is gcc (GCC) 4.0.3
 20060115 (prerelease) (Ubuntu 4.0.2-7ubuntu1). test.c is just the
 example used in the above HTML page. Which other program should I use
 to view the VCG dump?

Oops! It seems to be a bug in Ubuntu's xvcg. I built one from the source
package and it works well.

Jie