Re: [PATCH, i386]: Fix MULTILIB_MATCHES option typos in t-rtems

2013-11-06 Thread Ralf Corsepius

On 11/05/2013 07:54 PM, Uros Bizjak wrote:

On Tue, Nov 5, 2013 at 7:43 PM, Ralf Corsepius ralf.corsep...@gmail.com wrote:


2013-11-05  Uros Bizjak  ubiz...@gmail.com

  * config/i386/t-rtems (MULTILIB_MATCHES): Fix option typos.

Committed to SVN mainline as obvious, should be backported to other
release branches.

Yikes! Could you please apply it to all open GCC branches?

Done.

Thanks,

Ralf


Re: [PATCH, SH] Implement builtin_strlen

2013-11-06 Thread Christian Bruel

On 11/05/2013 02:12 PM, Kaz Kojima wrote:
 Christian Bruel christian.br...@st.com wrote:
 No regressions for sh-none-elf. OK for trunk ?
 OK.

 Regards,
   kaz

thanks, applied together with the cleanup referenced earlier and a
slight variable renaming (start_addr-curr_addr, end_addr-start_addr)
for readability as obvious,

Christian


Commit: MSP430: Make -mmcu option define C preprocessor symbol and linker scripts

2013-11-06 Thread Nick Clifton
Hi Guys,

  I am checking in the patch below to extend the -mmcu command line
  option of the MSP430 port so that it defines a C preprocessor symbol
  based on the name of the MCU part and it also changes the linker
  command line so that it will search for linker scripts specific to
  that part.

Cheers
  Nick

gcc/ChangeLog
2013-11-06  Nick Clifton  ni...@redhat.com

* config/msp430/msp430.h (TARGET_CPU_CPP_BUILTINS): Define the
name returned by msp430_mcu_name.
(LIB_SPEC): If a -T option has not been specified then set a
default, mcu-specific, linker script.
* config/msp430/t-msp430 (MULTILIB_MATCHES): Add more mcu names.
* config/msp430/msp430.c (msp430x_names): Likewise.
Alpha sort the names for ease of comparison.
(msp430_mcu_name): New function:  Returns a string suitable for
use as a C preprocessor symbol based upon the name of the MCU
being targeted.
(msp430_option_override): Accept msp430x and msp430xv2 as generic
mcu names.
* config/msp430/msp430-protos.h (msp430_mcu_name): Prototype.

Index: gcc/config/msp430/msp430-protos.h
===
--- gcc/config/msp430/msp430-protos.h	(revision 20)
+++ gcc/config/msp430/msp430-protos.h	(working copy)
@@ -35,6 +35,7 @@
 int	msp430_initial_elimination_offset (int, int);
 boolmsp430_is_interrupt_func (void);
 const char * msp430x_logical_shift_right (rtx);
+const char * msp430_mcu_name (void);
 bool	msp430_modes_tieable_p (enum machine_mode, enum machine_mode);
 void	msp430_output_labelref (FILE *, const char *);
 void	msp430_register_pragmas (void);
Index: gcc/config/msp430/msp430.c
===
--- gcc/config/msp430/msp430.c	(revision 20)
+++ gcc/config/msp430/msp430.c	(working copy)
@@ -109,69 +109,87 @@
 #undef  TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE		msp430_option_override
 
+/* This list provides a set of known MCU names that support the MSP430X
+   ISA.  The list has been provided by TI and should be kept in sync with
+   the ones in:
+   
+ gcc/config/msp430/t-msp430
+ gas/config/tc-msp430.c
+
+   FIXME: We ought to read the names in from a file at run, rather
+   than having them built in like this.  Also such a file should be
+   shared with gas.  */
+
 static const char * msp430x_names [] =
 {
-  msp430x,   /* Generic name.  */
-  msp430xv2, /* Generic name.  */
-
-  /* These names have been provided by TI and match the names currently
- supported by GAS.
-
- NB/ This list should be kept in sync with the ones in:
-   gcc/config/msp430/t-msp430
-   gas/config/tc-msp430.c
-
- FIXME: We ought to read the names in from a file at run, rather
- than having them built in like this.  Also such a file should be
- shared with gas.  */
-  
-  msp430cg4616, msp430cg4617, msp430cg4618, msp430cg4619, msp430f2416,
-  msp430f2417,  msp430f2418,  msp430f2419,  msp430f2616,  msp430f2617,
-  msp430f2618,  msp430f2619,  msp430f47126, msp430f47127, msp430f47163,
-  msp430f47173, msp430f47183, msp430f47193, msp430f47166, msp430f47176,
-  msp430f47186, msp430f47196, msp430f47167, msp430f47177, msp430f47187,
-  msp430f47197, msp430f46161, msp430f46171, msp430f46181, msp430f46191,
-  msp430f4616,  msp430f4617,  msp430f4618,  msp430f4619,  msp430fg4616,
-  msp430fg4617, msp430fg4618, msp430fg4619, msp430f5418,  msp430f5419,
-  msp430f5435,  msp430f5436,  msp430f5437,  msp430f5438,  msp430f5418a,
-  msp430f5419a, msp430f5435a, msp430f5436a, msp430f5437a, msp430f5438a,
-  msp430f5212,  msp430f5213,  msp430f5214,  msp430f5217,  msp430f5218,
-  msp430f5219,  msp430f5222,  msp430f5223,  msp430f5224,  msp430f5227,
-  msp430f5228,  msp430f5229,  msp430f5304,  msp430f5308,  msp430f5309,
-  msp430f5310,  msp430f5340,  msp430f5341,  msp430f5342,  msp430f5324,
-  msp430f5325,  msp430f5326,  msp430f5327,  msp430f5328,  msp430f5329,
+  cc430f5123,   cc430f5125,   cc430f5133,   cc430f5135,   cc430f5137,
+  cc430f5143,   cc430f5145,   cc430f5147,   cc430f6125,   cc430f6126,
+  cc430f6127,   cc430f6135,   cc430f6137,   cc430f6143,   cc430f6145,
+  cc430f6147,   msp430bt5190, msp430cg4616, msp430cg4617, msp430cg4618,
+  msp430cg4619, msp430f2416,  msp430f2417,  msp430f2418,  msp430f2419,
+  msp430f2616,  msp430f2617,  msp430f2618,  msp430f2619,  msp430f4616,
+  msp430f46161, msp430f4617,  msp430f46171, msp430f4618,  msp430f46181,
+  msp430f4619,  msp430f46191, msp430f47126, msp430f47127, msp430f47163,
+  msp430f47166, msp430f47167, msp430f47173, msp430f47176, msp430f47177,
+  msp430f47183, msp430f47186, msp430f47187, msp430f47193, msp430f47196,
+  msp430f47197, msp430f5131,  msp430f5132,  msp430f5151,  msp430f5152,
+  msp430f5171,  msp430f5172,  msp430f5212,  msp430f5213,  msp430f5214,
+  msp430f5217,  msp430f5218,  msp430f5219,  msp430f5222,  msp430f5223,
+  msp430f5224,  msp430f5227,  msp430f5228,  msp430f5229,  

Re: patch implementing a new pass for register-pressure relief through live range shrinkage

2013-11-06 Thread Richard Biener
On Tue, Nov 5, 2013 at 4:35 PM, Vladimir Makarov vmaka...@redhat.com wrote:
   I'd like to add a new experimental optimization to the trunk.  This
 optimization was discussed on RA BOF of this summer GNU Cauldron.

   It is a register pressure relief through live-range shrinkage.  It
 is implemented on the scheduler base and uses register-pressure insn
 scheduling infrastructure.  By rearranging insns we shorten pseudo
 live-ranges and increase a chance to them be assigned to a hard
 register.

   The code looks pretty simple but there are a lot of works behind
 this patch.  I've tried about ten different versions of this code
 (different heuristics for two currently existing register-pressure
 algorithms).

   I think it is *upto target maintainers* to decide to use or not to
 use this optimization for their targets.  I'd recommend to use this at
 least for x86/x86-64.  I think any OOO processor with small or
 moderate register file which does not use the 1st insn scheduling
 might benefit from this too.

   On SPEC2000 for x86/x86-64 (I use Haswell processor, -O3 with
 general tuning), the optimization usage results in smaller code size
 in average (for floating point and integer benchmarks in 32- and
 64-bit mode).  The improvement better visible for SPECFP2000 (although
 I have the same improvement on x86-64 SPECInt2000 but it might be
 attributed mostly mcf benchmark unstability).  It is about 0.5% for
 32-bit and 64-bit mode.  It is understandable, as the optimization has
 more opportunities to improve the code on longer BBs.  Different from
 other heuristic optimizations, I don't see any significant worse
 performance.  It gives practically the same or better performance (a
 few benchmarks imporoved by 1% or more upto 3%).

   The single but significant drawback is additional compilation time
 (4%-6%) as the 1st insn scheduling pass is quite expensive.  So I'd
 recommend target maintainers to switch it on only for -Ofast.

Generally I'd not recomment viewing -Ofast as -O4 but as -O3
plus generally unsafe optimizations.  So I'd not enable it for -Ofast
but for -O3 - possibly also with -Os if indeed the main motivation is
also code-size improvements (-Os is a similar beast as -O3, spend
as much time as you can on optimizing size).

Btw, thanks for working on this.  How does it relate to
-fsched-pressure?  Does it treat all register classes the same?
On x86 mostly the few fixed registers for some of the integer pipeline
instructions hurt, x86_64 has enough general and FP registers?

Richard.

  If
 somebody finds that the optimization works on processors which uses
 1st insn scheduling by default (in which I slightly doubt), we could
 improve the compilation time by reusing data for this optimization and
 the 1st insn scheduling.

   Any comments, questions, thoughts are appreciated.

 2013-11-05  Vladimir Makarov  vmaka...@redhat.com

 * tree-pass.h (make_pass_live_range_shrinkage): New external.
 * timevar.def (TV_LIVE_RANGE_SHRINKAGE): New.
 * sched-rgn.c (gate_handle_live_range_shrinkage): New.
 (rest_of_handle_live_range_shrinkage): Ditto
 (class pass_live_range_shrinkage): Ditto.
 (pass_data_live_range_shrinkage): Ditto.
 (make_pass_live_range_shrinkage): Ditto.
 * sched-int.h (sched_relief_p): New external.
 * sched-deps.c (create_insn_reg_set): Make void return value.
 * passes.def: Add pass_live_range_shrinkage.
 * ira.c (update_equiv_regs): Don't move if
 flag_live_range_shrinkage.
 * haifa-sched.c (sched_relief_p): New.
 (rank_for_schedule): Add code for pressure relief through live
 range shrinkage.
 (schedule_insn): Print more debug info.
 (sched_init): Setup SCHED_PRESSURE_WEIGHTED for pressure relief
 through live range shrinkage.
 * doc/invoke.texi (-flive-range-shrinkage): New.
 * common.opt (flive-range-shrinkage): New.



Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Richard Biener
On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

Well, but clearly you can see that bounds for p == NULL are
useless and thus there is no problem with the transform.

Also sth weird is going on as your arg_bnd call gets [return slot optimization].
It seems that your POINTER_BOUNDS_TYPE is seen as aggregate somehow.

Richard.

 Ilya

 Richard.



Re: [patch] Move block_in_transaction out of gimple.h

2013-11-06 Thread Richard Biener
On Tue, Nov 5, 2013 at 10:08 PM, Jeff Law l...@redhat.com wrote:
 On 11/05/13 14:06, Andrew MacLeod wrote:

 Looks like another location of convenience perhaps...

 Anyway, block_in_transaction (bb) really belongs in basic-block.h... The
 only oddity is that it also checks flag_tm...   Is this really
 necessary? One would think the flag would never be set if flag_tm wasn't
 on...

 In any case, basic-block.h is already picking options.h up through
 function.h which includes tm.h.   And regardless, it does belong here...

 Bootstraps on x86_64-unknown-linux-gnu and current running regressions.
 Assuming its clean, OK?

 OK.  I wouldn't lose any sleep of that test were removed.   In fact, please
 remove it :-)  I wouldn't want someone to see that code and think hey,
 BB_IN_TRANSACTION is guarded, let's reuse that bit for something else when
 not compiling for TM.  We've done far too much of that through the years ;(

I think it's not computed for !flag_tm, but I agree.

Richard.


 Jeff


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Richard Biener
On Tue, Nov 5, 2013 at 10:18 PM, Jeff Law l...@redhat.com wrote:
 On 10/31/13 10:26, David Malcolm wrote:

 The gimple statement types are currently implemented using a hand-coded
 C inheritance scheme, with a union gimple_statement_d holding the
 various possible structs for a statement.

 The following series of patches convert it to a C++ hierarchy, using the
 existing structs, eliminating the union. The gimple typedef changes
 from being a
(union gimple_statement_d *)
 to being a:
(struct gimple_statement_base *)

 There are no virtual functions in the new code: the sizes of the various
 structs are unchanged.

 It makes use of is-a.h, using the as_a T template function to
 perform downcasts, which are checked (via gcc_checking_assert) in an
 ENABLE_CHECKING build, and are simple casts in an unchecked build,
 albeit it in an inlined function rather than a macro.

 For example, one can write:

gimple_statement_phi *phi =
  as_a gimple_statement_phi (gsi_stmt (gsi));

 and then directly access the fields of the phi, as a phi.  The existing
 accessor functions in gimple.h become somewhat redundant in this
 scheme, but are preserved.

 The earlier versions of the patches made all of the types GTY((user))
 and provided hand-written implementations of the gc and pch marker
 routines.  In this new version we rely on the support for simple
 inheritance that I recently added to gengtype, by adding a desc
 to the GTY marking for the base class, and a tag to the marking
 for all of the concrete subclasses.  (I say class, but all the types
 remain structs since their fields are all publicly accessible).

 As noted in the earlier patch, I believe this is a superior scheme to
 the C implementation:

* We can get closer to compile-time type-safety, checking the gimple
  code once and downcasting with an as_a, then directly accessing
  fields, rather than going through accessor functions that check
  each time.  In some places we may want to replace a gimple with
  a subclass e.g. phis are always of the phi subclass, to get full
  compile-time type-safety.

* This scheme is likely to be easier for newbies to understand.

* Currently in gdb, dereferencing a gimple leads to screenfuls of text,
  showing all the various union values.  With this, you get just the
 base
  class, and can cast it to the appropriate subclass.

* With this, we're working directly with the language constructs,
  rather than rolling our own, and thus other tools can better
  understand the code. (e.g. doxygen).

 Again, as noted in the earlier patch series, the names of the structs
 are rather verbose.  I would prefer to also rename them all to eliminate
 the _statement component:
gimple_statement_base - gimple_base
gimple_statement_phi  - gimple_phi
gimple_statement_omp  - gimple_omp
 etc, but I didn't do this to mimimize the patch size.  But if the core
 maintainers are up for that, I can redo the patch series with that
 change also, or do that as a followup.

 The patch is in 6 parts; all of them are needed together.

 And that's part of the problem.  There's understandable resistance to (for
 example) the as_a casting.

 There's a bit of natural tension between the desire to keep patches small
 and self-contained and the size/scope of the changes necessary to do any
 serious reorganization work.  This set swings too far in the latter
 direction :-)

 Is there any way to go forward without the is_a/as_a stuff?  ie, is there're
 a simpler step towards where we're trying to go that allows most of this to
 go forward now rather than waiting?

We decided to move to C++.  As part of a later discussion we decided
to go with a single general dynamic-casting style, mimicing the real
C++ variant which is dynamic_cast  ... .  Which resulted in
is-a.h.

So yes, we've decided to go C++ so we have to live with certain
uglinesses of that decisions (and maybe over time those uglinesses
will fade away and we get used to it and like it).

Thus, there isn't another option besides using the is-a.h machinery
and enabling and using RTTI.  Sticking to C for gimple doesn't seem
to be consistent with the decision to move to C++.

Oh, I'm not saying I'm a big fan of as_a / is_a or C++ in general
as it plays out right now.  But well, we've had the discussion and
had a decision.

Execute.

Richard.



* Patch 1 of 6: This patch adds inheritance to the various gimple
  types, eliminating the initial baseclass fields, and eliminating the
  union gimple_statement_d.   All the types remain structs.  They
  become marked with GTY(()), gaining GSS_ tag values.

* Patch 2 of 6: This patch ports various accessor functions within
  gimple.h to the new scheme.

* Patch 3 of 6: This patch is autogenerated by refactor_gimple.py
  from https://github.com/davidmalcolm/gcc-refactoring-scripts
  There is a test suite test_refactor_gimple.py which may give a
  

Re: [PATCH] PR58985: testcase error.

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 3:51 AM, Wei Mi w...@google.com wrote:
 +Release manager.

 Thanks, committed to trunk as r204438. Ok for 4.8 branch?

Ok.

 On Tue, Nov 5, 2013 at 11:19 AM, Jeff Law l...@redhat.com wrote:
 On 11/04/13 12:07, Wei Mi wrote:

 Hi,

 This is to fix testcase error reported in PR58985.

 The intention of the testcase was to ensure there was no REG_EQUIV
 notes generated for a reg which was used in a paradoxical subreg. When
 target was x86, there was subreg generated so I omitted to add the
 subreg in the regexp pattern. However there is no subreg generated for
 target cris-axis-elf, so REG_EQUIV should be allowed.

 Is it ok for trunk and gcc-4.8 branch?

 Thanks,
 Wei Mi.

 2013-11-04  Wei Mi  w...@google.com

  PR regression/58985
  * testsuite/gcc.dg/pr57518.c: Add subreg in regexp pattern.

 Fine for the trunk.  Release manager's call for the branch.

 jeff



Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 8:40 AM, Mingjie Xing mingjie.x...@gmail.com wrote:
 Hi,

 Changes:

 * c/c-decl.c (pop_scope): Skip volatile variables while emit
 warnings for unused variables.

 Tested on i686-pc-linux-gnu.

 OK?

You miss a testcase.

Also why should the warning be omitted for unused automatic
volatile variables?  They cannot be used in any way.

Richard.




 Mingjie


[ARM, AArch64] Make aarch-common.c files more robust.

2013-11-06 Thread James Greenhalgh

Hi,

This patch is a respin of the aarch-common improvements to use a generic
search to find SETs and the variety of shifts, rather than relying on the
hope that we will find something well formed.

I've bootstrapped the patch on a chromebook with -mcpu=cortex-a7,
 -mcpu=cortex-a9 and -mcpu=cortex-a15 with no issues, and run
the aarch64-none-elf regression suite with no issues. And I've thrown
a bunch of smoke tests at this over all supported -mcpu values
with no issues, and only a few minor scheduling differences where the
old code was too liberal and did the wrong thing.

Is this OK?

Thanks,
James

---
gcc/

2013-11-06  James Greenhalgh  james.greenha...@arm.com

* config/arm/aarch-common.c
(search_term): New typedef.
(shift_rtx_costs): New array.
(arm_rtx_shift_left_p): New.
(arm_find_sub_rtx_with_search_term): Likewise.
(arm_find_sub_rtx_with_code): Likewise.
(arm_early_load_addr_dep): Add sanity checking.
(arm_no_early_alu_shift_dep): Likewise.
(arm_no_early_alu_shift_value_dep): Likewise.
(arm_no_early_mul_dep): Likewise.
(arm_no_early_store_addr_dep): Likewise.
diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index 3111ae9..201e581 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -31,30 +31,139 @@
 #include c-family/c-common.h
 #include rtl.h
 
-/* Return nonzero if the CONSUMER instruction (a load) does need
-   PRODUCER's value to calculate the address.  */
+typedef struct
+{
+  rtx_code search_code;
+  rtx search_result;
+  bool find_any_shift;
+} search_term;
+
+/* Return TRUE if X is either an arithmetic shift left, or
+   is a multiplication by a power of two.  */
+static bool
+arm_rtx_shift_left_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
 
-int
-arm_early_load_addr_dep (rtx producer, rtx consumer)
+  if (code == MULT  CONST_INT_P (XEXP (x, 1))
+   exact_log2 (INTVAL (XEXP (x, 1)))  0)
+return true;
+
+  if (code == ASHIFT)
+return true;
+
+  return false;
+}
+
+static rtx_code shift_rtx_codes[] =
+  { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
+ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
+
+/* Callback function for arm_find_sub_rtx_with_code.
+   DATA is safe to treat as a SEARCH_TERM, ST.  This will
+   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
+   RTX with that code.  If it is, write SEARCH_RESULT in ST
+   and return 1.  Otherwise, or if we have been passed a NULL_RTX
+   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
+   anything which can reasonably be described as a SHIFT RTX.  */
+static int
+arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
-
-  if (GET_CODE (value) == COND_EXEC)
-value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
+  search_term *st = (search_term *) data;
+  rtx_code pattern_code;
+  int found = 0;
+
+  gcc_assert (pattern);
+  gcc_assert (st);
+
+  /* Poorly formed patterns can really ruin our day.  */
+  if (*pattern == NULL_RTX)
+return 0;
+
+  pattern_code = GET_CODE (*pattern);
+
+  if (st-find_any_shift)
 {
-  if (GET_CODE (XVECEXP (addr, 0, 0)) == RETURN)
-addr = XVECEXP (addr, 0, 1);
+  unsigned i = 0;
+
+  /* Left shifts might have been canonicalized to a MULT of some
+	 power of two.  Make sure we catch them.  */
+  if (arm_rtx_shift_left_p (*pattern))
+	found = 1;
   else
-addr = XVECEXP (addr, 0, 0);
+	for (i = 0; i  ARRAY_SIZE (shift_rtx_codes); i++)
+	  if (pattern_code == shift_rtx_codes[i])
+	found = 1;
+}
+
+  if (pattern_code == st-search_code)
+found = 1;
+
+  if (found)
+st-search_result = *pattern;
+
+  return found;
+}
+
+/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
+static rtx
+arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
+{
+  search_term st;
+  int result = 0;
+
+  gcc_assert (pattern != NULL_RTX);
+  st.search_code = code;
+  st.search_result = NULL_RTX;
+  st.find_any_shift = find_any_shift;
+  result = for_each_rtx (pattern, arm_find_sub_rtx_with_search_term, st);
+  if (result)
+return st.search_result;
+  else
+return NULL_RTX;
+}
+
+/* Traverse PATTERN looking for any sub-rtx which looks like a shift.  */
+static rtx
+arm_find_shift_sub_rtx (rtx pattern)
+{
+  return arm_find_sub_rtx_with_code (pattern, ASHIFT, true);
+}
+
+/* PRODUCER and CONSUMER are two potentially dependant RTX.  PRODUCER
+   (possibly) contains a SET which will provide a result we can access
+   using the SET_DEST macro.  We will place the RTX which would be
+   written by PRODUCER in SET_SOURCE.
+   Similarly, CONSUMER (possibly) contains a SET which has an operand
+   we can access using 

[v3 patch] Fix spelling in regex headers

2013-11-06 Thread Jonathan Wakely
This simply fixes s/boundry/boundary/, which I wanted to do first
before some more regex refactoring I'm planning.

2013-11-06  Jonathan Wakely  jwakely@gmail.com

* include/bits/regex_automaton.h (_S_opcode_word_boundry): Rename to
_S_opcode_word_boundary.
* include/bits/regex_automaton.tcc: Likewise.
* include/bits/regex_executor.h (__detail::_Executor::_M_word_boundry):
Rename to _M_word_boundary.
* include/bits/regex_executor.tcc: Likewise.

Tested x86_64-linux, committed to trunk.
commit 9c72a2889302b33b4e13643c11414e8278a9ba34
Author: Jonathan Wakely jwakely@gmail.com
Date:   Wed Nov 6 09:13:10 2013 +

* include/bits/regex_automaton.h (_S_opcode_word_boundry): Rename to
_S_opcode_word_boundary.
* include/bits/regex_automaton.tcc: Likewise.
* include/bits/regex_executor.h (__detail::_Executor::_M_word_boundry):
Rename to _M_word_boundary.
* include/bits/regex_executor.tcc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex_automaton.h 
b/libstdc++-v3/include/bits/regex_automaton.h
index 4fb5556..e630512 100644
--- a/libstdc++-v3/include/bits/regex_automaton.h
+++ b/libstdc++-v3/include/bits/regex_automaton.h
@@ -56,7 +56,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _S_opcode_backref,
   _S_opcode_line_begin_assertion,
   _S_opcode_line_end_assertion,
-  _S_opcode_word_boundry,
+  _S_opcode_word_boundary,
   _S_opcode_subexpr_lookahead,
   _S_opcode_subexpr_begin,
   _S_opcode_subexpr_end,
@@ -83,7 +83,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _StateIdT  _M_quant_index;
  // for _S_opcode_alternative or _S_opcode_subexpr_lookahead
  _StateIdT  _M_alt;
- // for _S_opcode_word_boundry or _S_opcode_subexpr_lookahead or
+ // for _S_opcode_word_boundary or _S_opcode_subexpr_lookahead or
  // quantifiers(ungreedy if set true)
  bool   _M_neg;
};
@@ -197,7 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _StateIdT
   _M_insert_word_bound(bool __neg)
   {
-   _StateT __tmp(_S_opcode_word_boundry);
+   _StateT __tmp(_S_opcode_word_boundary);
__tmp._M_neg = __neg;
return _M_insert_state(__tmp);
   }
diff --git a/libstdc++-v3/include/bits/regex_automaton.tcc 
b/libstdc++-v3/include/bits/regex_automaton.tcc
index c15e3e9..258d22d 100644
--- a/libstdc++-v3/include/bits/regex_automaton.tcc
+++ b/libstdc++-v3/include/bits/regex_automaton.tcc
@@ -93,7 +93,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __ostr  __id   [label=\  __id  \\nLINE_END \];\n
  __id   -   _M_next   [label=\epsilon\];\n;
  break;
-   case _S_opcode_word_boundry:
+   case _S_opcode_word_boundary:
  __ostr  __id   [label=\  __id  \\nWORD_BOUNDRY 
  _M_neg  \];\n
  __id   -   _M_next   [label=\epsilon\];\n;
diff --git a/libstdc++-v3/include/bits/regex_executor.h 
b/libstdc++-v3/include/bits/regex_executor.h
index 12db47b..57b3108 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -138,7 +138,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
   bool
-  _M_word_boundry(_State_CharT, _TraitsT __state) const;
+  _M_word_boundary(_State_CharT, _TraitsT __state) const;
 
   bool
   _M_lookahead(_State_CharT, _TraitsT __state);
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc 
b/libstdc++-v3/include/bits/regex_executor.tcc
index 0c42189..b310aba 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -257,8 +257,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if (_M_at_end())
_M_dfs__match_mode(__state._M_next);
  break;
-   case _S_opcode_word_boundry:
- if (_M_word_boundry(__state) == !__state._M_neg)
+   case _S_opcode_word_boundary:
+ if (_M_word_boundary(__state) == !__state._M_neg)
_M_dfs__match_mode(__state._M_next);
  break;
// Here __state._M_alt offers a single start node for a sub-NFA.
@@ -344,11 +344,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 }
 
-  // Return whether now is at some word boundry.
+  // Return whether now is at some word boundary.
   templatetypename _BiIter, typename _Alloc, typename _TraitsT,
 bool __dfs_mode
 bool _Executor_BiIter, _Alloc, _TraitsT, __dfs_mode::
-_M_word_boundry(_State_CharT, _TraitsT __state) const
+_M_word_boundary(_State_CharT, _TraitsT __state) const
 {
   // By definition.
   bool __ans = false;


[PATCH GCC]Improve IVOPT to handle outside and inside loop iv uses differently in GCC

2013-11-06 Thread bin.cheng
Hi,
GCC IVOPT has a problem that it doesn't differentiate between iv uses
outside of loop from inside ones.  It computes cost for outside iv use just
like inside ones, which is wrong because outside iv use should be computed
along loop exit edge and the cost should be amortized against loop iteration
number.  Lastly, the computation of outside iv use is inserted in loop,
rather along loop exit edge.

This is interesting since usually outside iv use should be handled
differently, or it hurts optimization in several ways like:
1) Wrong iv candidate is chosen because of inaccurate cost.
2) Extra computation in loop itself is redundant.
3) Extra code computing outside iv use in loop may increases register
pressure because both iv variables before and after stepping could be alive
at same time.
4) IVOPT generates code that it expects to stay as is, passes like DOM tends
to break this because of the extra computation.  This hurts targets with
auto-increment support more.

This patch fixes the problem.  Bootstrap and test on x86/x86_64/arm.
Richard, Zdenek,  does this look reasonable?

Thanks,
bin


gcc/testsuite/ChangeLog
2013-11-06  Bin Cheng  bin.ch...@arm.com

* gcc.dg/tree-ssa/ivopts-outside-loop-use-1.c: New test.

2013-11-06  Bin Cheng  bin.ch...@arm.com

* tree-ssa-loop-ivopts.c (iv_use_p, iv_cand_p): Move around.
(iv_use_location): New.
(struct iv): Remove have_use_for and use_id.  New fields
inside_use, outside_uses_vec and use_loc.
(struct iv_use): New fields exit_edge and outside_use_p.
(struct edge_info, edge_info_p): New.
(struct ivopts_data): New fields alloc_uses_vecs, changed_bbs,
edge_map and edge_obstack.
(init_edge_info, get_edge_info): New.
(dump_use): Dump outside/inside information for iv use.
(tree_ssa_iv_optimize_init): Init new fields.
(alloc_iv): Init new fields.  Remove have_use_for and use_id.
(record_use): New parameter.  Record information for outside loop
iv use.
(find_interesting_uses_op): New parameter.  Handle inside and
outside loop iv uses.
(find_interesting_uses_cond, idx_record_use): Pass new argument.
(find_interesting_uses_address): Likewise.
(find_interesting_uses_stmt, create_new_iv): likewise.
(find_interesting_uses_outside): Rename exit to exit_edge.
New parameter normal_edge_p.  Pass new argument.
(find_interesting_uses): Find iv uses in two passes.
(get_computation): Compute cost at right position for iv use.
(determine_use_iv_cost_generic): Ajust cost for outside loop iv use.
(rewrite_use_outside_of_loop): New.
(rewrite_use): Call rewrite_use_outside_of_loop.
(remove_unused_ivs): Keep computation only for inner iv use.
(free_loop_data):  Reset outside_uses_vec in various iv structures.
Free alloc_uses_vecs and edge_map.
(tree_ssa_iv_optimize_finalize): Free and reset.
(tree_ssa_iv_optimize_loop): Create edge_map.
(tree_ssa_iv_optimize): Call rewrite_into_loop_closed_ssa if
necessary.Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-outside-loop-use-1.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ivopts-outside-loop-use-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-outside-loop-use-1.c   (revision 0)
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-ivopts } */
+
+struct header {
+unsigned long size;
+struct header **ptr;
+};
+
+extern unsigned long top;
+
+int foo (struct header *start, int size)
+{
+  struct header *hd = start;
+  top = 0;
+  while (hd != 0  top+1  size)
+{
+  ++top;
+  hd = (hd-ptr)[4];
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump iv2tmp.* = .* + \[0-9\]* ivopts } } */
+/* { dg-final { cleanup-tree-dump ivopts } } */
Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 204117)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -56,6 +56,24 @@ along with GCC; see the file COPYING3.  If not see
4) The trees are transformed to use the new variables, the dead code is
   removed.
 
+   Induction variable is possibly used outside of loop.  The algorithm
+   handles outside iv use like:
+  -- Marks iv use outside of loop by recording the corresponding exit
+edge along which the iv is used.  If an induction variable is
+used both inside and outside of loop, outside use will be ignored,
+as if it is escalated into inside one.
+  -- Computes the use cost against location of the end of source block
+of the exit edge, and amortize the cost against loop iteration
+number.
+  -- Inserts computation of outside iv use along the exit edge, rather
+than in basic block of current loop.
+   TODO: For degenerable phi node 

Re: [v3 patch] Implement C++14 N3655 TransformationTraits Redux

2013-11-06 Thread Paolo Carlini

Hi

On 11/06/2013 12:01 AM, Jonathan Wakely wrote:

I also changed a few tests to use static_assert instead of VERIFY so
they only need to be compiled, not executed, saving a few milliseconds
when running the testsuite ;-)
Good idea, anyway. I may get around to do more of that + those missing 
testcases ;)


Thanks!
Paolo.


RE: [PATCH GCC]Simplify address expression in IVOPT

2013-11-06 Thread bin.cheng


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: Tuesday, November 05, 2013 8:18 PM
 To: Bin Cheng
 Cc: GCC Patches
 Subject: Re: [PATCH GCC]Simplify address expression in IVOPT
 
 On Tue, Nov 5, 2013 at 11:13 AM, bin.cheng bin.ch...@arm.com wrote:
 
 
  -Original Message-
  From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
  ow...@gcc.gnu.org] On Behalf Of bin.cheng
  Sent: Monday, November 04, 2013 4:35 PM
  To: 'Richard Biener'
  Cc: GCC Patches
  Subject: RE: [PATCH GCC]Simplify address expression in IVOPT
 
 
 
   -Original Message-
   From: Richard Biener [mailto:richard.guent...@gmail.com]
   Sent: Wednesday, October 30, 2013 10:46 PM
   To: Bin Cheng
   Cc: GCC Patches
   Subject: Re: [PATCH GCC]Simplify address expression in IVOPT
  
   On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng bin.ch...@arm.com
 wrote:
  
   Hmm.  I think you want what get_inner_reference_aff does, using the
   return value of get_inner_reference as starting point for
  determine_base_object.
   And you don't want to restrict yourselves so much on what exprs to
  process,
   but only exclude DECL_Ps.
  Did you mean I should pass all address expressions to
  get_inner_reference_aff except the one with declaration operand?
  I am a little confused here why DECL_Ps should be handled specially?
  Seems
  it can be handled properly by the get_* function.  Anything important
  I missed?
 
   Just amend get_inner_reference_aff to return the tree base object.
  I found it's inappropriate to do this because: functions like
  get_inner_reference* only accept reference expressions, while
  base_object has to be computed for other kinds of expressions too.
  Take gimple statement a_1 = *ptr_1 as an example, the base passed
  to alloc_iv is
  ptr_1;
  the base_object computed by determine_base_object is also ptr_1,
  which we can't rely on get_inner_reference* .
 
  Also It seems the comment before determine_base_object might be
  inaccurate:
   Returns a memory object to that EXPR points. which I think is 
  Returns
  a
  pointer pointing to the main memory object to that EXPR points.
  Right?
  
   Note that this isn't really simplifying but rather lowering all
  addresses.
  
   The other changes in this patch are unrelated, right?
  Right, I should have named this message like refine cost computation
  of expressions in IVOPT just as the patch file.
 
  Hi,
  I updated the patch according to review comments.  Now it lowers all
  address expressions except ones with DECL_P to
  get_inner_reference_aff, it also computes base_object which is used
later
 to ease determine_base_object.
 
  Bootstrap and test on x86/x86_64/arm on going,  is it OK if tests pass?
 
  static struct iv *
  alloc_iv (tree base, tree step)
  {
 +  tree expr = base;
 +  tree base_object = base;
struct iv *iv = XCNEW (struct iv);
gcc_assert (step != NULL_TREE);
 
 +  /* Lower all address expressions except ones with DECL_P as oeprand.
 + By doing this:
 +   1) More accurate cost can be computed for address expressions;
 +   2) Duplicate candidates won't be created for bases in different
 +  forms, like a[0] and a.  */  STRIP_NOPS (expr);  if
 + (TREE_CODE (expr) == ADDR_EXPR  !DECL_P (TREE_OPERAND (expr, 0)))
 +{
 +  aff_tree comb;
 +  double_int size;
 +  base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0),
 + comb, size);
 +  gcc_assert (base_object != NULL_TREE);
 +  base = fold_convert (TREE_TYPE (base), aff_combination_to_tree
 (comb));
 +}
 +
iv-base = base;
 -  iv-base_object = determine_base_object (base);
 +  iv-base_object = determine_base_object (base_object);
 
 
 for less confusion do not introduce 'expr' but just base_object
 (s/expr/base_object/).  Also can you split out this change (and the
related
 tree-affine.c one) from the rest?
 
 Ok with that change assuming it passes bootstrap  testing.

It passes test.  I split it into two patches as attached.
Are these two OK?

Thanks,
bin

3-ivopt-expr_cost-a-20131106:
gcc/testsuite/ChangeLog
2013-11-06  Bin Cheng  bin.ch...@arm.com

* gcc.dg/tree-ssa/loop-2.c: Refine check condition.
* gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto.
* gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto.

2013-11-06  Bin Cheng  bin.ch...@arm.com

* tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions.
* tree-affine.c (get_inner_reference_aff): Return base.
* tree-affine.h (get_inner_reference_aff): Change prototype.

3-ivopt-expr_cost-b-20131106:
2013-11-06  Bin Cheng  bin.ch...@arm.com

* tree-ssa-loop-ivopts.c (get_shiftadd_cost): Check equality
using operand_equal_p.
(force_expr_to_var_cost): Refactor the code.  Handle type
conversion.
(split_address_cost): Call force_expr_to_var_cost.Index: gcc/testsuite/gcc.dg/tree-ssa/loop-2.c
===
--- gcc/testsuite

Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-06 Thread Richard Biener
On Tue, Nov 5, 2013 at 3:08 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:37 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:20 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:02 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/4 Richard Biener richard.guent...@gmail.com:
 On Thu, Oct 31, 2013 at 10:11 AM, Ilya Enkovich 
 enkovich@gmail.com wrote:
 Hi,

 This patch adds support Pointer Bounds Checker into c-family and LTO 
 front-ends.  The main purpose of changes in front-end is to register 
 all statically initialized objects for checker.  We also need to 
 register such objects created by compiler.

 LTO is quite specific front-end.  For LTO Pointer Bounds Checker
 support macro just means it allows instrumented code as input because
 all instrumentation is performed before code is streamed out for LTO.

 But your patch doesn't even make use of the langhook...

 Use of langhook is in previous patch
 (http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02408.html). Here is
 it's part:

 diff --git a/gcc/toplev.c b/gcc/toplev.c
 index db269b7..0eaf081 100644
 --- a/gcc/toplev.c
 +++ b/gcc/toplev.c
 @@ -1282,6 +1282,15 @@ process_options (void)
and -ftree-loop-linear));
  #endif

 +  if (flag_check_pointer_bounds)
 +{
 +  if (targetm.chkp_bound_mode () == VOIDmode)
 +   error (-fcheck-pointers is not supported for this target);
 +
 +  if (!lang_hooks.chkp_supported)
 +   flag_check_pointer_bounds = 0;
 +}
 +
/* One region RA really helps to decrease the code size.  */
if (flag_ira_region == IRA_REGION_AUTODETECT)
  flag_ira_region

 If we try to use -fcheck-pointers -flto for some unsupported language,
 code will not be instrumented.

 What's the reason to have unsupported languages?

 For some languages (e.g. Java) Pointer Bounds Checker does not make
 sense at all. Others may require additional support in front-end. The
 primary target is C family where solved problem is more critical.

 What does break if you enable it for Java or other unsupported
 languages?  That is, if LTO is able to handle a mixed Java and
 C binary then why can Java alone not handle it?

 In such case checker will produce useless overhead in Java code.
 Resulting code will probably report some bound violations because Java
 FE may generate code which seems wrong for Pointer Bounds Checker.

So it's only an issue that if you use it that it may trip over Java FE bugs?
Not a good reason to have a langhook - you can use the existing
post_options langhook for disallowing this?

Thanks,
Richard.

 Ilya


 Richard.

 Ilya

 Richard.

 Ilya


 Richard.

 Ilya


 You define CHKP as supported in LTO.  That means it has to be supported
 for all languages and thus you should drop the langhook.

 Richard.

 Thanks,
 Ilya
 --

 gcc/

 2013-10-29  Ilya Enkovich  ilya.enkov...@intel.com

 * c/c-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * cp/cp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * objc/objc-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * objcp/objcp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * lto/lto-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * c/c-parser.c (c_parser_declaration_or_fndef): Register 
 statically
 initialized decls in Pointer Bounds Checker.
 * cp/decl.c (cp_finish_decl): Likewise.
 * gimplify.c (gimplify_init_constructor): Likewise.


 diff --git a/gcc/c/c-lang.c b/gcc/c/c-lang.c
 index 614c46d..a32bc6b 100644
 --- a/gcc/c/c-lang.c
 +++ b/gcc/c/c-lang.c
 @@ -43,6 +43,8 @@ enum c_language_kind c_language = clk_c;
  #define LANG_HOOKS_INIT c_objc_common_init
  #undef LANG_HOOKS_INIT_TS
  #define LANG_HOOKS_INIT_TS c_common_init_ts
 +#undef LANG_HOOKS_CHKP_SUPPORTED
 +#define LANG_HOOKS_CHKP_SUPPORTED true

  /* Each front end provides its own lang hook initializer.  */
  struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
 index 9ccae3b..65d83c8 100644
 --- a/gcc/c/c-parser.c
 +++ b/gcc/c/c-parser.c
 @@ -1682,6 +1682,12 @@ c_parser_declaration_or_fndef (c_parser 
 *parser, bool fndef_ok,
   maybe_warn_string_init (TREE_TYPE (d), init);
   finish_decl (d, init_loc, init.value,
init.original_type, asm_name);
 +
 + /* Register all decls with initializers in Pointer
 +Bounds Checker to generate required static bounds
 +initializers.  */
 + if (DECL_INITIAL (d) != error_mark_node)
 +   chkp_register_var_initializer (d);
 }
 }
   else
 diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
 index a7fa8e4..6d138bd 100644
 --- 

Re: [C++ Patch, Java related/RFC] PR 11006

2013-11-06 Thread Andrew Haley
On 11/04/2013 05:21 PM, Jason Merrill wrote:
 Surely it should be valid to allocate a Java boolean type.  Andrew,
 how should that work?

It's not allowed.  All objects that are allocated by new must be of
class type (i.e.  instances of a subclass of java.lang.Object), but
boolean is a primitive type.

Andrew.


[C++ Patch] PR 50436

2013-11-06 Thread Paolo Carlini

Hi,

in this bug, filed by Zack, we loop forever after error in 
constant_value_1. Straightforward thing to do, detect and break out. 
Tested x86_64-linux.


Thanks,
Paolo.


/cp
2013-11-06  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50436
* init.c (constant_value_1): Break out of infinite loops during
error recovery.

/testsuite
2013-11-06  Paolo Carlini  paolo.carl...@oracle.com

PR c++/50436
* g++.dg/template/pr50436.C: New.
Index: cp/init.c
===
--- cp/init.c   (revision 204447)
+++ cp/init.c   (working copy)
@@ -1968,6 +1968,7 @@ constant_value_1 (tree decl, bool integral_p, bool
 CP_TYPE_CONST_NON_VOLATILE_P (TREE_TYPE (decl)
 {
   tree init;
+  tree save_decl = decl;
   /* If DECL is a static data member in a template
 specialization, we must instantiate it here.  The
 initializer for the static data member is not processed
@@ -2007,6 +2008,10 @@ constant_value_1 (tree decl, bool integral_p, bool
  || TREE_CODE (init) == STRING_CST)))
break;
   decl = unshare_expr (init);
+  if (decl == save_decl)
+   /* Otherwise, in some cases of excessive recursive template
+  instantation (c++/50436) we hang after error.  */
+   return error_mark_node;
 }
   return decl;
 }
Index: testsuite/g++.dg/template/pr50436.C
===
--- testsuite/g++.dg/template/pr50436.C (revision 0)
+++ testsuite/g++.dg/template/pr50436.C (working copy)
@@ -0,0 +1,17 @@
+// PR c++/50436
+
+template bool struct VI {};
+template typename T
+struct IP
+{
+  static const bool r = IPT::r;  // { dg-error template|expression }
+};
+template typename T struct V
+{
+  VIIPT::r vi;
+};
+struct X;
+struct Y
+{
+  VX v;
+};


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-06 Thread Ilya Enkovich
2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:08 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:37 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:20 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:02 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/4 Richard Biener richard.guent...@gmail.com:
 On Thu, Oct 31, 2013 at 10:11 AM, Ilya Enkovich 
 enkovich@gmail.com wrote:
 Hi,

 This patch adds support Pointer Bounds Checker into c-family and LTO 
 front-ends.  The main purpose of changes in front-end is to register 
 all statically initialized objects for checker.  We also need to 
 register such objects created by compiler.

 LTO is quite specific front-end.  For LTO Pointer Bounds Checker
 support macro just means it allows instrumented code as input because
 all instrumentation is performed before code is streamed out for LTO.

 But your patch doesn't even make use of the langhook...

 Use of langhook is in previous patch
 (http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02408.html). Here is
 it's part:

 diff --git a/gcc/toplev.c b/gcc/toplev.c
 index db269b7..0eaf081 100644
 --- a/gcc/toplev.c
 +++ b/gcc/toplev.c
 @@ -1282,6 +1282,15 @@ process_options (void)
and -ftree-loop-linear));
  #endif

 +  if (flag_check_pointer_bounds)
 +{
 +  if (targetm.chkp_bound_mode () == VOIDmode)
 +   error (-fcheck-pointers is not supported for this target);
 +
 +  if (!lang_hooks.chkp_supported)
 +   flag_check_pointer_bounds = 0;
 +}
 +
/* One region RA really helps to decrease the code size.  */
if (flag_ira_region == IRA_REGION_AUTODETECT)
  flag_ira_region

 If we try to use -fcheck-pointers -flto for some unsupported language,
 code will not be instrumented.

 What's the reason to have unsupported languages?

 For some languages (e.g. Java) Pointer Bounds Checker does not make
 sense at all. Others may require additional support in front-end. The
 primary target is C family where solved problem is more critical.

 What does break if you enable it for Java or other unsupported
 languages?  That is, if LTO is able to handle a mixed Java and
 C binary then why can Java alone not handle it?

 In such case checker will produce useless overhead in Java code.
 Resulting code will probably report some bound violations because Java
 FE may generate code which seems wrong for Pointer Bounds Checker.

 So it's only an issue that if you use it that it may trip over Java FE bugs?
 Not a good reason to have a langhook - you can use the existing
 post_options langhook for disallowing this?

The issue is that users do not get what expect. I do not want someone
having mixed codes get instrumentation for his Java/Fortran/Ada
functions which slows them down and does nothing useful. What is the
point to allow checks of pointer bounds for language with no pointers?

If I use post_options, I need to change hooks of languages that do not
care about checker to disable it. Now I have it disabled by default.
Using post_options will also require empty redefinition of
post_options in C languages with empty body which may be confusing.

If you do not like this hook, I can move all Pointer Bounds Checker
flags into c.opt and remove the hook. Should it be OK?

Thanks,
Ilya


 Thanks,
 Richard.

 Ilya


 Richard.

 Ilya

 Richard.

 Ilya


 Richard.

 Ilya


 You define CHKP as supported in LTO.  That means it has to be 
 supported
 for all languages and thus you should drop the langhook.

 Richard.

 Thanks,
 Ilya
 --

 gcc/

 2013-10-29  Ilya Enkovich  ilya.enkov...@intel.com

 * c/c-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * cp/cp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * objc/objc-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * objcp/objcp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * lto/lto-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * c/c-parser.c (c_parser_declaration_or_fndef): Register 
 statically
 initialized decls in Pointer Bounds Checker.
 * cp/decl.c (cp_finish_decl): Likewise.
 * gimplify.c (gimplify_init_constructor): Likewise.


 diff --git a/gcc/c/c-lang.c b/gcc/c/c-lang.c
 index 614c46d..a32bc6b 100644
 --- a/gcc/c/c-lang.c
 +++ b/gcc/c/c-lang.c
 @@ -43,6 +43,8 @@ enum c_language_kind c_language = clk_c;
  #define LANG_HOOKS_INIT c_objc_common_init
  #undef LANG_HOOKS_INIT_TS
  #define LANG_HOOKS_INIT_TS c_common_init_ts
 +#undef LANG_HOOKS_CHKP_SUPPORTED
 +#define LANG_HOOKS_CHKP_SUPPORTED true

  /* Each front end provides its own lang hook initializer.  */
  struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
 index 9ccae3b..65d83c8 100644
 --- a/gcc/c/c-parser.c
 +++ 

Re: [PATCH, SH] Implement builtin_strlen

2013-11-06 Thread Eric Botcazou
 thanks, applied together with the cleanup referenced earlier and a
 slight variable renaming (start_addr-curr_addr, end_addr-start_addr)
 for readability as obvious,

Minor nit: no gcc/ prefix in gcc/ChangeLog.

-- 
Eric Botcazou


Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Ilya Enkovich
2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

This example just demonstrates the issue. I may use some var's address
in comparison with the similar result. And value does not always allow
to determine bounds, because pointers with the same value may have
different bounds.


 Also sth weird is going on as your arg_bnd call gets [return slot 
 optimization].
 It seems that your POINTER_BOUNDS_TYPE is seen as aggregate somehow.

Target is responsible for having return slot optimization here. And
target expands all such calls. So, do not see the problem here.

Ilya


 Richard.

 Ilya

 Richard.



Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Bernd Schmidt
On 11/06/2013 10:31 AM, Richard Biener wrote:
 We decided to move to C++.  As part of a later discussion we decided
 to go with a single general dynamic-casting style, mimicing the real
 C++ variant which is dynamic_cast  ... .  Which resulted in
 is-a.h.
 
 So yes, we've decided to go C++ so we have to live with certain
 uglinesses of that decisions (and maybe over time those uglinesses
 will fade away and we get used to it and like it).
 
 Thus, there isn't another option besides using the is-a.h machinery
 and enabling and using RTTI.  Sticking to C for gimple doesn't seem
 to be consistent with the decision to move to C++.
 
 Oh, I'm not saying I'm a big fan of as_a / is_a or C++ in general
 as it plays out right now.  But well, we've had the discussion and
 had a decision.

Maybe we need to revisit it? As one of those who were not in favour of
the C++ move, can I ask you guys to step back for a moment and think
about - what do all of these changes buy us, exactly? Imagine the state
at the end, where everything is converted and supposedly the temporary
ugliness is gone, what have we gained over the code as it is now?

I still think all this effort is misdirected and distracts us from
making changes that improve gcc for its users.


Bernd



[PATCH][1/2] Get TREE_OVERFLOW somewhat under control

2013-11-06 Thread Richard Biener

The following patch makes initial steps of reducing the amount of
TREE_OVERFLOW constants in the GIMPLE IL (there should be zero such
constants).  Tree folding in the frontends (and also later
during optimization) introduces those so the following patches
the gimplifier to drop the flag from all constants in GENERIC.
The idea is that we can eventually verify no TREE_OVERFLOW is
set form the verifier (no, I won't arrive there with this series).

TREE_OVERFLOW in the GIMPLE IL seriously confuses VRP which uses
the flag for its own purposes (ick) and relies on fold-const
setting TREE_OVERFLOW on overflows (double ick).  So the patch
also replaces as many TREE_OVERFLOW checks in tree-vrp.c with
the appropriate is_overflow_infinity check.

This also introduces a helper to drop TREE_OVERFLOW.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2013-11-06  Richard Biener  rguent...@suse.de

* tree.c (drop_tree_overflow): New function.
* tree.h (drop_tree_overflow): Declare.
* gimplify.c (gimplify_expr): Drop TREE_OVERFLOW.
* tree-vrp.c (range_int_cst_singleton_p): Use
is_overflow_infinity instead of testing TREE_OVERFLOW.
(extract_range_from_assert): Likewise.
(zero_nonzero_bits_from_vr): Likewise.
(extract_range_basic): Likewise.
(register_new_assert_for): Use drop_tree_overflow.
(vrp_visit_phi_node): Likewise.

Index: gcc/tree.c
===
*** gcc/tree.c  (revision 204449)
--- gcc/tree.c  (working copy)
*** get_tree_code_name (enum tree_code code)
*** 12542,12545 
--- 12542,12564 
return tree_code_name[code];
  }
  
+ /* Drops the TREE_OVERFLOW flag from T.  */
+ 
+ tree
+ drop_tree_overflow (tree t)
+ {
+   gcc_checking_assert (TREE_OVERFLOW (t));
+ 
+   /* For tree codes with a sharing machinery re-build the result.  */
+   if (TREE_CODE (t) == INTEGER_CST)
+ return build_int_cst_wide (TREE_TYPE (t),
+  TREE_INT_CST_LOW (t), TREE_INT_CST_HIGH (t));
+ 
+   /* Otherwise, as all tcc_constants are possibly shared, copy the node
+  and drop the flag.  */
+   t = copy_node (t);
+   TREE_OVERFLOW (t) = 0;
+   return t;
+ }
+ 
  #include gt-tree.h
Index: gcc/tree.h
===
*** gcc/tree.h  (revision 204449)
--- gcc/tree.h  (working copy)
*** extern tree get_base_address (tree t);
*** 4800,4805 
--- 4800,4806 
  extern void mark_addressable (tree);
  
  /* In tree.c.  */
+ extern tree drop_tree_overflow (tree);
  extern int tree_map_base_eq (const void *, const void *);
  extern unsigned int tree_map_base_hash (const void *);
  extern int tree_map_base_marked_p (const void *);
Index: gcc/gimplify.c
===
*** gcc/gimplify.c  (revision 204449)
--- gcc/gimplify.c  (working copy)
*** gimplify_expr (tree *expr_p, gimple_seq
*** 7897,7902 
--- 7897,7906 
case STRING_CST:
case COMPLEX_CST:
case VECTOR_CST:
+ /* Drop the overflow flag on constants, we do not want
+that in the GIMPLE IL.  */
+ if (TREE_OVERFLOW_P (*expr_p))
+   *expr_p = drop_tree_overflow (*expr_p);
  ret = GS_ALL_DONE;
  break;
  
Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c  (revision 204449)
--- gcc/tree-vrp.c  (working copy)
*** static inline bool
*** 892,899 
  range_int_cst_singleton_p (value_range_t *vr)
  {
return (range_int_cst_p (vr)
!  !TREE_OVERFLOW (vr-min)
!  !TREE_OVERFLOW (vr-max)
   tree_int_cst_equal (vr-min, vr-max));
  }
  
--- 892,899 
  range_int_cst_singleton_p (value_range_t *vr)
  {
return (range_int_cst_p (vr)
!  !is_overflow_infinity (vr-min)
!  !is_overflow_infinity (vr-max)
   tree_int_cst_equal (vr-min, vr-max));
  }
  
*** extract_range_from_assert (value_range_t
*** 1741,1747 
 all should be optimized away above us.  */
if ((cond_code == LT_EXPR
compare_values (max, min) == 0)
! || (CONSTANT_CLASS_P (max)  TREE_OVERFLOW (max)))
set_value_range_to_varying (vr_p);
else
{
--- 1741,1747 
 all should be optimized away above us.  */
if ((cond_code == LT_EXPR
compare_values (max, min) == 0)
! || is_overflow_infinity (max))
set_value_range_to_varying (vr_p);
else
{
*** extract_range_from_assert (value_range_t
*** 1781,1787 
 all should be optimized away above us.  */
if ((cond_code == GT_EXPR
compare_values (min, max) == 0)
! || (CONSTANT_CLASS_P (min)  TREE_OVERFLOW (min)))
set_value_range_to_varying (vr_p);
else
{
--- 

Re: [PATCH GCC]Simplify address expression in IVOPT

2013-11-06 Thread Richard Biener
 ivopts_data *data,
   *var_present = true;
   fd_ivopts_data = data;
   walk_tree (addr, find_depends, depends_on, NULL);
-  return new_cost (target_spill_cost[data-speed], 0);
+  return force_expr_to_var_cost (addr, data-speed);
 }

this seems to be an unrelated change - there are other spill-cost
users in ivopts as well, not sure why one or the other variant is used.

For IVOPTs which is quite fragile with code generation quality it
helps to split up patches as much as possible to allow easy
bisection when problems arise on targets.

So, you can commit the get_shiftadd_cost change, re-post a fixed
conversion-handling patch and split the split_address_cost piece,
motivating it?

Thanks,
Richard.


 Thanks,
 bin

 3-ivopt-expr_cost-a-20131106:
 gcc/testsuite/ChangeLog
 2013-11-06  Bin Cheng  bin.ch...@arm.com

 * gcc.dg/tree-ssa/loop-2.c: Refine check condition.
 * gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto.
 * gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto.

 2013-11-06  Bin Cheng  bin.ch...@arm.com

 * tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions.
 * tree-affine.c (get_inner_reference_aff): Return base.
 * tree-affine.h (get_inner_reference_aff): Change prototype.

 3-ivopt-expr_cost-b-20131106:
 2013-11-06  Bin Cheng  bin.ch...@arm.com

 * tree-ssa-loop-ivopts.c (get_shiftadd_cost): Check equality
 using operand_equal_p.
 (force_expr_to_var_cost): Refactor the code.  Handle type
 conversion.
 (split_address_cost): Call force_expr_to_var_cost.


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 11:55 AM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:08 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:37 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:20 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 2:02 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/4 Richard Biener richard.guent...@gmail.com:
 On Thu, Oct 31, 2013 at 10:11 AM, Ilya Enkovich 
 enkovich@gmail.com wrote:
 Hi,

 This patch adds support Pointer Bounds Checker into c-family and 
 LTO front-ends.  The main purpose of changes in front-end is to 
 register all statically initialized objects for checker.  We also 
 need to register such objects created by compiler.

 LTO is quite specific front-end.  For LTO Pointer Bounds Checker
 support macro just means it allows instrumented code as input because
 all instrumentation is performed before code is streamed out for LTO.

 But your patch doesn't even make use of the langhook...

 Use of langhook is in previous patch
 (http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02408.html). Here is
 it's part:

 diff --git a/gcc/toplev.c b/gcc/toplev.c
 index db269b7..0eaf081 100644
 --- a/gcc/toplev.c
 +++ b/gcc/toplev.c
 @@ -1282,6 +1282,15 @@ process_options (void)
and -ftree-loop-linear));
  #endif

 +  if (flag_check_pointer_bounds)
 +{
 +  if (targetm.chkp_bound_mode () == VOIDmode)
 +   error (-fcheck-pointers is not supported for this target);
 +
 +  if (!lang_hooks.chkp_supported)
 +   flag_check_pointer_bounds = 0;
 +}
 +
/* One region RA really helps to decrease the code size.  */
if (flag_ira_region == IRA_REGION_AUTODETECT)
  flag_ira_region

 If we try to use -fcheck-pointers -flto for some unsupported language,
 code will not be instrumented.

 What's the reason to have unsupported languages?

 For some languages (e.g. Java) Pointer Bounds Checker does not make
 sense at all. Others may require additional support in front-end. The
 primary target is C family where solved problem is more critical.

 What does break if you enable it for Java or other unsupported
 languages?  That is, if LTO is able to handle a mixed Java and
 C binary then why can Java alone not handle it?

 In such case checker will produce useless overhead in Java code.
 Resulting code will probably report some bound violations because Java
 FE may generate code which seems wrong for Pointer Bounds Checker.

 So it's only an issue that if you use it that it may trip over Java FE bugs?
 Not a good reason to have a langhook - you can use the existing
 post_options langhook for disallowing this?

 The issue is that users do not get what expect. I do not want someone
 having mixed codes get instrumentation for his Java/Fortran/Ada
 functions which slows them down and does nothing useful. What is the
 point to allow checks of pointer bounds for language with no pointers?

 If I use post_options, I need to change hooks of languages that do not
 care about checker to disable it. Now I have it disabled by default.

No, you'd change the hooks to complain about the flag.

 Using post_options will also require empty redefinition of
 post_options in C languages with empty body which may be confusing.

 If you do not like this hook, I can move all Pointer Bounds Checker
 flags into c.opt and remove the hook. Should it be OK?

That works for me if for LTO it is enough to see the instrumented IL
to do the right thing, that is, no flag tests appear in middle-end code
[You can also simply add LTO to the list of supported languages of course,
still no flag checks hint at a proper design].

Richard.

 Thanks,
 Ilya


 Thanks,
 Richard.

 Ilya


 Richard.

 Ilya

 Richard.

 Ilya


 Richard.

 Ilya


 You define CHKP as supported in LTO.  That means it has to be 
 supported
 for all languages and thus you should drop the langhook.

 Richard.

 Thanks,
 Ilya
 --

 gcc/

 2013-10-29  Ilya Enkovich  ilya.enkov...@intel.com

 * c/c-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * cp/cp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * objc/objc-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * objcp/objcp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * lto/lto-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
 * c/c-parser.c (c_parser_declaration_or_fndef): Register 
 statically
 initialized decls in Pointer Bounds Checker.
 * cp/decl.c (cp_finish_decl): Likewise.
 * gimplify.c (gimplify_init_constructor): Likewise.


 diff --git a/gcc/c/c-lang.c b/gcc/c/c-lang.c
 index 614c46d..a32bc6b 100644
 --- a/gcc/c/c-lang.c
 +++ b/gcc/c/c-lang.c
 @@ -43,6 +43,8 @@ enum c_language_kind c_language = clk_c;
  #define 

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

Some vars address can be used for better bounds deriving though.
Well, it clearly shows it's a get me good results issue and not
a correctness issue.  And for good results you trade in missed
general optimizations.  That doesn't sound very good to me.

Richard.


 Also sth weird is going on as your arg_bnd call gets [return slot 
 optimization].
 It seems that your POINTER_BOUNDS_TYPE is seen as aggregate somehow.

 Target is responsible for having return slot optimization here. And
 target expands all such calls. So, do not see the problem here.

 Ilya


 Richard.

 Ilya

 Richard.



Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 12:02 PM, Bernd Schmidt ber...@codesourcery.com wrote:
 On 11/06/2013 10:31 AM, Richard Biener wrote:
 We decided to move to C++.  As part of a later discussion we decided
 to go with a single general dynamic-casting style, mimicing the real
 C++ variant which is dynamic_cast  ... .  Which resulted in
 is-a.h.

 So yes, we've decided to go C++ so we have to live with certain
 uglinesses of that decisions (and maybe over time those uglinesses
 will fade away and we get used to it and like it).

 Thus, there isn't another option besides using the is-a.h machinery
 and enabling and using RTTI.  Sticking to C for gimple doesn't seem
 to be consistent with the decision to move to C++.

 Oh, I'm not saying I'm a big fan of as_a / is_a or C++ in general
 as it plays out right now.  But well, we've had the discussion and
 had a decision.

 Maybe we need to revisit it? As one of those who were not in favour of
 the C++ move, can I ask you guys to step back for a moment and think
 about - what do all of these changes buy us, exactly? Imagine the state
 at the end, where everything is converted and supposedly the temporary
 ugliness is gone, what have we gained over the code as it is now?

as_a gains us less runtime checking and more static type checking
which is good.

 I still think all this effort is misdirected and distracts us from
 making changes that improve gcc for its users.

That I agree to.  Instead of fixing the less than optimal separation / boundary
between frontends and the middle-end, or fixing several other long-standing
issues with GCC we spend a lot of time refactoring things to be C++.
But that was kind of part of the decision (though I remember that we
mainly wanted to convert containters and isolated stuff, not gimple
or trees (I bet that'll be next)).

Of course I don't see contributors of changes that improve gcc for its users
now wasting their time with converting code to C++.  That conversion
may slow down those people, but only so much.  It'll get more interesting
with branch maintainance ...

Richard.


 Bernd



Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 12:31:00PM +0100, Richard Biener wrote:
  Maybe we need to revisit it? As one of those who were not in favour of
  the C++ move, can I ask you guys to step back for a moment and think
  about - what do all of these changes buy us, exactly? Imagine the state
  at the end, where everything is converted and supposedly the temporary
  ugliness is gone, what have we gained over the code as it is now?
 
 as_a gains us less runtime checking and more static type checking
 which is good.

But that really affects --enable-checking=yes builds (and only cases where
things aren't inlined).  If the price for that is uglier and less readable
code, then the price is just too high.

  I still think all this effort is misdirected and distracts us from
  making changes that improve gcc for its users.
 
 That I agree to.  Instead of fixing the less than optimal separation / 
 boundary
 between frontends and the middle-end, or fixing several other long-standing
 issues with GCC we spend a lot of time refactoring things to be C++.
 But that was kind of part of the decision (though I remember that we
 mainly wanted to convert containters and isolated stuff, not gimple
 or trees (I bet that'll be next)).
 
 Of course I don't see contributors of changes that improve gcc for its users
 now wasting their time with converting code to C++.  That conversion
 may slow down those people, but only so much.  It'll get more interesting
 with branch maintainance ...

If the changes are done at least with some for users useful goal (like
perhaps making gcc usable as a library for JITting etc.), then I can see
the benefit, but as_s uglification doesn't seem to be beneficial to
users at all, and IMNSHO much better would be to instead
spend time on what is beneficial to users (restrict, vectorizer
cost model, vectorizer improvements, better diagnostics (say range
locations), ...).

Jakub


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Eric Botcazou
 Maybe we need to revisit it? As one of those who were not in favour of
 the C++ move, can I ask you guys to step back for a moment and think
 about - what do all of these changes buy us, exactly? Imagine the state
 at the end, where everything is converted and supposedly the temporary
 ugliness is gone, what have we gained over the code as it is now?

Without going as far as revisiting the decision, I'd like to ask whether the 
impact of the recent changes on the speed of the compiler has been evaluated.
I didn't conduct real measurements but bootstrap times on mainline, which 
were slightly increasing as usual over the course of the year, started to 
increase at a faster pace over the last couple of months, at least for me.

-- 
Eric Botcazou


Re: libsanitizer merge from upstream r191666

2013-11-06 Thread Jakub Jelinek
On Tue, Nov 05, 2013 at 01:49:43PM -0800, Evgeniy Stepanov wrote:
 This way we can't test kernel interfaces that are not used in glibc,
 like linux aio.

So you just test what you can test.  Why do you need to intercept kernel
aio when hardly anything uses it?  Also, what is the reason why say
stat* interceptors couldn't just use the glibc stat* (using dlsym
obtained pointers) and just do some work before and/or after)?

 Most of our team is travelling, and we won't be able to submit a
 proper fix until next week. Is this blocking anyone? Are you OK with
 disabling broken parts with #ifdefs for now? Most of this is used in
 the interceptors, and they can be disabled on an individual basis.

Bootstrap failures are certainly blocking lots of people, so if you could
disable known problematic parts with #ifdefs for now and resolve later, it
would be great.

Jakub


Re: [PATCH, SH] Implement builtin_strlen

2013-11-06 Thread Eric Botcazou
 ok I'll didn't notice this.,  do you mind that I cleanup the other
 entries in the same time ?

I already cleaned up some, but go ahead.

-- 
Eric Botcazou


Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Ilya Enkovich
2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

 Some vars address can be used for better bounds deriving though.
 Well, it clearly shows it's a get me good results issue and not
 a correctness issue.  And for good results you trade in missed
 general optimizations.  That doesn't sound very good to me.

That is definitely a correctness issue.  Compiler cannot assume bounds
by pointer value ignoring bounds passed to the function.
SSA_NAME_OCCURS_IN_ABNORMAL_PHI usage results in missed optimizations.
 My original patch does not.

Ilya


 Richard.


 Also sth weird is going on as your arg_bnd call gets [return slot 
 optimization].
 It seems that your POINTER_BOUNDS_TYPE is seen as aggregate somehow.

 Target is responsible for having return slot optimization here. And
 target expands all such calls. So, do not see the problem here.

 Ilya


 Richard.

 Ilya

 Richard.



Re: [RFA][PATCH] Minor fix to aliasing machinery

2013-11-06 Thread Marc Glisse

[Discussion started in http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02472.html ]

On Wed, 30 Oct 2013, Marc Glisse wrote:


On Wed, 30 Oct 2013, Richard Biener wrote:


Btw, get_addr_base_and_unit_offset may also return an offsetted
MEM_REF (from MEM [p_3, 17] for example).  As we are interested in
pointers this could be handled by not requiring a memory reference
but extracting the base address and offset, covering more cases.


I tried the attached patch, and it almost worked, except for one fortran 
testcase (widechar_intrinsics_10.f90):


Now that ao_ref_init_from_ptr_and_size has been fixed, the following patch 
passes bootstrap+testsuite (default languages) on 
x86_64-unknown-linux-gnu.


2013-11-06  Marc Glisse  marc.gli...@inria.fr
Jeff Law  l...@redhat.com

gcc/
* tree-ssa-alias.c (stmt_kills_ref_p_1): Use
ao_ref_init_from_ptr_and_size for builtins.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-27.c: New testcase.

--
Marc GlisseIndex: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options -O1 -fdump-tree-optimized } */
+
+void f (long *p) {
+  *p = 42;
+  p[4] = 42;
+  __builtin_memset (p, 0, 100);
+}
+
+/* { dg-final { scan-tree-dump-not = 42 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204453)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -2001,23 +2001,24 @@ stmt_may_clobber_ref_p (gimple stmt, tre
   return stmt_may_clobber_ref_p_1 (stmt, r);
 }
 
 /* If STMT kills the memory reference REF return true, otherwise
return false.  */
 
 static bool
 stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
 {
   /* For a must-alias check we need to be able to constrain
- the access properly.  */
-  ao_ref_base (ref);
-  if (ref-max_size == -1)
+ the access properly.
+ FIXME: except for BUILTIN_FREE.  */
+  if (!ao_ref_base (ref)
+  || ref-max_size == -1)
 return false;
 
   if (gimple_has_lhs (stmt)
TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME
   /* The assignment is not necessarily carried out if it can throw
 and we can catch it in the current function where we could inspect
 the previous value.
 ???  We only need to care about the RHS throwing.  For aggregate
 assignments or similar calls and non-call exceptions the LHS
 might throw as well.  */
@@ -2090,37 +2091,47 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref
  case BUILT_IN_MEMPCPY:
  case BUILT_IN_MEMMOVE:
  case BUILT_IN_MEMSET:
  case BUILT_IN_MEMCPY_CHK:
  case BUILT_IN_MEMPCPY_CHK:
  case BUILT_IN_MEMMOVE_CHK:
  case BUILT_IN_MEMSET_CHK:
{
  tree dest = gimple_call_arg (stmt, 0);
  tree len = gimple_call_arg (stmt, 2);
- tree base = NULL_TREE;
- HOST_WIDE_INT offset = 0;
+ tree rbase = ref-base;
+ HOST_WIDE_INT roffset = ref-offset;
  if (!host_integerp (len, 0))
return false;
- if (TREE_CODE (dest) == ADDR_EXPR)
-   base = get_addr_base_and_unit_offset (TREE_OPERAND (dest, 0),
- offset);
- else if (TREE_CODE (dest) == SSA_NAME)
-   base = dest;
- if (base
-  base == ao_ref_base (ref))
+ ao_ref dref;
+ ao_ref_init_from_ptr_and_size (dref, dest, len);
+ tree base = ao_ref_base (dref);
+ HOST_WIDE_INT offset = dref.offset;
+ if (!base || dref.size == -1)
+   return false;
+ if (TREE_CODE (base) == MEM_REF)
+   {
+ if (TREE_CODE (rbase) != MEM_REF)
+   return false;
+ // Compare pointers.
+ offset += BITS_PER_UNIT
+   * TREE_INT_CST_LOW (TREE_OPERAND (base, 1));
+ roffset += BITS_PER_UNIT
+* TREE_INT_CST_LOW (TREE_OPERAND (rbase, 1));
+ base = TREE_OPERAND (base, 0);
+ rbase = TREE_OPERAND (rbase, 0);
+   }
+ if (base == rbase)
{
- HOST_WIDE_INT size = TREE_INT_CST_LOW (len);
- if (offset = ref-offset / BITS_PER_UNIT
-  

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Bernd Schmidt
On 11/06/2013 12:31 PM, Richard Biener wrote:
 That I agree to.  Instead of fixing the less than optimal separation / 
 boundary
 between frontends and the middle-end, or fixing several other long-standing
 issues with GCC we spend a lot of time refactoring things to be C++.
 But that was kind of part of the decision (though I remember that we
 mainly wanted to convert containters and isolated stuff, not gimple
 or trees (I bet that'll be next)).

What I seem to remember is being told that we'd use C++ mostly for new
code and not engage in a wholesale rewrite of the compiler.

Instead things happened exactly as predicted, we're getting immense
amounts of churn for little real gain, and in some cases the C++ changes
are not even improvements and look like really strange examples of C++
code. I'll offer as evidence the pass manager with its bizarre has_gate
and has_execute variables, member functions for gate and execute
wrapping non-member functions, and a separate pass_data outside of its
class (not even as a static member).

Now that the question of what would happen after a C++ conversion is no
longer theoretical but has been answered, can we maybe just stop and
think for a while whether this is really the way we should be going?


Bernd



Re: mismatched decls w/ both builtin and explicit decl

2013-11-06 Thread Joseph S. Myers
On Tue, 5 Nov 2013, DJ Delorie wrote:

 
 Consider this source:
 
   extern char *index(const char *,int);
   static int index;
 
 index is a builtin as well, but because it's a builtin gcc skips the
 previous declaration was here... despite having *a* previous decl it
 could complain about.  Note that newlib provides decls for many
 builtins (the decl above is from newlib), so this could be a common
 case.
 
 So I added a check for !C_DECL_DECLARED_BUILTIN (decl) which seems to
 specifically cover this case.  Ok to apply?

Please send a patch that adds a testcase to the testsuite to show the 
diagnostics you get after the patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


[SH] arith_reg_operand vs. fp_arith_reg_operand

2013-11-06 Thread Oleg Endo
Hello,

The attached patch addresses one thing which I mentioned in PR 22553
comment #27.  Some of the FP insns take arith_reg_operand, which seems
a bit off.  I haven't checked whether it fixes the original problem of
PR 22553, but anyway it's probably better to use fp_arith_reg_operand
for FP insns.

Tested on rev 204263 with
make -k check RUNTESTFLAGS=--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,-m4-single/
-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}

and no new failures.
OK for trunk?

Cheers,
Oleg

gcc/ChangeLog:

* config/sh/sh.md (addsf3, divsf3, divsf3_i, rsqrtsf2, 
cmpgtdf_t, cmpeqdf_t, *ieee_ccmpeqdf_t, negdf2, sqrtdf2, 
absdf2): Use fp_arith_reg_operand instead of arith_reg_operand.
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 204263)
+++ gcc/config/sh/sh.md	(working copy)
@@ -12187,9 +12187,9 @@
   [(set_attr type fpscr_toggle)])
 
 (define_expand addsf3
-  [(set (match_operand:SF 0 arith_reg_operand )
-	(plus:SF (match_operand:SF 1 arith_reg_operand )
-		 (match_operand:SF 2 arith_reg_operand )))]
+  [(set (match_operand:SF 0 fp_arith_reg_operand)
+	(plus:SF (match_operand:SF 1 fp_arith_reg_operand)
+		 (match_operand:SF 2 fp_arith_reg_operand)))]
   TARGET_SH2E || TARGET_SHMEDIA_FPU
 {
   if (TARGET_SH2E)
@@ -12436,9 +12436,9 @@
   [(set_attr type fparith_media)])
 
 (define_expand divsf3
-  [(set (match_operand:SF 0 arith_reg_operand )
-	(div:SF (match_operand:SF 1 arith_reg_operand )
-		(match_operand:SF 2 arith_reg_operand )))]
+  [(set (match_operand:SF 0 fp_arith_reg_operand)
+	(div:SF (match_operand:SF 1 fp_arith_reg_operand)
+		(match_operand:SF 2 fp_arith_reg_operand)))]
   TARGET_SH2E || TARGET_SHMEDIA_FPU
 {
   if (TARGET_SH2E)
@@ -12457,9 +12457,9 @@
   [(set_attr type fdiv_media)])
 
 (define_insn divsf3_i
-  [(set (match_operand:SF 0 arith_reg_dest =f)
-	(div:SF (match_operand:SF 1 arith_reg_operand 0)
-		 (match_operand:SF 2 arith_reg_operand f)))
+  [(set (match_operand:SF 0 fp_arith_reg_dest =f)
+	(div:SF (match_operand:SF 1 fp_arith_reg_operand 0)
+		 (match_operand:SF 2 fp_arith_reg_operand f)))
(use (match_operand:PSI 3 fpscr_operand c))]
   TARGET_SH2E
   fdiv	%2,%0
@@ -12743,9 +12743,9 @@
(set_attr fp_mode single)])
 
 (define_insn rsqrtsf2
-  [(set (match_operand:SF 0 register_operand =f)
+  [(set (match_operand:SF 0 fp_arith_reg_operand =f)
 	(div:SF (match_operand:SF 1 immediate_operand i)
-		(sqrt:SF (match_operand:SF 2 register_operand 0
+		(sqrt:SF (match_operand:SF 2 fp_arith_reg_operand 0
(use (match_operand:PSI 3 fpscr_operand c))]
   TARGET_FPU_ANY  TARGET_FSRRA
 operands[1] == CONST1_RTX (SFmode)
@@ -13059,8 +13059,8 @@
 
 (define_insn cmpgtdf_t
   [(set (reg:SI T_REG)
-	(gt:SI (match_operand:DF 0 arith_reg_operand f)
-	   (match_operand:DF 1 arith_reg_operand f)))
+	(gt:SI (match_operand:DF 0 fp_arith_reg_operand f)
+	   (match_operand:DF 1 fp_arith_reg_operand f)))
(use (match_operand:PSI 2 fpscr_operand c))]
   (TARGET_SH4 || TARGET_SH2A_DOUBLE)
   fcmp/gt	%1,%0
@@ -13069,8 +13069,8 @@
 
 (define_insn cmpeqdf_t
   [(set (reg:SI T_REG)
-	(eq:SI (match_operand:DF 0 arith_reg_operand f)
-	   (match_operand:DF 1 arith_reg_operand f)))
+	(eq:SI (match_operand:DF 0 fp_arith_reg_operand f)
+	   (match_operand:DF 1 fp_arith_reg_operand f)))
(use (match_operand:PSI 2 fpscr_operand c))]
   (TARGET_SH4 || TARGET_SH2A_DOUBLE)
   fcmp/eq	%1,%0
@@ -13080,8 +13080,8 @@
 (define_insn *ieee_ccmpeqdf_t
   [(set (reg:SI T_REG)
 	(ior:SI (reg:SI T_REG)
-		(eq:SI (match_operand:DF 0 arith_reg_operand f)
-		   (match_operand:DF 1 arith_reg_operand f
+		(eq:SI (match_operand:DF 0 fp_arith_reg_operand f)
+		   (match_operand:DF 1 fp_arith_reg_operand f
(use (match_operand:PSI 2 fpscr_operand c))]
   TARGET_IEEE  (TARGET_SH4 || TARGET_SH2A_DOUBLE)
 {
@@ -13139,10 +13139,9 @@
   DONE;
 })
 
-
 (define_expand negdf2
-  [(set (match_operand:DF 0 arith_reg_operand )
-	(neg:DF (match_operand:DF 1 arith_reg_operand )))]
+  [(set (match_operand:DF 0 fp_arith_reg_operand)
+	(neg:DF (match_operand:DF 1 fp_arith_reg_operand)))]
   (TARGET_SH4 || TARGET_SH2A_DOUBLE) || TARGET_SHMEDIA_FPU
 {
   if (TARGET_SH4 || TARGET_SH2A_DOUBLE)
@@ -13169,8 +13168,8 @@
(set_attr fp_mode double)])
 
 (define_expand sqrtdf2
-  [(set (match_operand:DF 0 arith_reg_operand )
-	(sqrt:DF (match_operand:DF 1 arith_reg_operand )))]
+  [(set (match_operand:DF 0 fp_arith_reg_operand)
+	(sqrt:DF (match_operand:DF 1 fp_arith_reg_operand)))]
   (TARGET_SH4 || TARGET_SH2A_DOUBLE) || TARGET_SHMEDIA_FPU
 {
   if (TARGET_SH4 || TARGET_SH2A_DOUBLE)
@@ -13197,8 +13196,8 @@
(set_attr fp_mode double)])
 
 (define_expand absdf2
-  [(set (match_operand:DF 0 arith_reg_operand )
-	(abs:DF (match_operand:DF 1 arith_reg_operand )))]
+  [(set (match_operand:DF 0 fp_arith_reg_operand)
+	(abs:DF (match_operand:DF 

Re: [C] Fix PR57258: unused variable warning is emitted for volatile variables

2013-11-06 Thread Joseph S. Myers
On Wed, 6 Nov 2013, Mingjie Xing wrote:

 * c/c-decl.c (pop_scope): Skip volatile variables while emit
 warnings for unused variables.

c/ has its own ChangeLog, so no c/ in the ChangeLog entries.

This patch doesn't include a testsuite addition.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Patch Ada/build] deal with some cross/native cross issues

2013-11-06 Thread Eric Botcazou
 I've been trying to improve the building and testing of Darwin for crosses
 and native crosses.
 
 This has thrown up a few small glitches in the Ada build stuff (that would
 seem to apply to any build-host-nativeX scenario, not just darwin).  I
 would imagine it would be beneficial to resolve these, since Ada requires
 Ada - a native cross seems like the only realistic way onto a new target.

Thanks for working on this.

 1. xgnatugn needs to be run on the build system, so needs to be built with
 the build system's gnatmake. I haven't put a canonical prefix on this since
 this doesn't appear to be done elsewhere. Defined as GNATMAKE_FOR_BUILD and
 passed to sub-processes.

Why do you need to pass it to ADA_TOOLS_FLAGS_TO_PASS though?  Just replace 
$(GNATMAKE) with gnatmake.

 2. Some builds might need to pass LDFLAGS to the gnat* builds.  Appended
 LDFLAGS to GCC_LINK.  Passed on in gnattools/Make.

OK.

 3. In gnattools, the RTS dir must be for the host and not for the build; 
 This actually only showed up when I tried a cross from a 64bit pointer
 machine to a 32bit pointer one (i.e it is easy for it to go unnoticed).

OK, but don't you need to do the same for gnatmake/gnatbind/gnatlink here?
See gcc-interface/Make-lang.in, line 171 and below, for similar code.

-- 
Eric Botcazou


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 02:00:56PM +0100, Richard Biener wrote:
 Well, what else, besides as_a or keeping the current
 global accessor functions would you propose?

I'd prefer to keep the current accessors.  Whether there is a class
hierarchy or we keep unions doesn't affect the code too much (unless
it makes debugging experience harder, as numerous past C++ification changes
have done :( ).

Jakub


Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 12:59 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot 
 optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

 Some vars address can be used for better bounds deriving though.
 Well, it clearly shows it's a get me good results issue and not
 a correctness issue.  And for good results you trade in missed
 general optimizations.  That doesn't sound very good to me.

 That is definitely a correctness issue.

Then please add a runtime testcase that fails before and passes after
your patch.

 Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

But it can certainly drop bounds to unknown?  You certainly cannot
rely on no such transform taking place.  Maybe this just shows that
your refering to the parameter is done wrong and shouldn't have
used the SSA name default def.  What do you do for parameters
that have their address taken btw?  You'll get

foo (void *p)
{
   p_2 = *p;
   __builtin_ia32_arg_bnd (p_2);

which isn't desired?

Hmm, unfortunately on x86_64 I get

./cc1 -quiet t.c -fcheck-pointer-bounds
t.c:1:0: error: -fcheck-pointers is not supported for this target
 extern int bar1 (int *p);
 ^

trying a simple testcase.  Grepping shows me no target supports
chkp_bound_mode ..?  What's this stuff in our tree that has no way
of testing it ... :/

Richard.

 SSA_NAME_OCCURS_IN_ABNORMAL_PHI usage results in missed optimizations.
  My original patch does not.

 Ilya


 Richard.


 Also sth weird is going on as your arg_bnd call gets [return slot 
 optimization].
 It seems that your POINTER_BOUNDS_TYPE is seen as aggregate somehow.

 Target is responsible for having return slot optimization here. And
 target expands all such calls. So, do not see the problem here.

 Ilya


 Richard.

 Ilya

 Richard.



Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Joseph S. Myers
On Wed, 6 Nov 2013, Jakub Jelinek wrote:

 On Wed, Nov 06, 2013 at 12:31:00PM +0100, Richard Biener wrote:
   Maybe we need to revisit it? As one of those who were not in favour of
   the C++ move, can I ask you guys to step back for a moment and think
   about - what do all of these changes buy us, exactly? Imagine the state
   at the end, where everything is converted and supposedly the temporary
   ugliness is gone, what have we gained over the code as it is now?
  
  as_a gains us less runtime checking and more static type checking
  which is good.
 
 But that really affects --enable-checking=yes builds (and only cases where
 things aren't inlined).  If the price for that is uglier and less readable
 code, then the price is just too high.

I see static type checking as being about preventing certain sorts of bugs 
getting in the compiler at all, rather than about saving time on runtime 
checks for --enable-checking=yes.

I don't have advice on the gimple case.  But certainly trees and RTL could 
both do with more static typing.  There, I'm thinking of different types 
of objects (types, expressions, ...) logically being completely separate 
static types, not inheriting from a common tree base class at all.  But 
if inheritance and as_a can be used as intermediate steps to allow an 
incremental transition to completely separate static types, where 
otherwise the whole source tree would need converting at once, that seems 
reasonable to me. (I have no comments about what things should look like 
in cases where completely separate static types don't make sense but there 
is still some sort of inheritance structure.)

-- 
Joseph S. Myers
jos...@codesourcery.com


[Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Tejas Belagod


Hi,

The attached patch eliminates moves of the form

set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

i.e. eliminates lower lane moves between src and dst where src and dst are the 
same register and this causes rtl to instead use the destination register in the 
required mode.


Also, if my understanding of Big-Endian is correct, this should be safe for 
big-endian targets as well.


I've bootstrapped this on x64_64, regressed on aarch64-none-elf, 
aarch64_be-none-elf.


OK for trunk?

Thanks,
Tejas Belagod
ARM.

2013-11-06  Tejas Belagod  tejas.bela...@arm.com

gcc/
* rtlanal.c (set_noop_p): Return nonzero in case of redundant vec_select
for same src and dst.diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 9769b69..3e434cd 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1180,6 +1180,25 @@ set_noop_p (const_rtx set)
   dst = SUBREG_REG (dst);
 }
 
+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the same in
+ either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+   REG_P (XEXP (src, 0))  REG_P (dst)
+   REGNO (XEXP (src, 0)) == REGNO (dst))
+{
+  rtx par = XEXP (src, 1);
+  int i;
+
+  for (i = 0; i  XVECLEN (par, 0); i++)
+   {
+ rtx tem = XVECEXP (par, 0, i);
+ if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+   return 0;
+   }
+  return 1;
+}
+
   return (REG_P (src)  REG_P (dst)
   REGNO (src) == REGNO (dst));
 }

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Ilya Enkovich
2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:59 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in 
 tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot 
 optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

 Some vars address can be used for better bounds deriving though.
 Well, it clearly shows it's a get me good results issue and not
 a correctness issue.  And for good results you trade in missed
 general optimizations.  That doesn't sound very good to me.

 That is definitely a correctness issue.

 Then please add a runtime testcase that fails before and passes after
 your patch.

 Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

 But it can certainly drop bounds to unknown?  You certainly cannot
 rely on no such transform taking place.  Maybe this just shows that
 your refering to the parameter is done wrong and shouldn't have
 used the SSA name default def.  What do you do for parameters
 that have their address taken btw?  You'll get

 foo (void *p)
 {
p_2 = *p;
__builtin_ia32_arg_bnd (p_2);

 which isn't desired?

In this case you just make a load and bounds for p_2 are obtained
using bndldx call. I would be glad to use better way to refer to the
argument but I do not see one.


 Hmm, unfortunately on x86_64 I get

 ./cc1 -quiet t.c -fcheck-pointer-bounds
 t.c:1:0: error: -fcheck-pointers is not supported for this target
  extern int bar1 (int *p);
  ^

 trying a simple testcase.  Grepping shows me no target supports
 chkp_bound_mode ..?  What's this stuff in our tree that has no way
 of testing it ... :/

You should pass -mmpx to enable it on the target. Current version in
the mpx tree should build most of tests correctly but it is not
updated to the current state.

Ilya


 Richard.

 SSA_NAME_OCCURS_IN_ABNORMAL_PHI usage results in missed optimizations.
  My original patch does not.

 Ilya


 Richard.


 Also sth weird is going on as your arg_bnd call gets [return slot 
 optimization].
 It seems that your POINTER_BOUNDS_TYPE is seen as aggregate somehow.

 Target is responsible for having return slot optimization here. And
 target expands all such calls. So, do not see the problem here.

 Ilya


 Richard.

 Ilya

 Richard.



Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 2:31 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:59 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in 
 tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot 
 optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

 Some vars address can be used for better bounds deriving though.
 Well, it clearly shows it's a get me good results issue and not
 a correctness issue.  And for good results you trade in missed
 general optimizations.  That doesn't sound very good to me.

 That is definitely a correctness issue.

 Then please add a runtime testcase that fails before and passes after
 your patch.

 Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

 But it can certainly drop bounds to unknown?  You certainly cannot
 rely on no such transform taking place.  Maybe this just shows that
 your refering to the parameter is done wrong and shouldn't have
 used the SSA name default def.  What do you do for parameters
 that have their address taken btw?  You'll get

 foo (void *p)
 {
p_2 = *p;
__builtin_ia32_arg_bnd (p_2);

 which isn't desired?

 In this case you just make a load and bounds for p_2 are obtained
 using bndldx call. I would be glad to use better way to refer to the
 argument but I do not see one.

Err, my mistake - I meant

foo (void *p)
{
  p_2 = p;
  __builtin_ia32_arg_bnd (p_2);


 Hmm, unfortunately on x86_64 I get

 ./cc1 -quiet t.c -fcheck-pointer-bounds
 t.c:1:0: error: -fcheck-pointers is not supported for this target
  extern int bar1 (int *p);
  ^

 trying a simple testcase.  Grepping shows me no target supports
 chkp_bound_mode ..?  What's this stuff in our tree that has no way
 of testing it ... :/

 You should pass -mmpx to enable it on the target. Current version in
 the mpx tree should build most of tests correctly but it is not
 updated to the current state.

 ./cc1 -quiet t.c -fcheck-pointer-bounds -mmpx
t.c:1:0: error: -fcheck-pointers is not supported for this target
 extern int bar1 (int *p);
 ^

hmm, maybe it's configure disabled somehow (this is just my dev tree).
But I still see no chkp_bound_mode implementation.

Note that POINTER_BOUNDS_TYPE shouldn't have been a new
tree code either but should be a RECORD_TYPE.  See how
va_list doesn't use a new TREE_CODE either.

Richard.

 Ilya


 Richard.

 SSA_NAME_OCCURS_IN_ABNORMAL_PHI usage results in missed optimizations.
  My original patch does not.

 Ilya


 Richard.


 Also sth weird is going on as your arg_bnd call gets [return slot 
 

Re: [RFA][PATCH] Minor fix to aliasing machinery

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 1:19 PM, Marc Glisse marc.gli...@inria.fr wrote:
 [Discussion started in
 http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02472.html ]


 On Wed, 30 Oct 2013, Marc Glisse wrote:

 On Wed, 30 Oct 2013, Richard Biener wrote:

 Btw, get_addr_base_and_unit_offset may also return an offsetted
 MEM_REF (from MEM [p_3, 17] for example).  As we are interested in
 pointers this could be handled by not requiring a memory reference
 but extracting the base address and offset, covering more cases.


 I tried the attached patch, and it almost worked, except for one fortran
 testcase (widechar_intrinsics_10.f90):


 Now that ao_ref_init_from_ptr_and_size has been fixed, the following patch
 passes bootstrap+testsuite (default languages) on x86_64-unknown-linux-gnu.

 2013-11-06  Marc Glisse  marc.gli...@inria.fr
 Jeff Law  l...@redhat.com

 gcc/
 * tree-ssa-alias.c (stmt_kills_ref_p_1): Use
 ao_ref_init_from_ptr_and_size for builtins.

 gcc/testsuite/
 * gcc.dg/tree-ssa/alias-27.c: New testcase.

 --
 Marc Glisse
 Index: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(working copy)
 @@ -0,0 +1,11 @@
 +/* { dg-do compile } */
 +/* { dg-options -O1 -fdump-tree-optimized } */
 +
 +void f (long *p) {
 +  *p = 42;
 +  p[4] = 42;
 +  __builtin_memset (p, 0, 100);
 +}
 +
 +/* { dg-final { scan-tree-dump-not = 42 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */

 Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
 ___
 Added: svn:keywords
 ## -0,0 +1 ##
 +Author Date Id Revision URL
 \ No newline at end of property
 Added: svn:eol-style
 ## -0,0 +1 ##
 +native
 \ No newline at end of property
 Index: gcc/tree-ssa-alias.c
 ===
 --- gcc/tree-ssa-alias.c(revision 204453)
 +++ gcc/tree-ssa-alias.c(working copy)
 @@ -2001,23 +2001,24 @@ stmt_may_clobber_ref_p (gimple stmt, tre
return stmt_may_clobber_ref_p_1 (stmt, r);
  }

  /* If STMT kills the memory reference REF return true, otherwise
 return false.  */

  static bool
  stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
  {
/* For a must-alias check we need to be able to constrain
 - the access properly.  */
 -  ao_ref_base (ref);
 -  if (ref-max_size == -1)
 + the access properly.
 + FIXME: except for BUILTIN_FREE.  */
 +  if (!ao_ref_base (ref)
 +  || ref-max_size == -1)
  return false;

if (gimple_has_lhs (stmt)
 TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME
/* The assignment is not necessarily carried out if it can throw
  and we can catch it in the current function where we could inspect
  the previous value.
  ???  We only need to care about the RHS throwing.  For aggregate
  assignments or similar calls and non-call exceptions the LHS
  might throw as well.  */
 @@ -2090,37 +2091,47 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref
   case BUILT_IN_MEMPCPY:
   case BUILT_IN_MEMMOVE:
   case BUILT_IN_MEMSET:
   case BUILT_IN_MEMCPY_CHK:
   case BUILT_IN_MEMPCPY_CHK:
   case BUILT_IN_MEMMOVE_CHK:
   case BUILT_IN_MEMSET_CHK:
 {
   tree dest = gimple_call_arg (stmt, 0);
   tree len = gimple_call_arg (stmt, 2);
 - tree base = NULL_TREE;
 - HOST_WIDE_INT offset = 0;
 + tree rbase = ref-base;
 + HOST_WIDE_INT roffset = ref-offset;
   if (!host_integerp (len, 0))
 return false;
 - if (TREE_CODE (dest) == ADDR_EXPR)
 -   base = get_addr_base_and_unit_offset (TREE_OPERAND (dest,
 0),
 - offset);
 - else if (TREE_CODE (dest) == SSA_NAME)
 -   base = dest;
 - if (base
 -  base == ao_ref_base (ref))
 + ao_ref dref;
 + ao_ref_init_from_ptr_and_size (dref, dest, len);

What I dislike about this is that it can end up building a new tree node.
Not sure if that should block the patch though as it clearly allows to
simplify the code ...

 + tree base = ao_ref_base (dref);
 + HOST_WIDE_INT offset = dref.offset;
 + if (!base || dref.size == -1)
 +   return false;
 + if (TREE_CODE (base) == MEM_REF)
 +   {
 + if (TREE_CODE (rbase) != MEM_REF)

why's that?  I think that just processing both bases separately
would work as well.

 +   return false;
 + // Compare pointers.
 + offset += BITS_PER_UNIT
 +   * TREE_INT_CST_LOW (TREE_OPERAND (base, 

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 2:24 PM, Tejas Belagod tbela...@arm.com wrote:

 Hi,

 The attached patch eliminates moves of the form

 set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

 i.e. eliminates lower lane moves between src and dst where src and dst are
 the same register and this causes rtl to instead use the destination
 register in the required mode.

 Also, if my understanding of Big-Endian is correct, this should be safe for
 big-endian targets as well.

 I've bootstrapped this on x64_64, regressed on aarch64-none-elf,
 aarch64_be-none-elf.

 OK for trunk?

It looks good to me (but I'm also wondering about bigendian).  Bill?

Can you add a testcase where this has an effect on code generation?

Thanks,
Richard.

 Thanks,
 Tejas Belagod
 ARM.

 2013-11-06  Tejas Belagod  tejas.bela...@arm.com

 gcc/
 * rtlanal.c (set_noop_p): Return nonzero in case of redundant
 vec_select
 for same src and dst.
 diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
 index 9769b69..3e434cd 100644
 --- a/gcc/rtlanal.c
 +++ b/gcc/rtlanal.c
 @@ -1180,6 +1180,25 @@ set_noop_p (const_rtx set)
dst = SUBREG_REG (dst);
  }

 +  /* This is big-endian-safe because the elements are kept in target
 + memory order.  So, for eg. PARALLEL element value of 2 is the same in
 + either endian-ness.  */
 +  if (GET_CODE (src) == VEC_SELECT
 +   REG_P (XEXP (src, 0))  REG_P (dst)
 +   REGNO (XEXP (src, 0)) == REGNO (dst))
 +{
 +  rtx par = XEXP (src, 1);
 +  int i;
 +
 +  for (i = 0; i  XVECLEN (par, 0); i++)
 +   {
 + rtx tem = XVECEXP (par, 0, i);
 + if (!CONST_INT_P (tem) || INTVAL (tem) != i)
 +   return 0;
 +   }
 +  return 1;
 +}
 +
return (REG_P (src)  REG_P (dst)
REGNO (src) == REGNO (dst));
  }


Re: [SH] arith_reg_operand vs. fp_arith_reg_operand

2013-11-06 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 The attached patch addresses one thing which I mentioned in PR 22553
 comment #27.  Some of the FP insns take arith_reg_operand, which seems
 a bit off.  I haven't checked whether it fixes the original problem of
 PR 22553, but anyway it's probably better to use fp_arith_reg_operand
 for FP insns.
 
 Tested on rev 204263 with
 make -k check RUNTESTFLAGS=--target_board=sh-sim
 \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb,-m4-single/
 -ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}
 
 and no new failures.
 OK for trunk?

OK.

Regards,
kaz


Re: [RFA][PATCH] Minor fix to aliasing machinery

2013-11-06 Thread Marc Glisse

On Wed, 6 Nov 2013, Richard Biener wrote:


On Wed, Nov 6, 2013 at 1:19 PM, Marc Glisse marc.gli...@inria.fr wrote:

[Discussion started in
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02472.html ]


On Wed, 30 Oct 2013, Marc Glisse wrote:


On Wed, 30 Oct 2013, Richard Biener wrote:


Btw, get_addr_base_and_unit_offset may also return an offsetted
MEM_REF (from MEM [p_3, 17] for example).  As we are interested in
pointers this could be handled by not requiring a memory reference
but extracting the base address and offset, covering more cases.



I tried the attached patch, and it almost worked, except for one fortran
testcase (widechar_intrinsics_10.f90):



Now that ao_ref_init_from_ptr_and_size has been fixed, the following patch
passes bootstrap+testsuite (default languages) on x86_64-unknown-linux-gnu.

2013-11-06  Marc Glisse  marc.gli...@inria.fr
Jeff Law  l...@redhat.com

gcc/
* tree-ssa-alias.c (stmt_kills_ref_p_1): Use
ao_ref_init_from_ptr_and_size for builtins.

gcc/testsuite/
* gcc.dg/tree-ssa/alias-27.c: New testcase.

--
Marc Glisse
Index: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options -O1 -fdump-tree-optimized } */
+
+void f (long *p) {
+  *p = 42;
+  p[4] = 42;
+  __builtin_memset (p, 0, 100);
+}
+
+/* { dg-final { scan-tree-dump-not = 42 optimized } } */
+/* { dg-final { cleanup-tree-dump optimized } } */

Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c(revision 204453)
+++ gcc/tree-ssa-alias.c(working copy)
@@ -2001,23 +2001,24 @@ stmt_may_clobber_ref_p (gimple stmt, tre
   return stmt_may_clobber_ref_p_1 (stmt, r);
 }

 /* If STMT kills the memory reference REF return true, otherwise
return false.  */

 static bool
 stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
 {
   /* For a must-alias check we need to be able to constrain
- the access properly.  */
-  ao_ref_base (ref);
-  if (ref-max_size == -1)
+ the access properly.
+ FIXME: except for BUILTIN_FREE.  */
+  if (!ao_ref_base (ref)
+  || ref-max_size == -1)
 return false;

   if (gimple_has_lhs (stmt)
TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME
   /* The assignment is not necessarily carried out if it can throw
 and we can catch it in the current function where we could inspect
 the previous value.
 ???  We only need to care about the RHS throwing.  For aggregate
 assignments or similar calls and non-call exceptions the LHS
 might throw as well.  */
@@ -2090,37 +2091,47 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref
  case BUILT_IN_MEMPCPY:
  case BUILT_IN_MEMMOVE:
  case BUILT_IN_MEMSET:
  case BUILT_IN_MEMCPY_CHK:
  case BUILT_IN_MEMPCPY_CHK:
  case BUILT_IN_MEMMOVE_CHK:
  case BUILT_IN_MEMSET_CHK:
{
  tree dest = gimple_call_arg (stmt, 0);
  tree len = gimple_call_arg (stmt, 2);
- tree base = NULL_TREE;
- HOST_WIDE_INT offset = 0;
+ tree rbase = ref-base;
+ HOST_WIDE_INT roffset = ref-offset;
  if (!host_integerp (len, 0))
return false;
- if (TREE_CODE (dest) == ADDR_EXPR)
-   base = get_addr_base_and_unit_offset (TREE_OPERAND (dest,
0),
- offset);
- else if (TREE_CODE (dest) == SSA_NAME)
-   base = dest;
- if (base
-  base == ao_ref_base (ref))
+ ao_ref dref;
+ ao_ref_init_from_ptr_and_size (dref, dest, len);


What I dislike about this is that it can end up building a new tree node.
Not sure if that should block the patch though as it clearly allows to
simplify the code ...


+ tree base = ao_ref_base (dref);
+ HOST_WIDE_INT offset = dref.offset;
+ if (!base || dref.size == -1)
+   return false;
+ if (TREE_CODE (base) == MEM_REF)
+   {
+ if (TREE_CODE (rbase) != MEM_REF)


why's that?  I think that just processing both bases separately
would work as well.


If they differ, there is no point going on, we might as well break early. 
And this way we maintain the property that either base and rbase are both 
refs, or they are both pointers, not some weird mix.



+   return false;
+  

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Ilya Enkovich
2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 2:31 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:59 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich 
 enkovich@gmail.com wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of 
 PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default 
 def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in 
 tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot 
 optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

 Some vars address can be used for better bounds deriving though.
 Well, it clearly shows it's a get me good results issue and not
 a correctness issue.  And for good results you trade in missed
 general optimizations.  That doesn't sound very good to me.

 That is definitely a correctness issue.

 Then please add a runtime testcase that fails before and passes after
 your patch.

 Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

 But it can certainly drop bounds to unknown?  You certainly cannot
 rely on no such transform taking place.  Maybe this just shows that
 your refering to the parameter is done wrong and shouldn't have
 used the SSA name default def.  What do you do for parameters
 that have their address taken btw?  You'll get

 foo (void *p)
 {
p_2 = *p;
__builtin_ia32_arg_bnd (p_2);

 which isn't desired?

 In this case you just make a load and bounds for p_2 are obtained
 using bndldx call. I would be glad to use better way to refer to the
 argument but I do not see one.

 Err, my mistake - I meant

 foo (void *p)
 {
   p_2 = p;
   __builtin_ia32_arg_bnd (p_2);


 Hmm, unfortunately on x86_64 I get

 ./cc1 -quiet t.c -fcheck-pointer-bounds
 t.c:1:0: error: -fcheck-pointers is not supported for this target
  extern int bar1 (int *p);
  ^

 trying a simple testcase.  Grepping shows me no target supports
 chkp_bound_mode ..?  What's this stuff in our tree that has no way
 of testing it ... :/

 You should pass -mmpx to enable it on the target. Current version in
 the mpx tree should build most of tests correctly but it is not
 updated to the current state.

 ./cc1 -quiet t.c -fcheck-pointer-bounds -mmpx
 t.c:1:0: error: -fcheck-pointers is not supported for this target
  extern int bar1 (int *p);
  ^

 hmm, maybe it's configure disabled somehow (this is just my dev tree).
 But I still see no chkp_bound_mode implementation.

Sorry, I missed you use your dev tree. Only first 7 patches of ~30
have been committed so far and therefore no support yet. Currently
full implementation is available in mpx branch only.

Ilya


 Note that POINTER_BOUNDS_TYPE shouldn't have been a new
 tree code either but should be a RECORD_TYPE.  See how
 

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Ilya Enkovich
2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 3:04 PM, Ilya Enkovich enkovich@gmail.com wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 2:31 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:59 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 6, 2013 at 12:00 PM, Ilya Enkovich enkovich@gmail.com 
 wrote:
 2013/11/6 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 3:24 PM, Ilya Enkovich 
 enkovich@gmail.com wrote:
 2013/11/5 Richard Biener richard.guent...@gmail.com:
 On Tue, Nov 5, 2013 at 1:52 PM, Ilya Enkovich 
 enkovich@gmail.com wrote:

 For input parameter P I need to have
   BOUNDS = __builtin_arg_bnd (P)
 to somehow refer to bounds of P in GIMPLE.  Optimizations may 
 modify
 __builtin_arg_bnd (P) replacing P with its copy or some value. It
 makes call useless because removes information about parameter 
 whose
 bounds we refer to. I want such optimization to ignore
 __builtin_arg_bnd calls and always leave default SSA_NAME of 
 PARM_DECL
 there as arg.

 How does a compilable testcase look like that shows how the default 
 def
 is used?  And what transforms break that use?  I really cannot see
 how this would happen (and you didn't add a testcase).

 Here is a test source:

 extern int bar1 (int *p);
 extern int bar2 (int);

 int foo (int *p)
 {
   if (!p)
 return bar1 (p);
   return bar2 (10);
 }

 After instrumentation GIMPLE looks like:

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 6:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (p_3(D));

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   _6 = bar1 (p_3(D), __bound_tmp.0_9);
   goto bb 5;

   bb 4:
   _8 = bar2 (10);

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Here is optimized GIMPLE (if I do not apply my changes in 
 tree-ssa-dom.c):

 foo (int * p)
 {
   unnamed type __bound_tmp.0;
   int _1;
   int _6;
   int _8;

   bb 2:
   if (p_3(D) == 0B)
 goto bb 3;
   else
 goto bb 4;

   bb 3:
   __bound_tmp.0_9 = __builtin_ia32_arg_bnd (0B); [return slot 
 optimization]
   _6 = bar1 (0B, __bound_tmp.0_9); [tail call]
   goto bb 5;

   bb 4:
   _8 = bar2 (10); [tail call]

   bb 5:
   # _1 = PHI _6(3), _8(4)
   return _1;

 }

 Now during expand or inline of foo I cannot determine the value for
 __bound_tmp.0_9.

 Well, but clearly you can see that bounds for p == NULL are
 useless and thus there is no problem with the transform.

 This example just demonstrates the issue. I may use some var's address
 in comparison with the similar result. And value does not always allow
 to determine bounds, because pointers with the same value may have
 different bounds.

 Some vars address can be used for better bounds deriving though.
 Well, it clearly shows it's a get me good results issue and not
 a correctness issue.  And for good results you trade in missed
 general optimizations.  That doesn't sound very good to me.

 That is definitely a correctness issue.

 Then please add a runtime testcase that fails before and passes after
 your patch.

 Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

 But it can certainly drop bounds to unknown?  You certainly cannot
 rely on no such transform taking place.  Maybe this just shows that
 your refering to the parameter is done wrong and shouldn't have
 used the SSA name default def.  What do you do for parameters
 that have their address taken btw?  You'll get

 foo (void *p)
 {
p_2 = *p;
__builtin_ia32_arg_bnd (p_2);

 which isn't desired?

 In this case you just make a load and bounds for p_2 are obtained
 using bndldx call. I would be glad to use better way to refer to the
 argument but I do not see one.

 Err, my mistake - I meant

 foo (void *p)
 {
   p_2 = p;
   __builtin_ia32_arg_bnd (p_2);

 Well, it does not make much difference. Such params are allocated on
 the stack by expand. So, here p_2 = p is load from memory and bounds
 should be loaded by bndldx. Expand is responsible to store bounds
 using bndstx when it allocates p on the stack.

 The parameter is incoming in a register still.

It does not matter. In GIMPLE it is still load. During expand
assign_parms allocates slot on the stack for it and stores reg to it.
DECL_RTL for PARM_DECL is set to stack slot.

I just added code to store corresponding bounds for stored param (I
did not send this patch yet).

Ilya



 Hmm, unfortunately on x86_64 I get

 ./cc1 -quiet t.c -fcheck-pointer-bounds
 t.c:1:0: error: -fcheck-pointers is not supported for this target
  extern int bar1 (int *p);
  ^

 trying a simple testcase.  Grepping shows me no target supports
 chkp_bound_mode ..?  What's this stuff in our tree that has no way
 of testing it ... :/

 You should pass -mmpx 

Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Ian Lance Taylor
On Tue, Nov 5, 2013 at 10:50 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Tue, 5 Nov 2013, Ian Lance Taylor wrote:

 This patch actually breaks the Go testsuite.  In Go dereferencing a
 nil pointer is well-defined: it causes panic that can be caught.  This
 breaks a test for that functionality by changing the panic to a
 builtin_trap.

 That's not a big deal; I'll just disable this optimization in the Go
 frontend.


 Shouldn't go use -fno-delete-null-pointer-checks by default then? That
 should disable this optimization and others that rely on the same idea.

No, that is a different optimization with different properties.  The
-fdelete-null-pointer-checks optimization assumes that if you write
x = *p;
if (p == NULL) { printf (NULL\n); }
that the test p == NULL can not succeed.  In Go, that is true.  If p
is NULL the *p will cause a panic and ordinary code execution will not
proceed.

The recent -fisolate-erroneous-paths optimization will change code
like this:
if (p == NULL) { printf (NULL\n); }
x = *p;
into code like this:
if (p == NULL) { printf (NULL\n); __builtin_trap (); }
x = *p;
That is, it will insert a trap rather than dereferencing the pointer
known to be NULL.  This doesn't work for Go because we really do want
the panic, not the __builtin_trap.  This optimization would work fine
for Go if there were a way to explicitly call the panic function
rather than calling trap, but that would be a frontend-dependent
aspect to a middle-end optimization.

Ian


Re: [RFA][PATCH] Minor fix to aliasing machinery

2013-11-06 Thread Richard Biener
On Wed, Nov 6, 2013 at 3:08 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Wed, 6 Nov 2013, Richard Biener wrote:

 On Wed, Nov 6, 2013 at 1:19 PM, Marc Glisse marc.gli...@inria.fr wrote:

 [Discussion started in
 http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02472.html ]


 On Wed, 30 Oct 2013, Marc Glisse wrote:

 On Wed, 30 Oct 2013, Richard Biener wrote:

 Btw, get_addr_base_and_unit_offset may also return an offsetted
 MEM_REF (from MEM [p_3, 17] for example).  As we are interested in
 pointers this could be handled by not requiring a memory reference
 but extracting the base address and offset, covering more cases.



 I tried the attached patch, and it almost worked, except for one fortran
 testcase (widechar_intrinsics_10.f90):



 Now that ao_ref_init_from_ptr_and_size has been fixed, the following
 patch
 passes bootstrap+testsuite (default languages) on
 x86_64-unknown-linux-gnu.

 2013-11-06  Marc Glisse  marc.gli...@inria.fr
 Jeff Law  l...@redhat.com

 gcc/
 * tree-ssa-alias.c (stmt_kills_ref_p_1): Use
 ao_ref_init_from_ptr_and_size for builtins.

 gcc/testsuite/
 * gcc.dg/tree-ssa/alias-27.c: New testcase.

 --
 Marc Glisse
 Index: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/alias-27.c(working copy)
 @@ -0,0 +1,11 @@
 +/* { dg-do compile } */
 +/* { dg-options -O1 -fdump-tree-optimized } */
 +
 +void f (long *p) {
 +  *p = 42;
 +  p[4] = 42;
 +  __builtin_memset (p, 0, 100);
 +}
 +
 +/* { dg-final { scan-tree-dump-not = 42 optimized } } */
 +/* { dg-final { cleanup-tree-dump optimized } } */

 Property changes on: gcc/testsuite/gcc.dg/tree-ssa/alias-27.c
 ___
 Added: svn:keywords
 ## -0,0 +1 ##
 +Author Date Id Revision URL
 \ No newline at end of property
 Added: svn:eol-style
 ## -0,0 +1 ##
 +native
 \ No newline at end of property
 Index: gcc/tree-ssa-alias.c
 ===
 --- gcc/tree-ssa-alias.c(revision 204453)
 +++ gcc/tree-ssa-alias.c(working copy)
 @@ -2001,23 +2001,24 @@ stmt_may_clobber_ref_p (gimple stmt, tre
return stmt_may_clobber_ref_p_1 (stmt, r);
  }

  /* If STMT kills the memory reference REF return true, otherwise
 return false.  */

  static bool
  stmt_kills_ref_p_1 (gimple stmt, ao_ref *ref)
  {
/* For a must-alias check we need to be able to constrain
 - the access properly.  */
 -  ao_ref_base (ref);
 -  if (ref-max_size == -1)
 + the access properly.
 + FIXME: except for BUILTIN_FREE.  */
 +  if (!ao_ref_base (ref)
 +  || ref-max_size == -1)
  return false;

if (gimple_has_lhs (stmt)
 TREE_CODE (gimple_get_lhs (stmt)) != SSA_NAME
/* The assignment is not necessarily carried out if it can throw
  and we can catch it in the current function where we could
 inspect
  the previous value.
  ???  We only need to care about the RHS throwing.  For aggregate
  assignments or similar calls and non-call exceptions the LHS
  might throw as well.  */
 @@ -2090,37 +2091,47 @@ stmt_kills_ref_p_1 (gimple stmt, ao_ref
   case BUILT_IN_MEMPCPY:
   case BUILT_IN_MEMMOVE:
   case BUILT_IN_MEMSET:
   case BUILT_IN_MEMCPY_CHK:
   case BUILT_IN_MEMPCPY_CHK:
   case BUILT_IN_MEMMOVE_CHK:
   case BUILT_IN_MEMSET_CHK:
 {
   tree dest = gimple_call_arg (stmt, 0);
   tree len = gimple_call_arg (stmt, 2);
 - tree base = NULL_TREE;
 - HOST_WIDE_INT offset = 0;
 + tree rbase = ref-base;
 + HOST_WIDE_INT roffset = ref-offset;
   if (!host_integerp (len, 0))
 return false;
 - if (TREE_CODE (dest) == ADDR_EXPR)
 -   base = get_addr_base_and_unit_offset (TREE_OPERAND (dest,
 0),
 - offset);
 - else if (TREE_CODE (dest) == SSA_NAME)
 -   base = dest;
 - if (base
 -  base == ao_ref_base (ref))
 + ao_ref dref;
 + ao_ref_init_from_ptr_and_size (dref, dest, len);


 What I dislike about this is that it can end up building a new tree node.
 Not sure if that should block the patch though as it clearly allows to
 simplify the code ...

 + tree base = ao_ref_base (dref);
 + HOST_WIDE_INT offset = dref.offset;
 + if (!base || dref.size == -1)
 +   return false;
 + if (TREE_CODE (base) == MEM_REF)
 +   {
 + if (TREE_CODE (rbase) != MEM_REF)


 why's that?  I think that just processing both bases separately
 would work as well.


 If they differ, there is no point going on, we might as 

RE: [PATCH, PR 57748] Check for out of bounds access, Part 3

2013-11-06 Thread Bernd Edlinger
 Hi,

 On Tue, 17 Sep 2013 01:09:45, Martin Jambor wrote:
 @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
if (MEM_P (to_rtx)
 GET_MODE (to_rtx) == BLKmode
 GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
 +   bitregion_start == 0
 +   bitregion_end == 0
 bitsize 0
 (bitpos % bitsize) == 0
 (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0

 ...

 I'm not sure to what extent the hunk adding tests for bitregion_start
 and bitregion_end being zero is connected to this issue. I do not see
 any of the testcases exercising that path. If it is indeed another
 problem, I think it should be submitted (and potentially committed) as
 a separate patch, preferably with a testcase.


 Meanwhile I am able to give an example where that code is executed
 with bitpos = 64, bitsize=32, bitregion_start = 32, bitregion_end = 95.

 Afterwards bitpos=0, bitsize=32, which is completely outside
 bitregion_start=32, bitregion_end=95.

 However this can only be seen in the debugger, as the store_field
 goes thru a code path that does not look at bitregion_start/end.

 Well that is at least extremely ugly, and I would not be sure, that
 I cannot come up with a sample that crashes or creates wrong code.

 Currently I think that maybe the best way to fix that would be this:

 --- gcc/expr.c2013-10-21 08:27:09.546035668 +0200
 +++ gcc/expr.c2013-10-22 15:19:56.749476525 +0200
 @@ -4762,6 +4762,9 @@ expand_assignment (tree to, tree from, b
 MEM_ALIGN (to_rtx) == GET_MODE_ALIGNMENT (mode1))
  {
to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
 +  bitregion_start = 0;
 +  if (bitregion_end= (unsigned HOST_WIDE_INT) bitpos)
 +bitregion_end -= bitpos;
bitpos = 0;
  }

 Any suggestions?



 Regards
 Bernd.


well, as already discussed, the following example executes the above memory 
block,
and leaves bitregion_start/end in an undefined state:

extern void abort (void);

struct x{
  int a;
  int :32;
  volatile int b:32;
};

struct s
{
  int a,b,c,d;
  struct x xx[1];
};

struct s ss;
volatile int k;

int main()
{
  ss.xx[k].b = 1;
//  asm volatile(:::memory);
  if ( ss.xx[k].b != 1)
    abort ();
  return 0;
}

Although this does not cause malfunction at the time, I'd propose to play safe,
and update the bitregion_start/bitregion_end. Additionally I'd propose to remove
this comment in expand_assignment and expand_expr_real_1:

 A constant address in TO_RTX can have VOIDmode, we must not try
  to call force_reg for that case.  Avoid that case.

This comment is completely out of sync: There is no longer any force_reg in 
that if-block,
and a constant address in TO_RTX has SImode or DImode in GET_MODE (XEXP 
(to_rtx, 0))
I do not know how to make it a VOIDmode, therefore the comment does not help 
either.


Boot-strapped and regression-tested on x86_64-linux-gnu.

OK for trunk?


Regards
Bernd.2013-11-06  Bernd Edlinger  bernd.edlin...@hotmail.de

PR middle-end/57748
* expr.c (expand_assignment): Remove bogus comment.
Update bitregion_start/bitregion_end.
(expand_expr_real_1): Remove bogus comment.



patch-pr57748-3.diff
Description: Binary data


Go patch committed: Turn off -fisolate-erroneous-paths for Go

2013-11-06 Thread Ian Lance Taylor
As discussed at the thread at
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00505.html , the new
-fisolate-erroneous-paths option does not work for Go.  The option
can change nil pointer deferences into a trap, but for Go we want those
to consistently be a panic.  This patch disables the option for Go.

If the option changes in a way that lets Go work, such as by disabling
the option when using -fnon-call-exceptions, then this patch can be
reverted.

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian


2013-11-06  Ian Lance Taylor  i...@google.com

* go-lang.c (go_langhook_post_options): If
-fisolate-erroneous-paths was turned on by an optimization option,
turn it off.


Index: go-lang.c
===
--- go-lang.c	(revision 204430)
+++ go-lang.c	(working copy)
@@ -268,6 +268,12 @@ go_langhook_post_options (const char **p
   if (flag_excess_precision_cmdline == EXCESS_PRECISION_DEFAULT)
 flag_excess_precision_cmdline = EXCESS_PRECISION_STANDARD;
 
+  /* The isolate_erroneous_paths optimization can change a nil
+ dereference from a panic to a trap, so we have to disable it for
+ Go, even though it is normally enabled by -O2.  */
+  if (!global_options_set.x_flag_isolate_erroneous_paths)
+global_options.x_flag_isolate_erroneous_paths = 0;
+
   /* Returning false means that the backend should be used.  */
   return false;
 }


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Ian Lance Taylor
On Wed, Nov 6, 2013 at 7:15 AM, Richard Biener
richard.guent...@gmail.com wrote:

 But I think that you cannot transform

 foo ()
 {
   *0 = 1;
 }

 to __builtin_trap as you can catch the trap via an exception handler
 in a caller of foo, no?

That is true.  OK, I can see an argument that when using
-fnon-call-exceptions that kind of code should not be changed to call
__builtin_trap.

In that case I think it would be fine to run the isolate paths
optimization, but to not omit the actual dereference of the NULL
pointer (possibly the dereference could be followed by a trap).

Ian


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 04:23:06PM +0100, Richard Biener wrote:
  In that case I think it would be fine to run the isolate paths
  optimization, but to not omit the actual dereference of the NULL
  pointer (possibly the dereference could be followed by a trap).
 
 Yeah, we need the trap to properly end the BB (even if that is a
 waste instruction generated).

BTW, why do we generate in this optimization __builtin_trap rather than
just __builtin_unreachable ()?  The former still generates some code (abort,
some aborting instruction, ...), while the former is just an assertion that
valid code will not reach it.

Jakub


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Jeff Law

On 11/06/13 08:08, Ian Lance Taylor wrote:


The recent -fisolate-erroneous-paths optimization will change code
like this:
 if (p == NULL) { printf (NULL\n); }
 x = *p;
into code like this:
 if (p == NULL) { printf (NULL\n); __builtin_trap (); }
 x = *p;
That is, it will insert a trap rather than dereferencing the pointer
known to be NULL.  This doesn't work for Go because we really do want
the panic, not the __builtin_trap.  This optimization would work fine
for Go if there were a way to explicitly call the panic function
rather than calling trap, but that would be a frontend-dependent
aspect to a middle-end optimization.


But then you have -fnon-call-exceptions enabled?


Yes (go_langhook_init_options_struct in go/go-lang.c).
I wonder if we could have the front-ends define a builtin to use in this 
kind of situation and use the builtin trap as a fallback if the 
front-end didn't provide a suitable builtin.






Where obviously
-fisolate-erroneous-paths also shouldn't apply?


I guess that's not entirely obvious to me.  I'm not sure that
-fnon-call-exceptions means that all trapping instructions must be
executed.

In effect, yes.

Jeff


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Jeff Law

On 11/06/13 08:20, Ian Lance Taylor wrote:

On Wed, Nov 6, 2013 at 7:15 AM, Richard Biener
richard.guent...@gmail.com wrote:


But I think that you cannot transform

foo ()
{
   *0 = 1;
}

to __builtin_trap as you can catch the trap via an exception handler
in a caller of foo, no?


That is true.  OK, I can see an argument that when using
-fnon-call-exceptions that kind of code should not be changed to call
__builtin_trap.

In that case I think it would be fine to run the isolate paths
optimization, but to not omit the actual dereference of the NULL
pointer (possibly the dereference could be followed by a trap).
That would be trivial to implement.  I went back and forth on whether to 
to leave the *0 or issue the trap.  A program could catch the *0 and do 
something in its handler.  Changing the *0 to a trap would change hte 
program's behaviour if it took different action in the handler on the trap.


The trap works better in other cases that don't involve a dereference 
though.  For example, if a function has an argument that is marked as 
must be non-null and we find a path which passes null for that argument 
trapping seems best.


jeff


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Jeff Law

On 11/06/13 08:23, Richard Biener wrote:

On Wed, Nov 6, 2013 at 4:20 PM, Ian Lance Taylor i...@google.com wrote:

On Wed, Nov 6, 2013 at 7:15 AM, Richard Biener
richard.guent...@gmail.com wrote:


But I think that you cannot transform

foo ()
{
   *0 = 1;
}

to __builtin_trap as you can catch the trap via an exception handler
in a caller of foo, no?


That is true.  OK, I can see an argument that when using
-fnon-call-exceptions that kind of code should not be changed to call
__builtin_trap.

In that case I think it would be fine to run the isolate paths
optimization, but to not omit the actual dereference of the NULL
pointer (possibly the dereference could be followed by a trap).


Yeah, we need the trap to properly end the BB (even if that is a
waste instruction generated).
Right.  We *really* don't want execution to continue.  We've entered an 
undefined state and continuing execution can lead to all kinds of nasty 
problems, including security exploits.


jeff


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 09:15:57AM -0700, Jeff Law wrote:
 On 11/06/13 08:27, Jakub Jelinek wrote:
 On Wed, Nov 06, 2013 at 04:23:06PM +0100, Richard Biener wrote:
 In that case I think it would be fine to run the isolate paths
 optimization, but to not omit the actual dereference of the NULL
 pointer (possibly the dereference could be followed by a trap).
 
 Yeah, we need the trap to properly end the BB (even if that is a
 waste instruction generated).
 
 BTW, why do we generate in this optimization __builtin_trap rather than
 just __builtin_unreachable ()?  The former still generates some code (abort,
 some aborting instruction, ...), while the former is just an assertion that
 valid code will not reach it.
 Because if you do reach the site, you really want to halt the
 program to avoid potential security exploits.  I'm actually of the
 opinion that builtin_unreachable should be trapping as well.

__builtin_unreachable () is trapping if -fsanitize=unreachable, but
otherwise it is just an optimization hint, intentionally so, that allows
optimizing on the fact that it doesn't happen.  If from fear of security
exploits we'd stop trying to optimize code well, we'd need to punt on using
undefined integer overflow and many other things.

Jakub


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread David Malcolm
On Wed, 2013-11-06 at 14:11 +0100, Jakub Jelinek wrote:
 On Wed, Nov 06, 2013 at 02:00:56PM +0100, Richard Biener wrote:
  Well, what else, besides as_a or keeping the current
  global accessor functions would you propose?
 
 I'd prefer to keep the current accessors.  Whether there is a class
 hierarchy or we keep unions doesn't affect the code too much (unless
 it makes debugging experience harder, as numerous past C++ification changes
 have done :( ).

What specific things have become more difficult to debug due to the
changes you refer to?
I can have a look at improving them by extending gdbhooks.py, if that is
desirable.

Thanks
Dave



[PATCH, PR 10474] Take two on splitting live-ranges of function arguments to help shrink-wrapping

2013-11-06 Thread Martin Jambor
Hi,

last Thursday I had to revert a previous version of this patch because
it has caused a lot of trouble on various platforms I did not test it
on.  The patch is still very similar to its previous iteration
(http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02183.html) the only
major difference is that I moved more after-the-fact re-analyzing from
find_moveable_pseudos to a point when my transformation also finished.

The problem with it was that REG_N_REFS of the pseudos introduced by
my code was not set and thus reload ignored and let it slip through
instead of generating spills.  This is fixed by moving the
re-computation of regstat_n_sets_and_refs in
regstat_init_n_sets_and_refs to after my transformation.  In order to
avoid similar surprises I have moved all other re-computations from
find_moveable_pseudos to the caller in ira, namely:

  fix_reg_equiv_init ();
  expand_reg_info ();
  regstat_free_n_sets_and_refs ();
  regstat_free_ri ();
  regstat_init_n_sets_and_refs ();
  regstat_compute_ri ();

I have also bootstrapped and tested the patch not only on x86_64-linux
but also on i686-linux and ppc64-linux (without Ada though).  I have
made sure that the reported problem does not occur on cris-elf and
sh-none-elf cross compilers.  (I could not reproduce it on arm, and do
not have access to sparc but it was also reported there.)

Another minor change which I erroneously omitted the last time is that
the testcases are run on x86_64 only because that is the only platform
where I know the transformation currently takes place.  The reason why
I did not move them to a target-specific directory is that I believe
the transformation can be beneficial on other platforms as well.  For
example, PR 10474 was actually filed against PPC but this patch does
not work there because the initial move of the parameter is done in a
parallel insn:

(parallel [
(set (reg:CC 124)
(compare:CC (reg:DI 3 3 [ i ])
(const_int 0 [0])))
(set (reg/v/f:DI 123 [ i ])
(reg:DI 3 3 [ i ]))
])

which fails my single_set test.  However, the mechanism can be
extended to handle these situations as well and afterwards we could
run the test also on ppc64.

So, Vlad, Steven, do you think that this time I have re-computed all
that is necessary?  Do you think the patch is OK?

Thanks a lot and sorry for the breakage,

Martin


2013-11-04  Martin Jambor  mjam...@suse.cz

PR rtl-optimization/10474
* ira.c (interesting_dest_for_shprep): New function.
(split_live_ranges_for_shrink_wrap): Likewise.
(find_moveable_pseudos): Move calculation of dominance info,
df_analysios and the final anlyses to...
(ira): ...here, call split_live_ranges_for_shrink_wrap.

testsuite/
* gcc.dg/pr10474.c: New testcase.
* gcc.dg/ira-shrinkwrap-prep-1.c: Likewise.
* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.

diff --git a/gcc/ira.c b/gcc/ira.c
index 9e94704..70fe13b 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -4514,9 +4514,6 @@ find_moveable_pseudos (void)
   pseudo_replaced_reg.release ();
   pseudo_replaced_reg.safe_grow_cleared (max_regs);
 
-  df_analyze ();
-  calculate_dominance_info (CDI_DOMINATORS);
-
   i = 0;
   bitmap_initialize (live, 0);
   bitmap_initialize (used, 0);
@@ -4829,14 +4826,196 @@ find_moveable_pseudos (void)
   free (bb_moveable_reg_sets);
 
   last_moveable_pseudo = max_reg_num ();
+}
 
-  fix_reg_equiv_init ();
-  expand_reg_info ();
-  regstat_free_n_sets_and_refs ();
-  regstat_free_ri ();
-  regstat_init_n_sets_and_refs ();
-  regstat_compute_ri ();
-  free_dominance_info (CDI_DOMINATORS);
+
+/* If insn is interesting for parameter range-splitting shring-wrapping
+   preparation, i.e. it is a single set from a hard register to a pseudo, which
+   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
+
+static rtx
+interesting_dest_for_shprep (rtx insn, basic_block call_dom)
+{
+  rtx set = single_set (insn);
+  if (!set)
+return NULL;
+  rtx src = SET_SRC (set);
+  rtx dest = SET_DEST (set);
+  if (!REG_P (src) || !HARD_REGISTER_P (src)
+  || !REG_P (dest) || HARD_REGISTER_P (dest)
+  || (call_dom  !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest
+return NULL;
+  return dest;
+}
+
+/* Split live ranges of pseudos that are loaded from hard registers in the
+   first BB in a BB that dominates all non-sibling call if such a BB can be
+   found and is not in a loop.  Return true if the function has made any
+   changes.  */
+
+static bool
+split_live_ranges_for_shrink_wrap (void)
+{
+  basic_block bb, call_dom = NULL;
+  basic_block first = single_succ (ENTRY_BLOCK_PTR);
+  rtx insn, last_interesting_insn = NULL;
+  bitmap_head need_new, reachable;
+  vecbasic_block queue;
+
+  if (!flag_shrink_wrap)
+return false;
+
+  bitmap_initialize (need_new, 0);
+  bitmap_initialize (reachable, 0);
+  queue.create 

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Bill Schmidt
On Wed, 2013-11-06 at 15:01 +0100, Richard Biener wrote:
 On Wed, Nov 6, 2013 at 2:24 PM, Tejas Belagod tbela...@arm.com wrote:
 
  Hi,
 
  The attached patch eliminates moves of the form
 
  set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]
 
  i.e. eliminates lower lane moves between src and dst where src and dst are
  the same register and this causes rtl to instead use the destination
  register in the required mode.
 
  Also, if my understanding of Big-Endian is correct, this should be safe for
  big-endian targets as well.
 
  I've bootstrapped this on x64_64, regressed on aarch64-none-elf,
  aarch64_be-none-elf.
 
  OK for trunk?
 
 It looks good to me (but I'm also wondering about bigendian).  Bill?

Yes, that should be ok.  I can run a regression test to confirm.

Thanks,
Bill

 
 Can you add a testcase where this has an effect on code generation?
 
 Thanks,
 Richard.
 
  Thanks,
  Tejas Belagod
  ARM.
 
  2013-11-06  Tejas Belagod  tejas.bela...@arm.com
 
  gcc/
  * rtlanal.c (set_noop_p): Return nonzero in case of redundant
  vec_select
  for same src and dst.
  diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
  index 9769b69..3e434cd 100644
  --- a/gcc/rtlanal.c
  +++ b/gcc/rtlanal.c
  @@ -1180,6 +1180,25 @@ set_noop_p (const_rtx set)
 dst = SUBREG_REG (dst);
   }
 
  +  /* This is big-endian-safe because the elements are kept in target
  + memory order.  So, for eg. PARALLEL element value of 2 is the same in
  + either endian-ness.  */
  +  if (GET_CODE (src) == VEC_SELECT
  +   REG_P (XEXP (src, 0))  REG_P (dst)
  +   REGNO (XEXP (src, 0)) == REGNO (dst))
  +{
  +  rtx par = XEXP (src, 1);
  +  int i;
  +
  +  for (i = 0; i  XVECLEN (par, 0); i++)
  +   {
  + rtx tem = XVECEXP (par, 0, i);
  + if (!CONST_INT_P (tem) || INTVAL (tem) != i)
  +   return 0;
  +   }
  +  return 1;
  +}
  +
 return (REG_P (src)  REG_P (dst)
 REGNO (src) == REGNO (dst));
   }
 



Re: [PATCH] Add gimple subclasses for every gimple code (was Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3))

2013-11-06 Thread Michael Matz
Hi,

On Tue, 5 Nov 2013, David Malcolm wrote:

 Here's a followup patch which ensures that every gimple code has its own
 subclass, by adding empty subclasses derived from the GSS_-based
 subclasses as appropriate (I don't bother for gimple codes that already
 have their own subclass due to having their own GSS layout).  I also
 copied the comments from gimple.def into gimple.h, so that Doxygen picks
 up on the descriptions and uses them to describe each subclass.

I don't like that.  The empty classes are just useless, they imply a 
structure that isn't really there, some of the separate gimple codes are 
basically selectors of specific subtypes of a generic concept, without 
additional data or methods; creating a type for those is confusing.

Generally I don't like complicating the type system without good reasons 
(as in actually also making use of the complicated types).  The fewer 
types the better IMO.


Ciao,
Michael.


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 11:25:31AM -0500, David Malcolm wrote:
 On Wed, 2013-11-06 at 14:11 +0100, Jakub Jelinek wrote:
  On Wed, Nov 06, 2013 at 02:00:56PM +0100, Richard Biener wrote:
   Well, what else, besides as_a or keeping the current
   global accessor functions would you propose?
  
  I'd prefer to keep the current accessors.  Whether there is a class
  hierarchy or we keep unions doesn't affect the code too much (unless
  it makes debugging experience harder, as numerous past C++ification changes
  have done :( ).
 
 What specific things have become more difficult to debug due to the
 changes you refer to?
 I can have a look at improving them by extending gdbhooks.py, if that is
 desirable.

I'm refering to various things that have been mentioned already multiple
times, mostly from the switch to C++, like:
1) p debug_tree (0x718b3690)
   not working anymore, typically one would use this when just cut'n'pasting
   addresses seen in backtrace, other debug_tree etc., one now has to add
   explicitly a cast p debug_tree ((tree)0x718b3690).
   Perhaps this can be cured just by adding extra overloads for a few most
   popular debug_* routines that weren't initially overloaded, add them an
   overload with uintptr_t type or similar that will just cast it to the
   right type and call the original routine
2) tab completion often not working well for hundreds of functions, say type
   p gimple_can_mtab
   When gcc was written in C, this would complete it just fine into
   p gimple_can_merge_blocks_p
   but GDB is confused by the forward declarations and offers you:
   p gimple_can_merge_blocks_p(basic_block
   and doesn't autocomplete further from there, so, for functions that
   aren't really overloaded (fortunately so far most of them) one has to
   manually remove all the cruft until and including the (, or has to
   complete from memory or similar.  The issue is that gdb sees:
   gimple_can_merge_blocks_p(basic_block, basic_block)
   gimple_can_merge_blocks_p(basic_block_def*, basic_block_def*)
   gimple_can_merge_blocks_p(basic_block_def*, basic_block_def*)::__FUNCTION__
   and considers the two different thing, but of course:
   typedef struct basic_block_def *basic_block;
   and thus you really don't care which one is chosen from these, but
   preferrably just one, not both.
3) step got significantly worse over time, we have skip_file tree.h, but
   with all the recent header reorganizations gdbinit.in hasn't been
   updated on which other headers would greatly benefit from being skipped
   by default.  Certainly gimple.h IMHO, and various new headers that just
   contain small inline accessors
4) the gdbhooks.py printing is nice, but (talking just from memory), it is
   tied to just one spelling of various types, so say if something has
   basic_block type, it is printed nicely, but if it has basic_block_def *
   type, it isn't
5) the new improved C++ vec.h is harder to use in the debugger
This list is still very much incomplete.

Jakub


Disable floating-point contraction in ISO C conformance modes

2013-11-06 Thread Joseph S. Myers
In http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01131.html, I noted
that, as I'd previously noted in
http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01951.html, ISO C
conformance modes should set -ffp-contract=on (= -ffp-contract=off)
rather than the default -ffp-contract=fast, given the ISO C
restriction to contraction taking place within source language
expressions.  No-one indicated any reason not to do this, so I've now
applied this patch to do so (and to make an explicit
-ffp-contract=fast in ISO C mode imply __GCC_IEC_559=0).  Bootstrapped
with no regressions on x86_64-unknown-linux-gnu.

c-family:
2013-11-06  Joseph Myers  jos...@codesourcery.com

* c-opts.c (c_common_post_options): Set -ffp-contract=off in C
standards modes.
* c-cppbuiltin.c (cpp_iec_559_value): Consider -ffp-contract=fast
to mean lack of IEEE 754 support.

testsuite:
2013-11-06  Joseph Myers  jos...@codesourcery.com

* gcc.dg/torture/c99-contract-1.c: New test.

Index: gcc/c-family/c-cppbuiltin.c
===
--- gcc/c-family/c-cppbuiltin.c (revision 204453)
+++ gcc/c-family/c-cppbuiltin.c (working copy)
@@ -726,16 +726,19 @@ cpp_iec_559_value (void)
 ret = 0;
 
   /* In strict C standards conformance mode, consider unpredictable
- excess precision to mean lack of IEEE 754 support.  ??? The same
- should apply to unpredictable contraction, but at present
- standards conformance options do not enable conforming
- contraction.  For C++, and outside strict conformance mode, do
- not consider these options to mean lack of IEEE 754 support.  */
+ excess precision to mean lack of IEEE 754 support.  The same
+ applies to unpredictable contraction.  For C++, and outside
+ strict conformance mode, do not consider these options to mean
+ lack of IEEE 754 support.  */
   if (flag_iso
!c_dialect_cxx ()
TARGET_FLT_EVAL_METHOD != 0
flag_excess_precision_cmdline != EXCESS_PRECISION_STANDARD)
 ret = 0;
+  if (flag_iso
+   !c_dialect_cxx ()
+   flag_fp_contract_mode == FP_CONTRACT_FAST)
+ret = 0;
 
   /* Various options are contrary to IEEE 754 semantics.  */
   if (flag_unsafe_math_optimizations
Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c   (revision 204453)
+++ gcc/c-family/c-opts.c   (working copy)
@@ -827,6 +827,15 @@ c_common_post_options (const char **pfilename)
 ? EXCESS_PRECISION_STANDARD
 : EXCESS_PRECISION_FAST);
 
+  /* ISO C restricts floating-point expression contraction to within
+ source-language expressions (-ffp-contract=on, currently an alias
+ for -ffp-contract=off).  */
+  if (flag_iso
+   !c_dialect_cxx ()
+   (global_options_set.x_flag_fp_contract_mode
+ == (enum fp_contract_mode) 0))
+flag_fp_contract_mode = FP_CONTRACT_OFF;
+
   /* By default we use C99 inline semantics in GNU99 or C99 mode.  C99
  inline semantics are not supported in GNU89 or C89 mode.  */
   if (flag_gnu89_inline == -1)
Index: gcc/testsuite/gcc.dg/torture/c99-contract-1.c
===
--- gcc/testsuite/gcc.dg/torture/c99-contract-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/torture/c99-contract-1.c   (revision 0)
@@ -0,0 +1,21 @@
+/* Test floating-point contraction occurs only within source language
+   expressions.  */
+/* { dg-do run } */
+/* { dg-options -std=c99 -pedantic-errors } */
+
+extern void abort (void);
+extern void exit (int);
+
+volatile float a = 1 + 0x1p-23f, b = 1 - 0x1p-23f, c = -1;
+
+int
+main (void)
+{
+  float av = a, bv = b, cv = c;
+  float p = av * bv;
+  float r = p + cv;
+  if (r == 0)
+exit (0);
+  else
+abort ();
+}

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFA][PATCH] Isolate erroneous paths optimization

2013-11-06 Thread Jeff Law

On 11/06/13 08:27, Jakub Jelinek wrote:

On Wed, Nov 06, 2013 at 04:23:06PM +0100, Richard Biener wrote:

In that case I think it would be fine to run the isolate paths
optimization, but to not omit the actual dereference of the NULL
pointer (possibly the dereference could be followed by a trap).


Yeah, we need the trap to properly end the BB (even if that is a
waste instruction generated).


BTW, why do we generate in this optimization __builtin_trap rather than
just __builtin_unreachable ()?  The former still generates some code (abort,
some aborting instruction, ...), while the former is just an assertion that
valid code will not reach it.
Because if you do reach the site, you really want to halt the program to 
avoid potential security exploits.  I'm actually of the opinion that 
builtin_unreachable should be trapping as well.



jeff


Re: patch implementing a new pass for register-pressure relief through live range shrinkage

2013-11-06 Thread Vladimir Makarov

On 11/6/2013, 4:17 AM, Richard Biener wrote:

On Tue, Nov 5, 2013 at 4:35 PM, Vladimir Makarov vmaka...@redhat.com wrote:

   I'd like to add a new experimental optimization to the trunk.  This
optimization was discussed on RA BOF of this summer GNU Cauldron.

   It is a register pressure relief through live-range shrinkage.  It
is implemented on the scheduler base and uses register-pressure insn
scheduling infrastructure.  By rearranging insns we shorten pseudo
live-ranges and increase a chance to them be assigned to a hard
register.

   The code looks pretty simple but there are a lot of works behind
this patch.  I've tried about ten different versions of this code
(different heuristics for two currently existing register-pressure
algorithms).

   I think it is *upto target maintainers* to decide to use or not to
use this optimization for their targets.  I'd recommend to use this at
least for x86/x86-64.  I think any OOO processor with small or
moderate register file which does not use the 1st insn scheduling
might benefit from this too.

   On SPEC2000 for x86/x86-64 (I use Haswell processor, -O3 with
general tuning), the optimization usage results in smaller code size
in average (for floating point and integer benchmarks in 32- and
64-bit mode).  The improvement better visible for SPECFP2000 (although
I have the same improvement on x86-64 SPECInt2000 but it might be
attributed mostly mcf benchmark unstability).  It is about 0.5% for
32-bit and 64-bit mode.  It is understandable, as the optimization has
more opportunities to improve the code on longer BBs.  Different from
other heuristic optimizations, I don't see any significant worse
performance.  It gives practically the same or better performance (a
few benchmarks imporoved by 1% or more upto 3%).

   The single but significant drawback is additional compilation time
(4%-6%) as the 1st insn scheduling pass is quite expensive.  So I'd
recommend target maintainers to switch it on only for -Ofast.

Generally I'd not recomment viewing -Ofast as -O4 but as -O3
plus generally unsafe optimizations.  So I'd not enable it for -Ofast
but for -O3 - possibly also with -Os if indeed the main motivation is
also code-size improvements (-Os is a similar beast as -O3, spend
as much time as you can on optimizing size).
Ok.  Probably my recommendation is wrong.  It is actually upto target 
maintainers to decide when to use the optimization and or use it at all 
for default (may be they just decide to use it only for SPEC reporting).


I guess that in some time we will need to use something like -O4 for 
greedy algorithms (there are a lot of researches in this area, e.g. I am 
reading an article about optimal register-pressure sensitive insn 
scheduling but the optimization can be constrained for time, for example 
1ms for each insn, and still to produce better results than the current 
heuristics).  I am sure such algorithms will be coming.



Btw, thanks for working on this.  How does it relate to
-fsched-pressure?
It is based on -fsched-pressure infrastructure but has different 
heuristics and goals.  GCC with 1st insn scheduling even with 
-fsched-pressure still produces worse results on mainstream x86/x86-64 
processors that  GCC without it.  I've also tried -flive-range-shrinkage 
-fschedule-insns -fsched-pressure, but just -flive-range-shrinkage is 
better for x86/x86-64.


By the way, LLVM uses insn-scheduling for x86/x86-64 before RA, but it 
goal is only register-pressure decrease (for x86, for x86-64 it is a bit 
more complicated).  So with this optimization we are just catching up 
with LLVM (which is unusual for us in optimization area).

   Does it treat all register classes the same?
On x86 mostly the few fixed registers for some of the integer pipeline
instructions hurt, x86_64 has enough general and FP registers?
It treats them the same (although it is different for different classes 
as they have different number of available regs).  It is always some 
kind of approximation as we use register pressure classes here not the 
classes which will be actually used for RA.  It is even more complicated 
as IRA actually uses dynamic classes (only sets of regs which are 
profitable, e.g. it can be different from classes defined in the target 
file as reg in classes are caller-saved or some specific hard regs are 
used for arg passing). It makes graph coloring better for irregular 
register file architectures.  In whole as I remember, dynamic classes 
gave about 1% improvement even for ppc.


I should say that presence of hard regs in RTL (e.g. for parameter 
passing) is still a challenge for live-range shrinkage and 
register-pressure scheduling.  It should be addressed somehow.




Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Bill Schmidt


On Wed, 2013-11-06 at 10:42 -0600, Bill Schmidt wrote:
 On Wed, 2013-11-06 at 15:01 +0100, Richard Biener wrote:
  On Wed, Nov 6, 2013 at 2:24 PM, Tejas Belagod tbela...@arm.com wrote:
  
   Hi,
  
   The attached patch eliminates moves of the form
  
   set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 
   0]
  
   i.e. eliminates lower lane moves between src and dst where src and dst are
   the same register and this causes rtl to instead use the destination
   register in the required mode.
  
   Also, if my understanding of Big-Endian is correct, this should be safe 
   for
   big-endian targets as well.
  
   I've bootstrapped this on x64_64, regressed on aarch64-none-elf,
   aarch64_be-none-elf.
  
   OK for trunk?
  
  It looks good to me (but I'm also wondering about bigendian).  Bill?
 
 Yes, that should be ok.  I can run a regression test to confirm.

Never mind, I agree with Richard S.  The direction of the ordering is
fine, but an offset should be needed.

Bill

 
 Thanks,
 Bill
 
  
  Can you add a testcase where this has an effect on code generation?
  
  Thanks,
  Richard.
  
   Thanks,
   Tejas Belagod
   ARM.
  
   2013-11-06  Tejas Belagod  tejas.bela...@arm.com
  
   gcc/
   * rtlanal.c (set_noop_p): Return nonzero in case of redundant
   vec_select
   for same src and dst.
   diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
   index 9769b69..3e434cd 100644
   --- a/gcc/rtlanal.c
   +++ b/gcc/rtlanal.c
   @@ -1180,6 +1180,25 @@ set_noop_p (const_rtx set)
  dst = SUBREG_REG (dst);
}
  
   +  /* This is big-endian-safe because the elements are kept in target
   + memory order.  So, for eg. PARALLEL element value of 2 is the same 
   in
   + either endian-ness.  */
   +  if (GET_CODE (src) == VEC_SELECT
   +   REG_P (XEXP (src, 0))  REG_P (dst)
   +   REGNO (XEXP (src, 0)) == REGNO (dst))
   +{
   +  rtx par = XEXP (src, 1);
   +  int i;
   +
   +  for (i = 0; i  XVECLEN (par, 0); i++)
   +   {
   + rtx tem = XVECEXP (par, 0, i);
   + if (!CONST_INT_P (tem) || INTVAL (tem) != i)
   +   return 0;
   +   }
   +  return 1;
   +}
   +
  return (REG_P (src)  REG_P (dst)
  REGNO (src) == REGNO (dst));
}
  



Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Tejas Belagod

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

Richard Sandiford wrote:

Tejas Belagod tbela...@arm.com writes:

+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the same in
+ either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+   REG_P (XEXP (src, 0))  REG_P (dst)
+   REGNO (XEXP (src, 0)) == REGNO (dst))
+{
+  rtx par = XEXP (src, 1);
+  int i;
+
+  for (i = 0; i  XVECLEN (par, 0); i++)
+   {
+ rtx tem = XVECEXP (par, 0, i);
+ if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+   return 0;
+   }
+  return 1;
+}
+

I think for big endian it needs to be:

INTVAL (tem) != i + base

where base is something like:

int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0);

E.g. a big-endian V4HI looks like:

msb  lsb


and shortening it to say V2HI only gives the low 32 bits:

msb  lsb


But, in this case we want

 msb  lsb
 


It depends on whether the result occupies a full register or not.
I was thinking of the case where it didn't, but I realise now you were
thinking of the case where it did.  And yeah, my suggestion doesn't
cope with that...

I was under the impression that the const vector parallel for vec_select 
represents the element indexes of the array in memory order.


Therefore, in bigendian,

  msb lsb
     
element  a[0] a[1] a[2] a[3]

and in littleendian

  msb lsb
     
element  a[3] a[2] a[1] a[0]


Right.  But if an N-bit value is stored in a register, it's assumed to
occupy the lsb of the register and the N-1 bits above that.  The other
bits in the register are don't-care.

E.g., leaving vectors to one side, if you have:

   (set (reg:HI N) (truncate:SI (reg:SI N)))

on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

   msb  lsb
   01234567
   
   4567

rather than:

   msb  lsb
   01234567
   
   0123

for both endiannesses.  The same principle applies to vectors.
The lsb of the register is always assumed to be significant.

So maybe the original patch was correct for partial-register and
full-register results on little-endian, but only for full-register
results on big-endian.


Ah, ok! I think I get it. By eliminating
set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

using the check INTVAL (tem) != i, I'm essentially making subsequent operations 
use (reg:V2DI n) in DI mode which is a partial register result and this gives me 
the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial 
register mode, I have to make sure the correct elements coincide with the lsb in 
big-endian...


Thanks for your input, I'll apply the offset correction for big-endian you 
suggested. I'll respin the patch.


Thanks,
Tejas.



Re: [PATCH]Fix computation of offset in ivopt

2013-11-06 Thread Richard Sandiford
Hi,

bin.cheng bin.ch...@arm.com writes:
 Index: gcc/tree-ssa-loop-ivopts.c
 ===
 --- gcc/tree-ssa-loop-ivopts.c(revision 203267)
 +++ gcc/tree-ssa-loop-ivopts.c(working copy)
 @@ -2037,12 +2037,12 @@ find_interesting_uses (struct ivopts_data *data)
  
  static tree
  strip_offset_1 (tree expr, bool inside_addr, bool top_compref,
 - unsigned HOST_WIDE_INT *offset)
 + HOST_WIDE_INT *offset)
  {
tree op0 = NULL_TREE, op1 = NULL_TREE, tmp, step;
enum tree_code code;
tree type, orig_type = TREE_TYPE (expr);
 -  unsigned HOST_WIDE_INT off0, off1, st;
 +  HOST_WIDE_INT off0, off1, st;
tree orig_expr = expr;
  
STRIP_NOPS (expr);
 @@ -2133,19 +2133,32 @@ strip_offset_1 (tree expr, bool inside_addr, bool
break;
  
  case COMPONENT_REF:
 -  if (!inside_addr)
 - return orig_expr;
 +  {
 + tree field;
  
 -  tmp = component_ref_field_offset (expr);
 -  if (top_compref
 -cst_and_fits_in_hwi (tmp))
 - {
 -   /* Strip the component reference completely.  */
 -   op0 = TREE_OPERAND (expr, 0);
 -   op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
 -   *offset = off0 + int_cst_value (tmp);
 -   return op0;
 - }
 + if (!inside_addr)
 +   return orig_expr;
 +
 + tmp = component_ref_field_offset (expr);
 + field = TREE_OPERAND (expr, 1);
 + if (top_compref
 +  cst_and_fits_in_hwi (tmp)
 +  cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))

While comparing output for wide-int and mainline, I noticed that
this condition is now always false on x86_64, since DECL_FIELD_BIT_OFFSET
is a 128-bit bitsizetype and since cst_and_fits_in_hwi rejects constants
with precision greater than HWI:

  if (TREE_CODE (x) != INTEGER_CST)
return false;

  if (TYPE_PRECISION (TREE_TYPE (x))  HOST_BITS_PER_WIDE_INT)
return false;

Should this be host_integerp (DECL_FIELD_BIT_OFFSET (field), 0) instead?

 +   {
 + HOST_WIDE_INT boffset, abs_off;
 +
 + /* Strip the component reference completely.  */
 + op0 = TREE_OPERAND (expr, 0);
 + op0 = strip_offset_1 (op0, inside_addr, top_compref, off0);
 + boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
 + abs_off = abs_hwi (boffset) / BITS_PER_UNIT;
 + if (boffset  0)
 +   abs_off = -abs_off;
 +
 + *offset = off0 + int_cst_value (tmp) + abs_off;
 + return op0;
 +   }
 +  }
break;

Thanks,
Richard


Re: Re-factor tree.h - Part 1

2013-11-06 Thread Diego Novillo
On Wed, Nov 6, 2013 at 2:04 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Tue, 5 Nov 2013, Diego Novillo wrote:

 This is the first patch in a series of patches to cleanup tree.h to
 reduce the exposure it has all over the compiler.

 In this patch, I'm moving functions that are used once into the files
 that use them, and make them private to that file. These functions
 were declared extern in tree.h and called from exactly one place.


 I am not a big fan of doing it so automatically. For instance
 widest_int_cst_value should imho remain next to int_cst_value since they
 mostly share the same implementation. Doing this also doesn't promote code
 reuse: if I am looking for a function that does some basic operation on
 trees, I won't only need to look in the file that is semantically relevant,

Yeah, that was what I was trying to impose. Functions that
semantically make better sense in the place where they're called from.
I filtered functions that were used once but would not make much sense
to move (for instance, the init functions).

How about this. Below is the list of functions that I took out of
tree.h.  They are either not used or used only once, in which case I
moved them (and their private transitive closure) over to the file
that uses them.  There are more, but these are the ones that I
originally considered to be too specific to the user file to be
globally exposed.  Which ones would you folks keep global?

I have marked some that we may decide to keep extern and one, which I
would not mind to not move at all.  I also marked the functions I
removed and the ones that I just made private to their definition
file:

Moved to builtins.c:
more_const_call_expr_args_p
expand_stack_restore
expand_stack_save

Moved to cfgexpand.c:
expand_main_function
stack_protect_prologue
expand_asm_stmt
expand_computed_goto
expand_goto
expand_return

Moved to cgraphclones.c:
build_function_decl_skip_args

Moved to explow.c:
tree_expr_size (maybe keep extern?)

Moved to expr.c:
fields_length

Moved to fold-const.c:
size_low_cst

Moved to gimple-fold.c:
truth_type_for (maybe keep extern?)

Moved to symtab.c:
decl_assembler_name_hash
decl_assembler_name_equal

Moved to trans-mem.c:
is_tm_safe_or_pure

Moved to tree-eh.c:
in_array_bounds_p
range_in_array_bounds_p

Moved to tree-ssa-dom.c:
iterative_hash_exprs_commutative

Moved to tree-ssa-math-opts.c:
widest_int_cst_value (maybe keep extern?)

Moved to tree-vrp.c:
fold_unary_to_constant (?)

Moved to cp/call.c
ctor_to_vec

Moved to cp/decl.c:
supports_one_only
chain_member

Moved to java/class.c:
build_method_type

Removed (not used anywhere)
build_type_no_quals
omp_remove_redundant_declare_simd_attrs
fold_build3_initializer_loc
real_twop
print_vec_tree
list_equal_p
ssa_name_nonnegative_p
addr_expr_of_non_mem_decl_p
save_vtable_map_decl

Made static (only used in its defining file)
stabilize_reference_1
tree_expr_nonzero_p
tree_invalid_nonnegative_warnv_p
tree_expr_nonzero_warnv_p
fold_builtin_snprintf_chk
validate_arglist
simple_cst_list_equal
lookup_scoped_attribute_spec
get_attribute_namespace
fini_object_sizes


Thanks.  Diego.


Re: Re-factor tree.h - Part 1

2013-11-06 Thread Diego Novillo
On Wed, Nov 6, 2013 at 11:54 AM, Jeff Law l...@redhat.com wrote:
 On 11/06/13 00:04, Marc Glisse wrote:

 On Tue, 5 Nov 2013, Diego Novillo wrote:

 This is the first patch in a series of patches to cleanup tree.h to
 reduce the exposure it has all over the compiler.

 In this patch, I'm moving functions that are used once into the files
 that use them, and make them private to that file. These functions
 were declared extern in tree.h and called from exactly one place.


 I am not a big fan of doing it so automatically. For instance
 widest_int_cst_value should imho remain next to int_cst_value since they
 mostly share the same implementation. Doing this also doesn't promote
 code reuse: if I am looking for a function that does some basic
 operation on trees, I won't only need to look in the file that is
 semantically relevant, I'll also need to randomly grep through possible
 users to see if I should revert that part of your patch. On the other
 hand, most of those functions you move probably are better off in their
 new location, so you can ignore my post.

 What I think makes sense is to review what moved and where/why.  I suspect
 (but have not looked) that much of the movement of code makes sense.  Where
 it doesn't, then obviously we shouldn't make that change.

Ah, yes.  I just sent this list.  Perhaps that's easier to discuss
than the patch, which looks like line noise.


Thanks.  Diego.


[PATCH] Improve VRP assert creation for loops

2013-11-06 Thread Jakub Jelinek
On Tue, Nov 05, 2013 at 02:00:16PM -0800, Cong Hou wrote:
  I'm also curious -- did this code show up in a particular benchmark, if so,
  which one?
 
 I didn't find this problem from any benchmark, but from another
 concern about loop upper bound estimation. Look at the following code:
 
 int foo(unsigned int n, int r)
 {
   int i;
   if (n  0)
 if (n  4)
 {
   do {
  --n;
  r *= 2;
   } while (n  0);
 }
   return r+n;
 }
 
 
 In order to get the upper bound of the loop in this function, GCC
 traverses conditions n4 and n0 separately and tries to get any
 useful information. But as those two conditions cannot be combined
 into one due to this issue (note that n0 will be transformed into
 n!=0), when GCC sees n4, it will consider the possibility that n may
 be equal to 0, in which case the upper bound is UINT_MAX. If those two
 conditions can be combined into one, which is n-1=2, then we can get
 the correct upper bound of the loop.

I've looked at the above testcase to see why we aren't able to determine
the number of iterations upper bound properly here.

That doesn't mean your patch is useless, though I must say I'm far from
being convinced it is safe ATM and also it looks like quite ugly special
case (trying to undo a VRP optimization but only one single specific case).

The first problem is VRP, we end up with:
  bb 5:
  # n_1 = PHI n_5(D)(4), n_7(6)
  # r_3 = PHI r_6(D)(4), r_8(6)
  # RANGE [0, 4294967295]
  n_7 = n_1 + 4294967295;
  # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe
  r_8 = r_3 * 2;
  if (n_7 != 0)
goto bb 6;
  else
goto bb 7;
  
  bb 6:
  goto bb 5;
- missing range on n_1, extremely conservative range on n_7.  With the
attached patch we get instead:
  bb 5:
  # RANGE [1, 3] NONZERO 0x3
  # n_1 = PHI n_5(D)(4), n_7(6)
  # r_3 = PHI r_6(D)(4), r_8(6)
  # RANGE [0, 2] NONZERO 0x3
  n_7 = n_1 + 4294967295;
  # RANGE [-2147483648, 2147483647] NONZERO 0x0fffe
  r_8 = r_3 * 2;
  if (n_7 != 0)
goto bb 6;
  else
goto bb 7;

  bb 6:
  goto bb 5;

The issue is that we use live_on_edge to determine if it is desirable
to added ASSERT_EXPRs, but as we walk bbs in RPO order, we first enter
the loop through the bb with exit edge and thus live of the latch isn't
computed (and, generally the propagation for live ignores dfs back edges
anyway), and because in the above live_on_edge ((5)-(6), n_7) is false,
we don't add ASSERT_EXPR that n_7 is != 0 in the latch bb, so during
iteration we start with n_1 being assumed [1, 3] (that is range of the
assertion from the preceeding conditions on n_5(D)), but in the next
iteration widen it to [0, 3] because we think n_7 can be [0, 2] in the
PHI and thus end up with uselessly wide range because we think the
subtraction can wrap around.  This patch improves live_on_edge for
empty latches, by just looking at the PHIs on loop-header from the
latch - header edge and noting which SSA_NAMEs are used there.

I had to add -fno-tree-vrp to 4 unroll_*.c tests, because they disable
various tree optimizations already and want to test unrolling of loops
iterating exactly twice, but with this VRP change VRP is smart enough
to replace the PHI argument on the i IV from
-  # i_13 = PHI i_8(3), 0(2)
+  # i_13 = PHI 1(3), 0(2)
(the loop iterates exactly twice) and RTL unrolling doesn't do the
tested thing in that case.  But, this should affect only loops that roll
exactly twice and those realy should be unrolled already far before, so IMHO
it isn't something to try to optimize better at the RTL level.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-11-06  Jakub Jelinek  ja...@redhat.com

* tree-vrp.c (live_on_edge): If e-dest is an empty latch
of some loop, but live[e-dest-index] is not computed, look
for SSA_NAMEs used in loop header PHIs from the latch edge.

* gcc.dg/unroll_1.c: Add -fno-tree-vrp to dg-options.
* gcc.dg/unroll_2.c: Likewise.
* gcc.dg/unroll_3.c: Likewise.
* gcc.dg/unroll_4.c: Likewise.

--- gcc/tree-vrp.c.jj   2013-11-06 08:48:01.0 +0100
+++ gcc/tree-vrp.c  2013-11-06 09:32:19.205104029 +0100
@@ -92,6 +92,42 @@ static sbitmap *live;
 static bool
 live_on_edge (edge e, tree name)
 {
+  if (live[e-dest-index] == NULL)
+{
+  /* Handle edges to empty latch blocks.  If NAME appears
+in loop header phis on edges from latch, return true
+and remember those SSA_NAMEs.  */
+  basic_block bb = e-dest;
+  if (bb-loop_father
+  bb-loop_father-latch == bb
+  single_succ_p (bb)
+  empty_block_p (bb))
+   {
+ gimple_stmt_iterator gsi;
+ edge e2 = single_succ_edge (bb);
+ for (gsi = gsi_start_phis (e2-dest);
+  !gsi_end_p (gsi);
+  gsi_next (gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ tree res = gimple_phi_result (phi), arg;
+
+  

[PATCH] Use get_range_info during number of iterations analysis

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 06:13:42PM +0100, Jakub Jelinek wrote:
 I've looked at the above testcase to see why we aren't able to determine
 the number of iterations upper bound properly here.

And here is a patch that uses get_range_info during # of iterations
analysis, so that for your testcase we can actually find out that it has
at most 2 latch executions and at most 3 header executions.

On that testcase, bound_difference is actually called with n_5(D) + 0x
and 0 and we don't have range info for n_5(D), because we had better range
info only for the SSA_NAMEs temporarily created for the assertions.
But with the previously posted patch, we have reasonable range info on the
PHI of the IV at loop-header, and as that PHI has n_5(D) as one of its
arguments (the one from preheader edge), we can still use that range info
when inside of the loop for # of iters purposes, even when it might be
wider range than strictly necessary (but not in this case).
So, the patch looks at both get_range_info of var itself and of all the
results of PHIs on loop-header that have var as PHI arg from the preheader
edge, and ands all the ranges together.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-11-06  Jakub Jelinek  ja...@redhat.com

* tree-ssa-loop-niter.c: Include tree-ssanames.h.
(determine_value_range): Add loop argument.  Use get_range_info to
improve range.
(bound_difference): Adjust caller.

--- gcc/tree-ssa-loop-niter.c.jj2013-10-24 10:19:21.0 +0200
+++ gcc/tree-ssa-loop-niter.c   2013-11-06 14:37:47.445220588 +0100
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.
 #include diagnostic-core.h
 #include tree-inline.h
 #include tree-pass.h
+#include tree-ssanames.h
 
 
 #define SWAP(X, Y) do { affine_iv *tmp = (X); (X) = (Y); (Y) = tmp; } while (0)
@@ -119,9 +120,12 @@ split_to_var_and_offset (tree expr, tree
in TYPE to MIN and MAX.  */
 
 static void
-determine_value_range (tree type, tree var, mpz_t off,
+determine_value_range (struct loop *loop, tree type, tree var, mpz_t off,
   mpz_t min, mpz_t max)
 {
+  double_int minv, maxv;
+  enum value_range_type rtype = VR_VARYING;
+
   /* If the expression is a constant, we know its value exactly.  */
   if (integer_zerop (var))
 {
@@ -130,9 +134,73 @@ determine_value_range (tree type, tree v
   return;
 }
 
+  get_type_static_bounds (type, min, max);
+
+  /* See if we have some range info from VRP.  */
+  if (TREE_CODE (var) == SSA_NAME  INTEGRAL_TYPE_P (type))
+{
+  edge e = loop_preheader_edge (loop);
+  gimple_stmt_iterator gsi;
+
+  /* Either for VAR itself...  */
+  rtype = get_range_info (var, minv, maxv);
+  /* Or for PHI results in loop-header where VAR is used as
+PHI argument from the loop preheader edge.  */
+  for (gsi = gsi_start_phis (loop-header); !gsi_end_p (gsi); gsi_next 
(gsi))
+   {
+ gimple phi = gsi_stmt (gsi);
+ double_int minc, maxc;
+ if (PHI_ARG_DEF_FROM_EDGE (phi, e) == var
+  (get_range_info (gimple_phi_result (phi), minc, maxc)
+ == VR_RANGE))
+   {
+ if (rtype != VR_RANGE)
+   {
+ rtype = VR_RANGE;
+ minv = minc;
+ maxv = maxc;
+   }
+ else
+   {
+ minv = minv.max (minc, TYPE_UNSIGNED (type));
+ maxv = maxv.min (maxc, TYPE_UNSIGNED (type));
+ gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) = 0);
+   }
+   }
+   }
+  if (rtype == VR_RANGE)
+   {
+ mpz_t minm, maxm;
+ gcc_assert (minv.cmp (maxv, TYPE_UNSIGNED (type)) = 0);
+ mpz_init (minm);
+ mpz_init (maxm);
+ mpz_set_double_int (minm, minv, TYPE_UNSIGNED (type));
+ mpz_set_double_int (maxm, maxv, TYPE_UNSIGNED (type));
+ mpz_add (minm, minm, off);
+ mpz_add (maxm, maxm, off);
+ /* If the computation may not wrap or off is zero, then this
+is always fine.  If off is negative and minv + off isn't
+smaller than type's minimum, or off is positive and
+maxv + off isn't bigger than type's maximum, use the more
+precise range too.  */
+ if (nowrap_type_p (type)
+ || mpz_sgn (off) == 0
+ || (mpz_sgn (off)  0  mpz_cmp (minm, min) = 0)
+ || (mpz_sgn (off)  0  mpz_cmp (maxm, max) = 0))
+   {
+ mpz_set (min, minm);
+ mpz_set (max, maxm);
+ mpz_clear (minm);
+ mpz_clear (maxm);
+ return;
+   }
+ mpz_clear (minm);
+ mpz_clear (maxm);
+   }
+}
+
   /* If the computation may wrap, we know nothing about the value, except for
  the range of the type.  */
-  get_type_static_bounds (type, min, max);
   if (!nowrap_type_p (type))
 

[PATCH] Fix PR58697

2013-11-06 Thread Richard Biener

This fixes PR58697 by instead of trying to preserve reference structure
when generating initial loads for the predictive commoned loop, just
builds an approprate MEM_REF.  That now can handle the cases that
previously could be discovered as non-trapping (all constant offsets).

I don't see a better way avoiding the out-of-bounds array access
it currently generates (with preserving the transformation and not
regressing other testcases).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-11-06  Richard Biener  rguent...@suse.de

PR tree-optimization/58653
* tree-predcom.c (ref_at_iteration): Rewrite to generate
a MEM_REF.
(prepare_initializers_chain): Adjust.

* gcc.dg/tree-ssa/predcom-6.c: New testcase.
* gcc.dg/tree-ssa/predcom-7.c: Likewise.

Index: gcc/tree-predcom.c
===
*** gcc/tree-predcom.c  (revision 204396)
--- gcc/tree-predcom.c  (working copy)
*** replace_ref_with (gimple stmt, tree new_
*** 1323,1412 
gsi_insert_after (bsi, new_stmt, GSI_NEW_STMT);
  }
  
! /* Returns the reference to the address of REF in the ITER-th iteration of
!LOOP, or NULL if we fail to determine it (ITER may be negative).  We
!try to preserve the original shape of the reference (not rewrite it
!as an indirect ref to the address), to make tree_could_trap_p in
!prepare_initializers_chain return false more often.  */
  
! static tree
! ref_at_iteration (struct loop *loop, tree ref, int iter)
  {
!   tree idx, *idx_p, type, val, op0 = NULL_TREE, ret;
!   affine_iv iv;
!   bool ok;
! 
!   if (handled_component_p (ref))
! {
!   op0 = ref_at_iteration (loop, TREE_OPERAND (ref, 0), iter);
!   if (!op0)
!   return NULL_TREE;
! }
!   else if (!INDIRECT_REF_P (ref)
!   TREE_CODE (ref) != MEM_REF)
! return unshare_expr (ref);
! 
!   if (TREE_CODE (ref) == MEM_REF)
! {
!   ret = unshare_expr (ref);
!   idx = TREE_OPERAND (ref, 0);
!   idx_p = TREE_OPERAND (ret, 0);
! }
!   else if (TREE_CODE (ref) == COMPONENT_REF)
! {
!   /* Check that the offset is loop invariant.  */
!   if (TREE_OPERAND (ref, 2)
!  !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 2)))
!   return NULL_TREE;
! 
!   return build3 (COMPONENT_REF, TREE_TYPE (ref), op0,
!unshare_expr (TREE_OPERAND (ref, 1)),
!unshare_expr (TREE_OPERAND (ref, 2)));
! }
!   else if (TREE_CODE (ref) == ARRAY_REF)
! {
!   /* Check that the lower bound and the step are loop invariant.  */
!   if (TREE_OPERAND (ref, 2)
!  !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 2)))
!   return NULL_TREE;
!   if (TREE_OPERAND (ref, 3)
!  !expr_invariant_in_loop_p (loop, TREE_OPERAND (ref, 3)))
!   return NULL_TREE;
! 
!   ret = build4 (ARRAY_REF, TREE_TYPE (ref), op0, NULL_TREE,
!   unshare_expr (TREE_OPERAND (ref, 2)),
!   unshare_expr (TREE_OPERAND (ref, 3)));
!   idx = TREE_OPERAND (ref, 1);
!   idx_p = TREE_OPERAND (ret, 1);
! }
!   else
! return NULL_TREE;
! 
!   ok = simple_iv (loop, loop, idx, iv, true);
!   if (!ok)
! return NULL_TREE;
!   iv.base = expand_simple_operations (iv.base);
!   if (integer_zerop (iv.step))
! *idx_p = unshare_expr (iv.base);
else
! {
!   type = TREE_TYPE (iv.base);
!   if (POINTER_TYPE_P (type))
!   {
! val = fold_build2 (MULT_EXPR, sizetype, iv.step,
!size_int (iter));
! val = fold_build_pointer_plus (iv.base, val);
!   }
!   else
!   {
! val = fold_build2 (MULT_EXPR, type, iv.step,
!build_int_cst_type (type, iter));
! val = fold_build2 (PLUS_EXPR, type, iv.base, val);
!   }
!   *idx_p = unshare_expr (val);
! }
! 
!   return ret;
  }
  
  /* Get the initialization expression for the INDEX-th temporary variable
--- 1323,1351 
gsi_insert_after (bsi, new_stmt, GSI_NEW_STMT);
  }
  
! /* Returns a memory reference to DR in the ITER-th iteration of
!the loop it was analyzed in.  Append init stmts to STMTS.  */
  
! static tree 
! ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
  {
!   tree off = DR_OFFSET (dr);
!   tree coff = DR_INIT (dr);
!   if (iter == 0)
! ;
!   else if (TREE_CODE (DR_STEP (dr)) == INTEGER_CST)
! coff = size_binop (PLUS_EXPR, coff,
!  size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
else
! off = size_binop (PLUS_EXPR, off,
! size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
!   tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
!   addr = force_gimple_operand_1 (addr, stmts, is_gimple_mem_ref_addr,
!NULL_TREE);
!   return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)),
! 

Re: [PATCH v2] Fix libgfortran cross compile configury w.r.t newlib

2013-11-06 Thread Marcus Shawcroft
On 24 October 2013 17:47, Steve Ellcey sell...@mips.com wrote:

 I am not sure how we would fix the build issue to allow us to not
 hardcode the newlib configure details into the libgfortran configure
 script.  The linker script that needs to be used to get a good link is
 different depending on what options one is compiling with on a multilib
 MIPS system and I have no idea on how one could create (or use) a dummy
 linker script.  Ideally, I think we would want a check that does not
 depend on linking at all.

 Note that this problem is not just in libgfortran, it affects libstdc++
 and libjava too and those configure scripts also have hardcoded the
 newlib assumptions.  The only difference between libgfortran and the
 other two libraries is that the newlib assumptions for libgfortran are
 not static like they are for libstdc++ and libjava.  They vary (for one
 function) based on whether or not long double is supported.

Exactly, hard wiring the newlib interface into the configury of other
libraries is questionable, but the patch applied to libgfortran goes
beyond hard wiring details of the interface, it hard wires incorrect
details of the interface when sizeof(long double) != sizeof(double).

The argument that the patch is OK because libjava and libstdc++ also
use this idiom, is also rather questionable, because the libgfortran
patch goes beyond the idiom used in those other libraries by hard
wiring an assumption that does not hold universally.

On that basis, I think the the libgfortran patch should be reverted
since it caused a regression.

/Marcus


Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Michael Matz
Hi,

On Wed, 6 Nov 2013, Ilya Enkovich wrote:

  Well, but clearly you can see that bounds for p == NULL are
  useless and thus there is no problem with the transform.
 
  This example just demonstrates the issue. I may use some var's address
  in comparison with the similar result. And value does not always allow
  to determine bounds, because pointers with the same value may have
  different bounds.
 
  Some vars address can be used for better bounds deriving though.
  Well, it clearly shows it's a get me good results issue and not
  a correctness issue.  And for good results you trade in missed
  general optimizations.  That doesn't sound very good to me.
 
 That is definitely a correctness issue.  Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

Of course they can.  They assume unbounded for unknown bounds and emit 
correct code.  It's merely more conservative than you'd like (as the 
function got passed tighter bounds than unbounded), and hence no 
correctness but a QoI problem.


Ciao,
Michael.


Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Richard Sandiford
Tejas Belagod tbela...@arm.com writes:
 Richard Sandiford wrote:
 Tejas Belagod tbela...@arm.com writes:
 Richard Sandiford wrote:
 Tejas Belagod tbela...@arm.com writes:
 +  /* This is big-endian-safe because the elements are kept in target
 + memory order.  So, for eg. PARALLEL element value of 2 is the same 
 in
 + either endian-ness.  */
 +  if (GET_CODE (src) == VEC_SELECT
 +   REG_P (XEXP (src, 0))  REG_P (dst)
 +   REGNO (XEXP (src, 0)) == REGNO (dst))
 +{
 +  rtx par = XEXP (src, 1);
 +  int i;
 +
 +  for (i = 0; i  XVECLEN (par, 0); i++)
 + {
 +   rtx tem = XVECEXP (par, 0, i);
 +   if (!CONST_INT_P (tem) || INTVAL (tem) != i)
 + return 0;
 + }
 +  return 1;
 +}
 +
 I think for big endian it needs to be:

 INTVAL (tem) != i + base

 where base is something like:

 int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 
 0);

 E.g. a big-endian V4HI looks like:

 msb  lsb
 

 and shortening it to say V2HI only gives the low 32 bits:

 msb  lsb
 
 But, in this case we want

  msb  lsb
  
 
 It depends on whether the result occupies a full register or not.
 I was thinking of the case where it didn't, but I realise now you were
 thinking of the case where it did.  And yeah, my suggestion doesn't
 cope with that...
 
 I was under the impression that the const vector parallel for vec_select 
 represents the element indexes of the array in memory order.

 Therefore, in bigendian,

   msb lsb
      
 element  a[0] a[1] a[2] a[3]

 and in littleendian

   msb lsb
      
 element  a[3] a[2] a[1] a[0]
 
 Right.  But if an N-bit value is stored in a register, it's assumed to
 occupy the lsb of the register and the N-1 bits above that.  The other
 bits in the register are don't-care.
 
 E.g., leaving vectors to one side, if you have:
 
(set (reg:HI N) (truncate:SI (reg:SI N)))
 
 on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:
 
msb  lsb
01234567

4567
 
 rather than:
 
msb  lsb
01234567

0123
 
 for both endiannesses.  The same principle applies to vectors.
 The lsb of the register is always assumed to be significant.
 
 So maybe the original patch was correct for partial-register and
 full-register results on little-endian, but only for full-register
 results on big-endian.

 Ah, ok! I think I get it. By eliminating
   set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]

 using the check INTVAL (tem) != i, I'm essentially making subsequent 
 operations 
 use (reg:V2DI n) in DI mode which is a partial register result and this
 gives me
 the wrong set of lanes in bigendian. So, if I want to use (reg n) in partial 
 register mode, I have to make sure the correct elements coincide with
 the lsb in
 big-endian...

 Thanks for your input, I'll apply the offset correction for big-endian you 
 suggested. I'll respin the patch.

Thanks.  Just for avoidance of doubt, the result might be a full or
partial register, depending on the mode and target.  I was trying to
correct myself by agreeing that your original was right and mine was
wrong for big-endian if the result is a full register.

I don't know if there are existing helper functions for this kind of thing.

Richard


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Tom Tromey
 Jakub == Jakub Jelinek ja...@redhat.com writes:

Jakub 1) p debug_tree (0x718b3690)
Jakubnot working anymore, typically one would use this when just
Jakubcut'n'pasting addresses seen in backtrace, other debug_tree
Jakubetc., one now has to add explicitly a cast p debug_tree
Jakub((tree)0x718b3690).  Perhaps this can be cured just by
Jakubadding extra overloads for a few most popular debug_* routines
Jakubthat weren't initially overloaded, add them an overload with
Jakubuintptr_t type or similar that will just cast it to the right
Jakubtype and call the original routine

Last time this came up, we added the feature to gdb.  You have to enable
it, though.  Use set check type off.

Tom


Clean up configure glibc version detection, add --with-glibc-version

2013-11-06 Thread Joseph S. Myers
When bootstrapping a cross toolchain with GCC and glibc, it's
desirable to keep down the number of GCC builds needed: to be able to
build an initial static-only C-only GCC, use that to build glibc, and
then build the full compiler with the resulting glibc.  The aim is
that if glibc were then rebuilt with the full compiler, the results
would be identical to the glibc built with the initial compiler.  (See
https://sourceware.org/ml/libc-alpha/2012-03/msg00960.html for more
on how ideally this might work; really it should only be target
libraries that need rebuilding, not the compiler at all.)

In support of this, I put various changes into glibc 2.17, and other
changes into GCC 4.8 to make static libgcc sufficiently similar in an
initial bootstrap build and a full build so as not to affect the
resulting glibc binaries (see
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01219.html and
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01462.html).  However,
while this solved the problem of stable bootstrapped glibc binaries
with just two GCC builds for ARM, it did not do so for all
architectures.  Specifically, for architectures with
architecture-specific stack-protector code in GCC, the first bootstrap
GCC gets configured thinking glibc lacks the corresponding glibc
support, meaning it is configured in a way incompatible with building
glibc (but compatible with installing glibc headers if you do a
three-compiler bootstrap, with a second compiler built after glibc was
configured with the first compiler to install glibc headers, that
second compiler used to build glibc, and a third compiler built after
that).

That can be worked around by setting gcc_cv_libc_provides_ssp=yes when
configuring GCC, but setting an internal configure variable like that
shouldn't be needed.  Instead, this patch provides a properly
documented configure option --with-glibc-version to use to specify the
minimum glibc version a given compiler will target, for this case and
also for more obscure cases where the autodetection may not work
(e.g. if a sysroot headers suffix is used so the headers aren't found
in the location expected by the configure tests).

Different configure tests detected and used glibc versions in
different ways.  The stack-protector test checked for a glibc version
number in headers, the long-double-128 test checked for target
architecture and __LONG_DOUBLE_MATH_OPTIONAL, and the
gnu-unique-object test used ldd --version and so only worked at all
for native toolchains.  After this patch, the configure script sets up
glibc_version_major and glibc_version_minor variables (from the
configure option if used, or headers if not used), and all those tests
use those variables (in addition to whatever tests not of glibc
version numbers are still relevant - __LONG_DOUBLE_MATH_OPTIONAL is
kept as a fallback in the long-double-128 tests because uClibc defines
it but pretends to be glibc 2.2).  Thus, a documented configure option
can be used to get correct stack-protector support in a bootstrap
compiler, while the fixed gnu-unique-object test works for cross
compilation.  Any future cases where GCC needs to know the target
glibc version number can also use the same configuration logic.

Note that this patch does *not* address the --enable-initfini-array
differences between native and cross compilation.  I believe Thomas
intends to deal with those (in GCC and elsewhere in the toolchain).  I
think there the correct default is to enable the feature by default
for ELF targets (since it's about using a standard feature of ELF) and
then blacklist (if necessary with version number checks) particular
targets found to be broken, without doing anything special for native
tools.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
commit?

2013-11-06  Joseph Myers  jos...@codesourcery.com

* acinclude.m4 (GCC_GLIBC_VERSION_GTE_IFELSE): New configure
macro.
* configure.ac: Determine target_header_dir earlier.
(--with-glibc-version): New configure option.
Use GCC_GLIBC_VERSION_GTE_IFELSE in enable_gnu_unique_object,
gcc_cv_libc_provides_ssp and gcc_cv_target_ldbl128 tests.
* configure: Regenerate.
* doc/install.texi (--enable-gnu-unique-object): Don't refer to
native toolchains for default.
(--with-glibc-version): Document.

Index: gcc/acinclude.m4
===
--- gcc/acinclude.m4(revision 204453)
+++ gcc/acinclude.m4(working copy)
@@ -561,3 +561,18 @@ dnl Make sure that build_exeext is looked for
 AC_DEFUN([gcc_AC_BUILD_EXEEXT], [
 ac_executable_extensions=$build_exeext])
 
+dnl GCC_GLIBC_VERSION_GTE_IFELSE(MAJOR, MINOR, IF-TRUE, IF-FALSE)
+dnl -
+dnl If the target glibc version ($glibc_version_major.$glibc_version_minor)
+dnl is at least MAJOR.MINOR, call IF-TRUE, otherwise call IF-FALSE.
+AC_DEFUN([GCC_GLIBC_VERSION_GTE_IFELSE],

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-06 Thread Ilya Enkovich
2013/11/6 Michael Matz m...@suse.de:
 Hi,

 On Wed, 6 Nov 2013, Ilya Enkovich wrote:

  Well, but clearly you can see that bounds for p == NULL are
  useless and thus there is no problem with the transform.
 
  This example just demonstrates the issue. I may use some var's address
  in comparison with the similar result. And value does not always allow
  to determine bounds, because pointers with the same value may have
  different bounds.
 
  Some vars address can be used for better bounds deriving though.
  Well, it clearly shows it's a get me good results issue and not
  a correctness issue.  And for good results you trade in missed
  general optimizations.  That doesn't sound very good to me.

 That is definitely a correctness issue.  Compiler cannot assume bounds
 by pointer value ignoring bounds passed to the function.

 Of course they can.  They assume unbounded for unknown bounds and emit
 correct code.  It's merely more conservative than you'd like (as the
 function got passed tighter bounds than unbounded), and hence no
 correctness but a QoI problem.

Well, it depends on what we call 'correct' behavior for checker. I
suppose that bound violation caught on -O0 and not caught on -O2 is a
compiler issue. That is why I call it correctness issue. When you
instrument your code you intentionally slow your program down to
improve security. Compiler should have strong reason to behave
contrary. Of course, there should be some trade off. But in this
particular case I do not see why we should make pointer unchecked. If
I just drop this patch and use undefined bounds in all cases arg_bnd
call is optimized, I wont get any performance gain.

Ilya



 Ciao,
 Michael.


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Jeff Law

On 11/06/13 11:04, Tom Tromey wrote:

Jakub == Jakub Jelinek ja...@redhat.com writes:


Jakub 1) p debug_tree (0x718b3690)
Jakubnot working anymore, typically one would use this when just
Jakubcut'n'pasting addresses seen in backtrace, other debug_tree
Jakubetc., one now has to add explicitly a cast p debug_tree
Jakub((tree)0x718b3690).  Perhaps this can be cured just by
Jakubadding extra overloads for a few most popular debug_* routines
Jakubthat weren't initially overloaded, add them an overload with
Jakubuintptr_t type or similar that will just cast it to the right
Jakubtype and call the original routine

Last time this came up, we added the feature to gdb.  You have to enable
it, though.  Use set check type off.
I'd been mildly annoyed by this behaviour, but not so much as to look 
for a solution and I can certainly see why gdb is doing what it's doing. 
 I certainly don't consider this something that should stand in the way 
of moving forward.


ISTM that one liner belongs in GCC's .gdbinit.  Until then, I'm adding 
it to my own :-)





jeff


Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-11-06 Thread Bill Schmidt
On Wed, 2013-11-06 at 17:34 +, Richard Sandiford wrote:
 Tejas Belagod tbela...@arm.com writes:
  Richard Sandiford wrote:
  Tejas Belagod tbela...@arm.com writes:
  Richard Sandiford wrote:
  Tejas Belagod tbela...@arm.com writes:
  +  /* This is big-endian-safe because the elements are kept in target
  + memory order.  So, for eg. PARALLEL element value of 2 is the 
  same in
  + either endian-ness.  */
  +  if (GET_CODE (src) == VEC_SELECT
  +   REG_P (XEXP (src, 0))  REG_P (dst)
  +   REGNO (XEXP (src, 0)) == REGNO (dst))
  +{
  +  rtx par = XEXP (src, 1);
  +  int i;
  +
  +  for (i = 0; i  XVECLEN (par, 0); i++)
  +   {
  + rtx tem = XVECEXP (par, 0, i);
  + if (!CONST_INT_P (tem) || INTVAL (tem) != i)
  +   return 0;
  +   }
  +  return 1;
  +}
  +
  I think for big endian it needs to be:
 
  INTVAL (tem) != i + base
 
  where base is something like:
 
  int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN 
  (par, 0);
 
  E.g. a big-endian V4HI looks like:
 
  msb  lsb
  
 
  and shortening it to say V2HI only gives the low 32 bits:
 
  msb  lsb
  
  But, in this case we want
 
   msb  lsb
   
  
  It depends on whether the result occupies a full register or not.
  I was thinking of the case where it didn't, but I realise now you were
  thinking of the case where it did.  And yeah, my suggestion doesn't
  cope with that...
  
  I was under the impression that the const vector parallel for vec_select 
  represents the element indexes of the array in memory order.
 
  Therefore, in bigendian,
 
msb lsb
   
  element  a[0] a[1] a[2] a[3]
 
  and in littleendian
 
msb lsb
   
  element  a[3] a[2] a[1] a[0]
  
  Right.  But if an N-bit value is stored in a register, it's assumed to
  occupy the lsb of the register and the N-1 bits above that.  The other
  bits in the register are don't-care.
  
  E.g., leaving vectors to one side, if you have:
  
 (set (reg:HI N) (truncate:SI (reg:SI N)))
  
  on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:
  
 msb  lsb
 01234567
 
 4567
  
  rather than:
  
 msb  lsb
 01234567
 
 0123
  
  for both endiannesses.  The same principle applies to vectors.
  The lsb of the register is always assumed to be significant.
  
  So maybe the original patch was correct for partial-register and
  full-register results on little-endian, but only for full-register
  results on big-endian.
 
  Ah, ok! I think I get it. By eliminating
  set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]
 
  using the check INTVAL (tem) != i, I'm essentially making subsequent 
  operations 
  use (reg:V2DI n) in DI mode which is a partial register result and this
  gives me
  the wrong set of lanes in bigendian. So, if I want to use (reg n) in 
  partial 
  register mode, I have to make sure the correct elements coincide with
  the lsb in
  big-endian...
 
  Thanks for your input, I'll apply the offset correction for big-endian you 
  suggested. I'll respin the patch.
 
 Thanks.  Just for avoidance of doubt, the result might be a full or
 partial register, depending on the mode and target.  I was trying to
 correct myself by agreeing that your original was right and mine was
 wrong for big-endian if the result is a full register.
 
 I don't know if there are existing helper functions for this kind of thing.

Tejas, can you please include test cases for both possibilities?  The
existing test suite is not sufficient (your original patch does not
demonstrate regressions on powerpc64 big endian, even though we know
it's not correct).  Thanks!

Bill

 
 Richard
 



Re: [C++ Patch, Java related/RFC] PR 11006

2013-11-06 Thread Jason Merrill

On 11/06/2013 05:42 AM, Andrew Haley wrote:

On 11/04/2013 05:21 PM, Jason Merrill wrote:

Surely it should be valid to allocate a Java boolean type.  Andrew,
how should that work?


It's not allowed.  All objects that are allocated by new must be of
class type (i.e.  instances of a subclass of java.lang.Object), but
boolean is a primitive type.


In that case, Paolo's first patch is OK.

Jason



Initial submission of OpenACC support integrated into OpenMP's lowering and expansion passes

2013-11-06 Thread Thomas Schwinge
Hi!

Here is our take of adding OpenACC support to GCC's front end and middle
end.  What I post here (that is, as replies to this email, to gcc-patches
only) is still very incomplete, but hopefully enough to outline the
general approach, and I'd like to commit these patches to gomp-4_0-branch
-- OK?  Also, may I additionally commit those patches to trunk that don't
have anything specific to OpenACC in them (such as the first two
preparation patches), which I assume will save some work when merging
between trunk and gomp-4_0-branch later on?

Due to the conceptual similarity compared to OpenMP, and that (later) it
is reasonable to expect to embed OpenACC directives into OpenMP ones, the
approach I've chosen is to directly embed OpenACC handling into the
existing OpenMP infrastructure/passes.  I had first begun implementing
separate OpenACC lowering and expansion passes, but then found myself
re-implementing/copying more and more of the existing OpenMP passes'
code, realized that makes no sense, and we shall instead handle these
directly together in one set of passes -- which also happens to
facilitate the interoperation requirement.

For the same reasons, OpenACC's runtime library shall be implemented in
libgomp next to OpenMP's.  Jakub's already approved this approach
generally.  This patch series doesn't contain any substantial rumtime
library work yet; the last patch adds GOACC_parallel, which so far simply
branches to GOMP_target.  There's more to come.

This is in contrast to Samsung's work, who are implementing OpenACC
separately from the existing OpenMP support.  Yet, I hope we'll be able
to share/re-use at least front end and some middle end code.

Our work builds on what has (briefly) been discussed before, for example
at the GNU Tools Cauldron's Accelerator BOF in this summer.  For the
moment, while still getting the concepts clear, I'm adding new
GENERIC/GIMPLE codes and their handling for OpenACC constructs (for the
implementation of OpenACC parallel borrowing from OpenMP target) -- the
long-term goal still is to evolve all these into general acceleration
abstractions, evolve the GCC OpenMP infrastructure and passes into
general acceleration passes.  As discussed, it makes sense to do this
generalization work later on, once we have a better understanding of what
we actually need.

We directly strive for OpenACC 2.0 support, skipping OpenACC 1.  We're
focussing on the C front end implementation first, following on with C++
and Fortran later on.

The patches I send here only contain a bare minimum of documentation
changes.  I already have further documentation changes prepared (source
code comments as well as Texinfo manual updates); will polish and send
these only once the approach we've chosen has generally been approved.

Comments on the approach and/or patches?


Grüße,
 Thomas


pgpYa4_ZiFyzF.pgp
Description: PGP signature


[gomp4 1/9] Add missing include.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

libgomp/
* libgomp_g.h: Include stddef.h for size_t.
---
 libgomp/libgomp_g.h | 1 +
 1 file changed, 1 insertion(+)

diff --git libgomp/libgomp_g.h libgomp/libgomp_g.h
index 32c4cf6..577956a 100644
--- libgomp/libgomp_g.h
+++ libgomp/libgomp_g.h
@@ -29,6 +29,7 @@
 #define LIBGOMP_G_H 1
 
 #include stdbool.h
+#include stddef.h
 
 /* barrier.c */
 
-- 
1.8.1.1



[gomp4 2/9] libgomp: Prepare for testcases without -fopenmp.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

libgomp/
* testsuite/lib/libgomp.exp (libgomp_init): Don't add -fopenmp to
ALWAYS_CFLAGS.
* testsuite/libgomp.c++/c++.exp (ALWAYS_CFLAGS): Add -fopenmp.
* testsuite/libgomp.c/c.exp (ALWAYS_CFLAGS): Likewise.
* testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.
* testsuite/libgomp.graphite/graphite.exp (ALWAYS_CFLAGS):
Likewise.
---
 libgomp/testsuite/lib/libgomp.exp   | 3 ---
 libgomp/testsuite/libgomp.c++/c++.exp   | 3 +++
 libgomp/testsuite/libgomp.c/c.exp   | 3 +++
 libgomp/testsuite/libgomp.fortran/fortran.exp   | 3 +++
 libgomp/testsuite/libgomp.graphite/graphite.exp | 3 +++
 5 files changed, 12 insertions(+), 3 deletions(-)

diff --git libgomp/testsuite/lib/libgomp.exp libgomp/testsuite/lib/libgomp.exp
index d1d8bc8..c965147 100644
--- libgomp/testsuite/lib/libgomp.exp
+++ libgomp/testsuite/lib/libgomp.exp
@@ -169,9 +169,6 @@ proc libgomp_init { args } {
 
 # Disable color diagnostics
 lappend ALWAYS_CFLAGS additional_flags=-fdiagnostics-color=never
-
-# And, gee, turn on OpenMP.
-lappend ALWAYS_CFLAGS additional_flags=-fopenmp
 }
 
 #
diff --git libgomp/testsuite/libgomp.c++/c++.exp 
libgomp/testsuite/libgomp.c++/c++.exp
index b336306..88e017e 100644
--- libgomp/testsuite/libgomp.c++/c++.exp
+++ libgomp/testsuite/libgomp.c++/c++.exp
@@ -11,6 +11,9 @@ set lang_library_path ../libstdc++-v3/src/.libs
 # Initialize dg.
 dg-init
 
+# Turn on OpenMP.
+lappend ALWAYS_CFLAGS additional_flags=-fopenmp
+
 set blddir [lookfor_file [get_multilibs] libgomp]
 
 
diff --git libgomp/testsuite/libgomp.c/c.exp libgomp/testsuite/libgomp.c/c.exp
index 7dfdf8b..8e902d4 100644
--- libgomp/testsuite/libgomp.c/c.exp
+++ libgomp/testsuite/libgomp.c/c.exp
@@ -17,6 +17,9 @@ if ![info exists DEFAULT_CFLAGS] then {
 # Initialize dg.
 dg-init
 
+# Turn on OpenMP.
+lappend ALWAYS_CFLAGS additional_flags=-fopenmp
+
 # Gather a list of all tests.
 set tests [lsort [find $srcdir/$subdir *.c]]
 
diff --git libgomp/testsuite/libgomp.fortran/fortran.exp 
libgomp/testsuite/libgomp.fortran/fortran.exp
index b7fef29..e0bffe3 100644
--- libgomp/testsuite/libgomp.fortran/fortran.exp
+++ libgomp/testsuite/libgomp.fortran/fortran.exp
@@ -15,6 +15,9 @@ set quadmath_library_path ../libquadmath/.libs
 # Initialize dg.
 dg-init
 
+# Turn on OpenMP.
+lappend ALWAYS_CFLAGS additional_flags=-fopenmp
+
 if { $blddir !=  } {
 lappend ALWAYS_CFLAGS additional_flags=-fintrinsic-modules-path=${blddir}
 # Look for a static libgfortran first.
diff --git libgomp/testsuite/libgomp.graphite/graphite.exp 
libgomp/testsuite/libgomp.graphite/graphite.exp
index 08aa509..9129964 100644
--- libgomp/testsuite/libgomp.graphite/graphite.exp
+++ libgomp/testsuite/libgomp.graphite/graphite.exp
@@ -42,6 +42,9 @@ set PARALLEL_CFLAGS -ansi -pedantic-errors -O2 \
 # Initialize `dg'.
 dg-init
 
+# Turn on OpenMP.
+lappend ALWAYS_CFLAGS additional_flags=-fopenmp
+
 # Gather a list of all tests.
 set tests [lsort [find $srcdir/$subdir *.c]]
 
-- 
1.8.1.1



[gomp4 5/9] OpenACC: preprocessor definition, Fortran integer parameter.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Conditionally define _OPENACC.
gcc/fortran/
* cpp.c (cpp_define_builtins): Conditionally define _OPENACC.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-1.c: Test _OPENACC.
* c-c++-common/cpp/openacc-define-2.c: Likewise.
* c-c++-common/cpp/openacc-define-3.c: Likewise.
* gfortran.dg/openacc-define-1.f90: Likewise.
* gfortran.dg/openacc-define-2.f90: Likewise.
* gfortran.dg/openacc-define-3.f90: Likewise.
libgomp/
* openacc.f90 (openacc_version): New integer parameter.
* openacc_lib.h (openacc_version): Likewise.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: New file.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
---
 gcc/c-family/c-cppbuiltin.c  | 3 +++
 gcc/fortran/cpp.c| 3 +++
 gcc/testsuite/c-c++-common/cpp/openacc-define-1.c| 4 
 gcc/testsuite/c-c++-common/cpp/openacc-define-2.c| 4 
 gcc/testsuite/c-c++-common/cpp/openacc-define-3.c| 8 
 gcc/testsuite/gfortran.dg/openacc-define-1.f90   | 4 
 gcc/testsuite/gfortran.dg/openacc-define-2.f90   | 4 
 gcc/testsuite/gfortran.dg/openacc-define-3.f90   | 8 
 libgomp/openacc.f90  | 2 ++
 libgomp/openacc_lib.h| 3 +++
 libgomp/testsuite/libgomp.oacc-fortran/openacc_version-1.f   | 9 +
 libgomp/testsuite/libgomp.oacc-fortran/openacc_version-2.f90 | 9 +
 12 files changed, 61 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/openacc_version-1.f
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/openacc_version-2.f90

diff --git gcc/c-family/c-cppbuiltin.c gcc/c-family/c-cppbuiltin.c
index ed4c82c..d48d96f 100644
--- gcc/c-family/c-cppbuiltin.c
+++ gcc/c-family/c-cppbuiltin.c
@@ -895,6 +895,9 @@ c_cpp_builtins (cpp_reader *pfile)
   else if (flag_stack_protect == 1)
 cpp_define (pfile, __SSP__=1);
 
+  if (flag_openacc)
+cpp_define (pfile, _OPENACC=201306);
+
   if (flag_openmp)
 cpp_define (pfile, _OPENMP=201307);
 
diff --git gcc/fortran/cpp.c gcc/fortran/cpp.c
index ea53681..58f6cc9 100644
--- gcc/fortran/cpp.c
+++ gcc/fortran/cpp.c
@@ -169,6 +169,9 @@ cpp_define_builtins (cpp_reader *pfile)
   cpp_define (pfile, __GFORTRAN__=1);
   cpp_define (pfile, _LANGUAGE_FORTRAN=1);
 
+  if (gfc_option.gfc_flag_openacc)
+cpp_define (pfile, _OPENACC=201306);
+
   if (gfc_option.gfc_flag_openmp)
 cpp_define (pfile, _OPENMP=201107);
 
diff --git gcc/testsuite/c-c++-common/cpp/openacc-define-1.c 
gcc/testsuite/c-c++-common/cpp/openacc-define-1.c
index feaf778..cd37548 100644
--- gcc/testsuite/c-c++-common/cpp/openacc-define-1.c
+++ gcc/testsuite/c-c++-common/cpp/openacc-define-1.c
@@ -1,2 +1,6 @@
 /* { dg-do preprocess } */
 /* { dg-require-effective-target fopenacc } */
+
+#ifdef _OPENACC
+# error _OPENACC defined
+#endif
diff --git gcc/testsuite/c-c++-common/cpp/openacc-define-2.c 
gcc/testsuite/c-c++-common/cpp/openacc-define-2.c
index a2f3e28..b007e32 100644
--- gcc/testsuite/c-c++-common/cpp/openacc-define-2.c
+++ gcc/testsuite/c-c++-common/cpp/openacc-define-2.c
@@ -1,3 +1,7 @@
 /* { dg-options -fno-openacc } */
 /* { dg-do preprocess } */
 /* { dg-require-effective-target fopenacc } */
+
+#ifdef _OPENACC
+# error _OPENACC defined
+#endif
diff --git gcc/testsuite/c-c++-common/cpp/openacc-define-3.c 
gcc/testsuite/c-c++-common/cpp/openacc-define-3.c
index ce270c3..ccedcd9 100644
--- gcc/testsuite/c-c++-common/cpp/openacc-define-3.c
+++ gcc/testsuite/c-c++-common/cpp/openacc-define-3.c
@@ -1,3 +1,11 @@
 /* { dg-options -fopenacc } */
 /* { dg-do preprocess } */
 /* { dg-require-effective-target fopenacc } */
+
+#ifndef _OPENACC
+# error _OPENACC not defined
+#endif
+
+#if _OPENACC != 201306
+# error _OPENACC defined to wrong value
+#endif
diff --git gcc/testsuite/gfortran.dg/openacc-define-1.f90 
gcc/testsuite/gfortran.dg/openacc-define-1.f90
index b961468..42f4073 100644
--- gcc/testsuite/gfortran.dg/openacc-define-1.f90
+++ gcc/testsuite/gfortran.dg/openacc-define-1.f90
@@ -1,3 +1,7 @@
 ! { dg-options -cpp }
 ! { dg-do preprocess }
 ! { dg-require-effective-target fopenacc }
+
+#ifdef _OPENACC
+# error _OPENACC defined
+#endif
diff --git gcc/testsuite/gfortran.dg/openacc-define-2.f90 
gcc/testsuite/gfortran.dg/openacc-define-2.f90
index 49b714d..8ad1bd5 100644
--- gcc/testsuite/gfortran.dg/openacc-define-2.f90
+++ gcc/testsuite/gfortran.dg/openacc-define-2.f90
@@ -1,3 +1,7 @@
 ! { dg-options -cpp -fno-openacc }
 ! { dg-do preprocess }
 ! { dg-require-effective-target fopenacc }
+
+#ifdef _OPENACC
+# error _OPENACC defined
+#endif
diff --git gcc/testsuite/gfortran.dg/openacc-define-3.f90 

[gomp4 7/9] OpenACC: Use OpenMP's lowering and expansion passes.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

gcc/
* gimplify.c (gimplify_body): Consider flag_openacc additionally
to flag_openmp.
* omp-low.c (execute_expand_omp, execute_lower_omp)
(gate_diagnose_omp_blocks): Likewise.
gcc/testsuite/
* gcc.dg/goacc-gomp/goacc-gomp.exp: New file.
* gcc.dg/goacc/goacc.exp: Likewise.
---
 gcc/gimplify.c |  4 +--
 gcc/omp-low.c  | 10 ---
 gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp | 38 ++
 gcc/testsuite/gcc.dg/goacc/goacc.exp   | 37 +
 4 files changed, 83 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp
 create mode 100644 gcc/testsuite/gcc.dg/goacc/goacc.exp

diff --git gcc/gimplify.c gcc/gimplify.c
index 1f18466..30c2b45 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -8803,7 +8803,7 @@ gimplify_body (tree fndecl, bool do_parms)
   gcc_assert (gimplify_ctxp == NULL);
   push_gimplify_context (gctx);
 
-  if (flag_openmp)
+  if (flag_openacc || flag_openmp)
 {
   gcc_assert (gimplify_omp_ctxp == NULL);
   if (lookup_attribute (omp declare target, DECL_ATTRIBUTES (fndecl)))
@@ -8872,7 +8872,7 @@ gimplify_body (tree fndecl, bool do_parms)
   nonlocal_vlas = NULL;
 }
 
-  if (flag_openmp  gimplify_omp_ctxp)
+  if ((flag_openacc || flag_openmp)  gimplify_omp_ctxp)
 {
   delete_omp_context (gimplify_omp_ctxp);
   gimplify_omp_ctxp = NULL;
diff --git gcc/omp-low.c gcc/omp-low.c
index 94058af..99811d0 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -8234,7 +8234,8 @@ execute_expand_omp (void)
 static bool
 gate_expand_omp (void)
 {
-  return (flag_openmp != 0  !seen_error ());
+  return ((flag_openacc || flag_openmp)
+  !seen_error ());
 }
 
 namespace {
@@ -10054,8 +10055,9 @@ execute_lower_omp (void)
   gimple_seq body;
 
   /* This pass always runs, to provide PROP_gimple_lomp.
- But there is nothing to do unless -fopenmp is given.  */
-  if (flag_openmp == 0)
+ But there is nothing to do unless at least one of -fopenacc or -fopenmp is
+ given.  */
+  if (!(flag_openacc || flag_openmp))
 return 0;
 
   all_contexts = splay_tree_new (splay_tree_compare_pointers, 0,
@@ -10484,7 +10486,7 @@ diagnose_omp_structured_block_errors (void)
 static bool
 gate_diagnose_omp_blocks (void)
 {
-  return flag_openmp != 0;
+  return flag_openacc || flag_openmp;
 }
 
 namespace {
diff --git gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp 
gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp
new file mode 100644
index 000..29e9a93
--- /dev/null
+++ gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp
@@ -0,0 +1,38 @@
+# Copyright (C) 2006-2013 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# http://www.gnu.org/licenses/.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+if { ![check_effective_target_fopenacc] \
+ || ![check_effective_target_fopenmp] } {
+  return
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [concat \
+   [find $srcdir/$subdir *.c] \
+   [find $srcdir/c-c++-common/goacc-gomp *.c]]]  -fopenacc -fopenmp
+
+# All done.
+dg-finish
diff --git gcc/testsuite/gcc.dg/goacc/goacc.exp 
gcc/testsuite/gcc.dg/goacc/goacc.exp
new file mode 100644
index 000..1137c99
--- /dev/null
+++ gcc/testsuite/gcc.dg/goacc/goacc.exp
@@ -0,0 +1,37 @@
+# Copyright (C) 2006-2013 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# http://www.gnu.org/licenses/.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+if ![check_effective_target_fopenacc] {
+  return
+}
+
+# Initialize `dg'.
+dg-init
+

[gomp4 3/9] OpenACC: Recognize -fopenacc.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

gcc/c-family/
* c.opt (fopenacc): New option.
gcc/fortran/
* lang.opt (fopenacc): New option.
* invoke.texi (-fopenacc): Document it.
* gfortran.h (gfc_option_t): New member.
* options.c (gfc_init_options, gfc_handle_option): Handle it.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_fopenacc): New
procedure.
gcc/
* doc/invoke.texi (-fopenacc): Document it.
* doc/sourcebuild.texi (fopenacc): Document it.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-1.c: New file.
* c-c++-common/cpp/openacc-define-2.c: Likewise.
* c-c++-common/cpp/openacc-define-3.c: Likewise.
* gfortran.dg/openacc-define-1.f90: Likewise.
* gfortran.dg/openacc-define-2.f90: Likewise.
* gfortran.dg/openacc-define-3.f90: Likewise.
---
 gcc/c-family/c.opt|  4 
 gcc/doc/invoke.texi   | 11 ++-
 gcc/doc/sourcebuild.texi  |  3 +++
 gcc/fortran/gfortran.h|  1 +
 gcc/fortran/invoke.texi   |  7 ++-
 gcc/fortran/lang.opt  |  4 
 gcc/fortran/options.c |  5 +
 gcc/testsuite/c-c++-common/cpp/openacc-define-1.c |  2 ++
 gcc/testsuite/c-c++-common/cpp/openacc-define-2.c |  3 +++
 gcc/testsuite/c-c++-common/cpp/openacc-define-3.c |  3 +++
 gcc/testsuite/gfortran.dg/openacc-define-1.f90|  3 +++
 gcc/testsuite/gfortran.dg/openacc-define-2.f90|  3 +++
 gcc/testsuite/gfortran.dg/openacc-define-3.f90|  3 +++
 gcc/testsuite/lib/target-supports.exp |  9 +
 14 files changed, 59 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/openacc-define-1.c
 create mode 100644 gcc/testsuite/c-c++-common/cpp/openacc-define-2.c
 create mode 100644 gcc/testsuite/c-c++-common/cpp/openacc-define-3.c
 create mode 100644 gcc/testsuite/gfortran.dg/openacc-define-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/openacc-define-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/openacc-define-3.f90

diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index b862eb9..d86d79b 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -1065,6 +1065,10 @@ fobjc-std=objc1
 ObjC ObjC++ Var(flag_objc1_only)
 Conform to the Objective-C 1.0 language as implemented in GCC 4.0
 
+fopenacc
+C ObjC C++ ObjC++ Var(flag_openacc)
+Enable OpenACC
+
 fopenmp
 C ObjC C++ ObjC++ Var(flag_openmp)
 Enable OpenMP (implies -frecursive in Fortran)
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index e84bca3..e393139 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -168,7 +168,8 @@ in the following sections.
 @gccoptlist{-ansi  -std=@var{standard}  -fgnu89-inline @gol
 -aux-info @var{filename} -fallow-parameterless-variadic-functions @gol
 -fno-asm  -fno-builtin  -fno-builtin-@var{function} @gol
--fhosted  -ffreestanding -fopenmp -fms-extensions -fplan9-extensions @gol
+-fhosted  -ffreestanding -fopenacc -fopenmp -fms-extensions @gol
+-fplan9-extensions @gol
 -trigraphs  -traditional  -traditional-cpp @gol
 -fallow-single-precision  -fcond-mismatch -flax-vector-conversions @gol
 -fsigned-bitfields  -fsigned-char @gol
@@ -1831,6 +1832,14 @@ This is equivalent to @option{-fno-hosted}.
 @xref{Standards,,Language Standards Supported by GCC}, for details of
 freestanding and hosted environments.
 
+@item -fopenacc
+@opindex fopenacc
+@cindex OpenACC accelerator programming
+Enable handling of OpenACC.
+When @option{-fopenacc} is specified, the
+compiler generates accelerated code according to the OpenACC Application
+Programming Interface v2.0 @w{@uref{http://www.openacc.org/}}.
+
 @item -fopenmp
 @opindex fopenmp
 @cindex OpenMP parallel
diff --git gcc/doc/sourcebuild.texi gcc/doc/sourcebuild.texi
index 1a70916..8b0031c 100644
--- gcc/doc/sourcebuild.texi
+++ gcc/doc/sourcebuild.texi
@@ -1787,6 +1787,9 @@ Target supports Graphite optimizations.
 @item fixed_point
 Target supports fixed-point extension to C.
 
+@item fopenacc
+Target supports OpenACC via @option{-fopenacc}.
+
 @item fopenmp
 Target supports OpenMP via @option{-fopenmp}.
 
diff --git gcc/fortran/gfortran.h gcc/fortran/gfortran.h
index b28edd8..5089691 100644
--- gcc/fortran/gfortran.h
+++ gcc/fortran/gfortran.h
@@ -2285,6 +2285,7 @@ typedef struct
   int blas_matmul_limit;
   int flag_cray_pointer;
   int flag_d_lines;
+  int gfc_flag_openacc;
   int gfc_flag_openmp;
   int flag_sign_zero;
   int flag_stack_arrays;
diff --git gcc/fortran/invoke.texi gcc/fortran/invoke.texi
index eb678d1..46fca59 100644
--- gcc/fortran/invoke.texi
+++ gcc/fortran/invoke.texi
@@ -120,7 +120,7 @@ by type.  Explanations are in the following sections.
 -ffixed-line-length-none -ffree-form -ffree-line-length-@var{n} @gol
 -ffree-line-length-none -fimplicit-none 

[PATCH, i386]: Fix PR 59021, new vzeroupper instructions generated with -mavx

2013-11-06 Thread Uros Bizjak
Hello!

Attached patch fixes PR 59021, where new vzeroupper instructions are
generated for -mavx after Vlad's useless insn removal patch.

The problem was, that we depent on (useless) moves to AVX256 function
argument registers in front of the function call to switch the mode to
DIRTY mode. This is not true anymore, so call_insn has to switch to
DIRTY mode by itself.

While looking at this area - always set the mode after the call_insn
in MODE_AFTER function.

2013-11-06  Uros Bizjak  ubiz...@gmail.com

PR target/59021
* config/i386/i386.c (ix86_avx_u128_mode_needed): Require
AVX_U128_DIRTY mode for call_insn RTXes that use AVX256 registers.
(ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY mode for call_insn
RTXes that return in AVX256 register.

testsuite/ChangeLog:

2013-11-06  Uros Bizjak  ubiz...@gmail.com

PR target/59021
* gcc.target/i386/pr59021.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}, configured with --with-arch=core-avx-i --with-cpu=core-avx-i
--with-fpmath=avx.

The patch will be backported to 4.7 and 4.8 branch in a couple of days.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 204459)
+++ config/i386/i386.c  (working copy)
@@ -15708,7 +15708,7 @@ ix86_avx_u128_mode_needed (rtx insn)
  rtx arg = XEXP (XEXP (link, 0), 0);
 
  if (ix86_check_avx256_register (arg, NULL))
-   return AVX_U128_ANY;
+   return AVX_U128_DIRTY;
}
}
 
@@ -15828,8 +15828,8 @@ ix86_avx_u128_mode_after (int mode, rtx insn)
 {
   bool avx_reg256_found = false;
   note_stores (pat, ix86_check_avx256_stores, avx_reg256_found);
-  if (!avx_reg256_found)
-   return AVX_U128_CLEAN;
+
+  return avx_reg256_found ? AVX_U128_DIRTY : AVX_U128_CLEAN;
 }
 
   /* Otherwise, return current mode.  Remember that if insn
Index: testsuite/gcc.target/i386/pr59021.c
===
--- testsuite/gcc.target/i386/pr59021.c (revision 0)
+++ testsuite/gcc.target/i386/pr59021.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -mavx -mvzeroupper } */
+
+extern void abort (void);
+
+struct S {
+  int i1;
+  int i2;
+  int i3;
+};
+
+typedef double v4df  __attribute__ ((vector_size (32)));
+
+extern int foo (v4df, int i1, int i2, int i3, int i4, int i5, struct S s);
+
+void bar (v4df v, struct S s)
+{
+  int r = foo (v, 1, 2, 3, 4, 5, s);
+  if (r)
+abort ();
+}
+
+/* { dg-final { scan-assembler-not vzeroupper } } */


Committed patch implementing live range shrinkage.

2013-11-06 Thread Vladimir Makarov

  Here is the final version of the patch I committed as rev.204465.

  It takes Jeff's proposal into account.

  The patch was successfully bootstrapped with -flive-range-shrinkage 
on x86 and x86-64.


2013-11-06  Vladimir Makarov  vmaka...@redhat.com

* tree-pass.h (make_pass_live_range_shrinkage): New external.
* timevar.def (TV_LIVE_RANGE_SHRINKAGE): New.
* sched-rgn.c (gate_handle_live_range_shrinkage): New.
(rest_of_handle_live_range_shrinkage): Ditto
(class pass_live_range_shrinkage): Ditto.
(pass_data_live_range_shrinkage): Ditto.
(make_pass_live_range_shrinkage): Ditto.
* sched-int.h (initialize_live_range_shrinkage): New prototype.
(finish_live_range_shrinkage): Ditto.
* sched-deps.c (create_insn_reg_set): Make void return value.
* passes.def: Add pass_live_range_shrinkage.
* ira.c (update_equiv_regs): Don't move if
flag_live_range_shrinkage.
* haifa-sched.c (live_range_shrinkage_p): New.
(initialize_live_range_shrinkage, finish_live_range_shrinkage):
New functions.
(rank_for_schedule): Add code for pressure relief through live
range shrinkage.
(schedule_insn): Print more debug info.
(sched_init): Setup SCHED_PRESSURE_WEIGHTED for pressure relief
through live range shrinkage.
* doc/invoke.texi (-flive-range-shrinkage): New.
* common.opt (flive-range-shrinkage): New.

Index: common.opt
===
--- common.opt  (revision 204380)
+++ common.opt  (working copy)
@@ -1738,6 +1738,10 @@ fregmove
 Common Ignore
 Does nothing. Preserved for backward compatibility.
 
+flive-range-shrinkage
+Common Report Var(flag_live_range_shrinkage) Init(0) Optimization
+Relief of register pressure through live range shrinkage
+
 frename-registers
 Common Report Var(flag_rename_registers) Init(2) Optimization
 Perform a register renaming optimization pass
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 204216)
+++ doc/invoke.texi (working copy)
@@ -378,7 +378,7 @@ Objective-C and Objective-C++ Dialects}.
 -fira-region=@var{region} -fira-hoist-pressure @gol
 -fira-loop-pressure -fno-ira-share-save-slots @gol
 -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
--fivopts -fkeep-inline-functions -fkeep-static-consts @gol
+-fivopts -fkeep-inline-functions -fkeep-static-consts -flive-range-shrinkage 
@gol
 -floop-block -floop-interchange -floop-strip-mine -floop-nest-optimize @gol
 -floop-parallelize-all -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -flto-report-wpa -fmerge-all-constants 
@gol
@@ -7257,6 +7257,12 @@ registers after writing to their lower 3
 
 Enabled for x86 at levels @option{-O2}, @option{-O3}.
 
+@item -flive-range-shrinkage
+@opindex flive-range-shrinkage
+Attempt to decrease register pressure through register live range
+shrinkage.  This is helpful for fast processors with small or moderate
+size register sets.
+
 @item -fira-algorithm=@var{algorithm}
 Use the specified coloring algorithm for the integrated register
 allocator.  The @var{algorithm} argument can be @samp{priority}, which
Index: haifa-sched.c
===
--- haifa-sched.c   (revision 204380)
+++ haifa-sched.c   (working copy)
@@ -150,6 +150,24 @@ along with GCC; see the file COPYING3.
 
 #ifdef INSN_SCHEDULING
 
+/* True if we do register pressure relief through live-range
+   shrinkage.  */
+static bool live_range_shrinkage_p;
+
+/* Switch on live range shrinkage.  */
+void
+initialize_live_range_shrinkage (void)
+{
+  live_range_shrinkage_p = true;
+}
+
+/* Switch off live range shrinkage.  */
+void
+finish_live_range_shrinkage (void)
+{
+  live_range_shrinkage_p = false;
+}
+
 /* issue_rate is the number of insns that can be scheduled in the same
machine cycle.  It can be defined in the config/mach/mach.h file,
otherwise we set it to 1.  */
@@ -2519,7 +2537,7 @@ rank_for_schedule (const void *x, const
   rtx tmp = *(const rtx *) y;
   rtx tmp2 = *(const rtx *) x;
   int tmp_class, tmp2_class;
-  int val, priority_val, info_val;
+  int val, priority_val, info_val, diff;
 
   if (MAY_HAVE_DEBUG_INSNS)
 {
@@ -2532,6 +2550,22 @@ rank_for_schedule (const void *x, const
return INSN_LUID (tmp) - INSN_LUID (tmp2);
 }
 
+  if (live_range_shrinkage_p)
+{
+  /* Don't use SCHED_PRESSURE_MODEL -- it results in much worse
+code.  */
+  gcc_assert (sched_pressure == SCHED_PRESSURE_WEIGHTED);
+  if ((INSN_REG_PRESSURE_EXCESS_COST_CHANGE (tmp)  0
+  || INSN_REG_PRESSURE_EXCESS_COST_CHANGE (tmp2)  0)
+  (diff = (INSN_REG_PRESSURE_EXCESS_COST_CHANGE (tmp)
+ - INSN_REG_PRESSURE_EXCESS_COST_CHANGE (tmp2))) != 0)
+   return diff;
+  /* Sort by INSN_LUID 

Re: [gomp4 9/9] OpenACC: Basic support for #pragma acc parallel.

2013-11-06 Thread Jakub Jelinek
On Wed, Nov 06, 2013 at 08:42:23PM +0100, tho...@codesourcery.com wrote:
 +#define OACC_PARALLEL_CLAUSE_MASK\
 + PRAGMA_OMP_CLAUSE_NONE

This is incorrect, either you should define in c-common.h
also OMP_CLAUSE_MASK_0 and define this to it, or
it needs to be (OMP_CLAUSE_MASK_1  PRAGMA_OMP_CLAUSE_NONE).
Otherwise it won't compile on 32-bit HWI targets where omp_clause_mask
is a C++ class.

Jakub


[gomp4 8/9] OpenACC: Basic support for #pragma acc in the C front end.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

gcc/c-family/
* c-pragma.c (oacc_pragmas): New array.
(c_pp_lookup_pragma, init_pragma): Handle it.
gcc/
* doc/invoke.texi (-fopenacc): Update.

gcc/c/
* c-parser.c (c_parser_omp_all_clauses): Make a parser error
message suitable for OpenACC, too.
gcc/cp/
* parser.c (cp_parser_omp_all_clauses): Make a parser error
message suitable for OpenACC, too.
---
 gcc/c-family/c-pragma.c | 22 ++
 gcc/c/c-parser.c|  2 +-
 gcc/cp/parser.c |  2 +-
 gcc/doc/invoke.texi |  2 +-
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git gcc/c-family/c-pragma.c gcc/c-family/c-pragma.c
index 3ce77a2..98f98d0 100644
--- gcc/c-family/c-pragma.c
+++ gcc/c-family/c-pragma.c
@@ -1164,6 +1164,8 @@ typedef struct
 static vecpragma_ns_name registered_pp_pragmas;
 
 struct omp_pragma_def { const char *name; unsigned int id; };
+static const struct omp_pragma_def oacc_pragmas[] = {
+};
 static const struct omp_pragma_def omp_pragmas[] = {
   { atomic, PRAGMA_OMP_ATOMIC },
   { barrier, PRAGMA_OMP_BARRIER },
@@ -1194,9 +1196,18 @@ static const struct omp_pragma_def omp_pragmas[] = {
 void
 c_pp_lookup_pragma (unsigned int id, const char **space, const char **name)
 {
+  const int n_oacc_pragmas = sizeof (oacc_pragmas) / sizeof (*oacc_pragmas);
   const int n_omp_pragmas = sizeof (omp_pragmas) / sizeof (*omp_pragmas);
   int i;
 
+  for (i = 0; i  n_oacc_pragmas; ++i)
+if (oacc_pragmas[i].id == id)
+  {
+   *space = acc;
+   *name = oacc_pragmas[i].name;
+   return;
+  }
+
   for (i = 0; i  n_omp_pragmas; ++i)
 if (omp_pragmas[i].id == id)
   {
@@ -1348,6 +1359,17 @@ c_invoke_pragma_handler (unsigned int id)
 void
 init_pragma (void)
 {
+  if (flag_openacc)
+{
+  const int n_oacc_pragmas
+   = sizeof (oacc_pragmas) / sizeof (*oacc_pragmas);
+  int i;
+
+  for (i = 0; i  n_oacc_pragmas; ++i)
+   cpp_register_deferred_pragma (parse_in, acc, oacc_pragmas[i].name,
+ oacc_pragmas[i].id, true, true);
+}
+
   if (flag_openmp)
 {
   const int n_omp_pragmas = sizeof (omp_pragmas) / sizeof (*omp_pragmas);
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index a8f4774..8a1e988 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -10730,7 +10730,7 @@ c_parser_omp_all_clauses (c_parser *parser, 
omp_clause_mask mask,
  c_name = simdlen;
  break;
default:
- c_parser_error (parser, expected %#pragma omp% clause);
+ c_parser_error (parser, expected clause);
  goto saw_error;
}
 
diff --git gcc/cp/parser.c gcc/cp/parser.c
index bbc8e75..c3345ee 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -27911,7 +27911,7 @@ cp_parser_omp_all_clauses (cp_parser *parser, 
omp_clause_mask mask,
  c_name = simdlen;
  break;
default:
- cp_parser_error (parser, expected %#pragma omp% clause);
+ cp_parser_error (parser, expected clause);
  goto saw_error;
}
 
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index af8973a..cc4a6da 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -1835,7 +1835,7 @@ freestanding and hosted environments.
 @item -fopenacc
 @opindex fopenacc
 @cindex OpenACC accelerator programming
-Enable handling of OpenACC.
+Enable handling of OpenACC directives @code{#pragma acc} in C.
 When @option{-fopenacc} is specified, the
 compiler generates accelerated code according to the OpenACC Application
 Programming Interface v2.0 @w{@uref{http://www.openacc.org/}}.  This option
-- 
1.8.1.1



[gomp4 6/9] OpenACC: Infrastructure for builtins.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

gcc/
* oacc-builtins.def: New file.
* Makefile.in (BUILTINS_DEF): Add oacc-builtins.def.
* builtins.def (DEF_GOACC_BUILTIN): New macro.
Include oacc-builtins.def.
gcc/fortran/
* f95-lang.c (DEF_GOACC_BUILTIN): New macro.
Include ../oacc-builtins.def.
libgomp/
* libgomp.map (GOACC_2.0): New symbol version.
---
 gcc/Makefile.in|  3 ++-
 gcc/builtins.def   | 10 ++
 gcc/fortran/f95-lang.c | 10 ++
 gcc/oacc-builtins.def  | 28 
 libgomp/libgomp.map|  3 +++
 5 files changed, 53 insertions(+), 1 deletion(-)
 create mode 100644 gcc/oacc-builtins.def

diff --git gcc/Makefile.in gcc/Makefile.in
index cc88fb8..0511097 100644
--- gcc/Makefile.in
+++ gcc/Makefile.in
@@ -871,7 +871,8 @@ FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h
 RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h
 READ_MD_H = $(OBSTACK_H) $(HASHTAB_H) read-md.h
 PARAMS_H = params.h params.def
-BUILTINS_DEF = builtins.def sync-builtins.def omp-builtins.def \
+BUILTINS_DEF = builtins.def sync-builtins.def \
+   oacc-builtins.def omp-builtins.def \
gtm-builtins.def sanitizer.def cilkplus.def cilk-builtins.def
 INTERNAL_FN_DEF = internal-fn.def
 INTERNAL_FN_H = internal-fn.h $(INTERNAL_FN_DEF)
diff --git gcc/builtins.def gcc/builtins.def
index e2d8849..9a9a20a 100644
--- gcc/builtins.def
+++ gcc/builtins.def
@@ -139,6 +139,13 @@ along with GCC; see the file COPYING3.  If not see
   DEF_BUILTIN (ENUM, NAME, BUILT_IN_NORMAL, BT_LAST, BT_LAST, false, false, \
   false, ATTR_LAST, false, false)
 
+/* Builtin used by the implementation of GNU OpenACC.  None of these are
+   actually implemented in the compiler; they're all in libgomp.  */
+#undef DEF_GOACC_BUILTIN
+#define DEF_GOACC_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
+  DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
+   false, true, true, ATTRS, false, flag_openacc)
+
 /* Builtin used by the implementation of GNU OpenMP.  None of these are
actually implemented in the compiler; they're all in libgomp.  */
 #undef DEF_GOMP_BUILTIN
@@ -856,6 +863,9 @@ DEF_GCC_BUILTIN (BUILT_IN_LINE, LINE, BT_FN_INT, 
ATTR_NOTHROW_LEAF_LIST)
 /* Synchronization Primitives.  */
 #include sync-builtins.def
 
+/* OpenACC builtins.  */
+#include oacc-builtins.def
+
 /* OpenMP builtins.  */
 #include omp-builtins.def
 
diff --git gcc/fortran/f95-lang.c gcc/fortran/f95-lang.c
index 873c137..69012b6 100644
--- gcc/fortran/f95-lang.c
+++ gcc/fortran/f95-lang.c
@@ -1035,6 +1035,16 @@ gfc_init_builtin_functions (void)
 #include ../sync-builtins.def
 #undef DEF_SYNC_BUILTIN
 
+  if (gfc_option.gfc_flag_openacc)
+{
+#undef DEF_GOACC_BUILTIN
+#define DEF_GOACC_BUILTIN(code, name, type, attr) \
+  gfc_define_builtin (__builtin_ name, builtin_types[type], \
+ code, name, attr);
+#include ../oacc-builtins.def
+#undef DEF_GOACC_BUILTIN
+}
+
   if (gfc_option.gfc_flag_openmp || flag_tree_parallelize_loops)
 {
 #undef DEF_GOMP_BUILTIN
diff --git gcc/oacc-builtins.def gcc/oacc-builtins.def
new file mode 100644
index 000..fd630e0
--- /dev/null
+++ gcc/oacc-builtins.def
@@ -0,0 +1,28 @@
+/* This file contains the definitions and documentation for the
+   OpenACC builtins used in the GNU compiler.
+
+   Copyright (C) 2013 Free Software Foundation, Inc.
+
+   Contributed by Thomas Schwinge tho...@codesourcery.com.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+/* Before including this file, you should define a macro:
+
+ DEF_GOACC_BUILTIN (ENUM, NAME, TYPE, ATTRS)
+
+   See builtins.def for details.  */
diff --git libgomp/libgomp.map libgomp/libgomp.map
index 4f87d00..f094ed2 100644
--- libgomp/libgomp.map
+++ libgomp/libgomp.map
@@ -230,3 +230,6 @@ GOMP_4.0 {
 
 OACC_2.0 {
 };
+
+GOACC_2.0 {
+};
-- 
1.8.1.1



[gomp4 4/9] OpenACC: The runtime library will be implemented in libgomp, too.

2013-11-06 Thread thomas
From: Thomas Schwinge tho...@codesourcery.com

gcc/
* gcc.c (LINK_COMMAND_SPEC, GOMP_SELF_SPECS): For -fopenacc, link
to libgomp and its dependencies.
* config/arc/arc.h (LINK_COMMAND_SPEC): Likewise.
* config/darwin.h (LINK_COMMAND_SPEC_A): Likewise.
* config/i386/mingw32.h (GOMP_SELF_SPECS): Likewise.
* config/ia64/hpux.h (LIB_SPEC): Likewise.
* config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
* config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
* doc/invoke.texi (-fopenacc): Update.
libgomp/
* libgomp.map (OACC_2.0): New symbol version.
* libgomp.spec.in: Update comment.
* configure.ac: Likewise.
* configure: Regenerate.
* Makefile.am (nodist_libsubinclude_HEADERS): Add openacc.h.
(nodist_finclude_HEADERS): Add openacc_lib.h, openacc.f90,
openacc.mod, and openacc_kinds.mod.
(openacc_kinds.mod): New target.
(%.mod): New target, generalized from omp_lib.mod.
* Makefile.in: Regenerate.
* openacc.f90: New file.
* openacc.h: Likewise.
* openacc_lib.h: Likewise.
* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-c/lib-1.c: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
* testsuite/libgomp.oacc-fortran/lib-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/lib-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/lib-3.f: Likewise.
---
 gcc/config/arc/arc.h   |  2 +-
 gcc/config/darwin.h|  2 +-
 gcc/config/i386/mingw32.h  |  2 +-
 gcc/config/ia64/hpux.h |  2 +-
 gcc/config/pa/pa-hpux11.h  |  2 +-
 gcc/config/pa/pa64-hpux.h  | 12 ++--
 gcc/doc/invoke.texi|  4 +-
 gcc/gcc.c  |  7 ++-
 libgomp/Makefile.am|  9 ++-
 libgomp/Makefile.in| 10 +++-
 libgomp/configure  |  2 +-
 libgomp/configure.ac   |  2 +-
 libgomp/libgomp.map|  3 +
 libgomp/libgomp.spec.in|  2 +-
 libgomp/openacc.f90| 37 
 libgomp/openacc.h  | 45 ++
 libgomp/openacc_lib.h  | 26 
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 66 +
 libgomp/testsuite/libgomp.oacc-c/c.exp | 36 +++
 libgomp/testsuite/libgomp.oacc-c/lib-1.c   |  7 +++
 libgomp/testsuite/libgomp.oacc-fortran/fortran.exp | 69 ++
 libgomp/testsuite/libgomp.oacc-fortran/lib-1.f90   |  3 +
 libgomp/testsuite/libgomp.oacc-fortran/lib-2.f |  3 +
 libgomp/testsuite/libgomp.oacc-fortran/lib-3.f |  3 +
 24 files changed, 332 insertions(+), 24 deletions(-)
 create mode 100644 libgomp/openacc.f90
 create mode 100644 libgomp/openacc.h
 create mode 100644 libgomp/openacc_lib.h
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/c++.exp
 create mode 100644 libgomp/testsuite/libgomp.oacc-c/c.exp
 create mode 100644 libgomp/testsuite/libgomp.oacc-c/lib-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/lib-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/lib-2.f
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/lib-3.f

diff --git gcc/config/arc/arc.h gcc/config/arc/arc.h
index 637f7b6..14fc717 100644
--- gcc/config/arc/arc.h
+++ gcc/config/arc/arc.h
@@ -174,7 +174,7 @@ along with GCC; see the file COPYING3.  If not see
 %(linker) %l  LINK_PIE_SPEC %X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\
 %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
 %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
-%{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
+%{fopenacc|fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\
 %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
 %{!A:%{!nostdlib:%{!nostartfiles:%E}}} %{T*} }}
diff --git gcc/config/darwin.h gcc/config/darwin.h
index 596c9ef..735b6b9 100644
--- gcc/config/darwin.h
+++ gcc/config/darwin.h
@@ -176,7 +176,7 @@ extern GTY(()) int darwin_ms_struct;
 %{o*}%{!o:-o a.out} \
 %{!nostdlib:%{!nostartfiles:%S}} \
 %{L*} %(link_libgcc) %o 
%{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
-%{fopenmp|ftree-parallelize-loops=*: \
+%{fopenacc|fopenmp|ftree-parallelize-loops=*: \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; 
: -lgomp } } \
 %{fgnu-tm: \
 

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Tom Tromey
 Jeff == Jeff Law l...@redhat.com writes:

Jeff ISTM that one liner belongs in GCC's .gdbinit.  Until then, I'm adding
Jeff it to my own :-)

Yeah, I think that would be reasonable.
It seems like it isn't appropriate in many cases, so we left it off by
default.

Tom


Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Jeff Law

On 11/06/13 13:01, Tom Tromey wrote:

Jeff == Jeff Law l...@redhat.com writes:


Jeff ISTM that one liner belongs in GCC's .gdbinit.  Until then, I'm adding
Jeff it to my own :-)

Yeah, I think that would be reasonable.
It seems like it isn't appropriate in many cases, so we left it off by
default.

Here's the patch I installed for GCC.   Thanks for the suggestion!

Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index dfaf4e3..43ebbda 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-11-06  Jeff Law  l...@redhat.com
+   Tom Tromey  tro...@redhat.com
+
+   * gdbinit.in: Disable strict type checking.
+
 2013-11-06  Vladimir Makarov  vmaka...@redhat.com
 
* tree-pass.h (make_pass_live_range_shrinkage): New external.
diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
index 503ef24..c60cab1 100644
--- a/gcc/gdbinit.in
+++ b/gcc/gdbinit.in
@@ -205,6 +205,11 @@ set complaints 0
 b exit
 b abort
 
+# Disable strict type checking.  This allows developers to (for example)
+# make inferior calls without casting absolute address to a suitable
+# pointer type.
+set check type off
+
 # Skip all inline functions in tree.h.
 # These are used in accessor macros.
 # Note that this is added at the end because older gdb versions


PATCHES: [4.9 Regression] libsanitizer doesn't build for x32

2013-11-06 Thread H.J. Lu
These 4 patches fix libsanitizer build for x32.  They will be checked
into upstream:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59018#c4

In the meantime, OK to install to unblock x32 build?

Thanks.


H.J.
From de59975f158501bc99b345c02687944a49c9d31c Mon Sep 17 00:00:00 2001
From: H.J. Lu hjl.to...@gmail.com
Date: Tue, 5 Nov 2013 05:40:48 -0800
Subject: [PATCH 1/4] Cast pointers to uptr for 64-bit syscalls

---
 libsanitizer/ChangeLog.x32   | 20 +
 libsanitizer/sanitizer_common/sanitizer_linux.cc | 38 
 2 files changed, 39 insertions(+), 19 deletions(-)
 create mode 100644 libsanitizer/ChangeLog.x32

diff --git a/libsanitizer/ChangeLog.x32 b/libsanitizer/ChangeLog.x32
new file mode 100644
index 000..7b54005
--- /dev/null
+++ b/libsanitizer/ChangeLog.x32
@@ -0,0 +1,20 @@
+2013-11-05  H.J. Lu  hongjiu...@intel.com
+
+   * sanitizer_common/sanitizer_linux.cc (internal_mmap): Cast
+   pointers to uptr for 64-bit syscalls.
+   (internal_munmap): Likewise.
+   (internal_open): Likewise.
+   (internal_read): Likewise.
+   (internal_write): Likewise.
+   (internal_stat): Likewise.
+   (internal_lstat): Likewise.
+   (internal_fstat): Likewise.
+   (internal_readlink): Likewise.
+   (internal_unlink): Likewise.
+   (internal_execve): Likewise.
+   (NanoTime): Likewise.
+   (BlockingMutex::Lock): Likewise.
+   (BlockingMutex::Unlock): Likewise.
+   (internal_ptrace): Likewise.
+   (internal_getdents): Likewise.
+   (internal_sigaltstack): Likewise.
diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cc 
b/libsanitizer/sanitizer_common/sanitizer_linux.cc
index 666f15b..e48bee5 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_linux.cc
@@ -77,14 +77,14 @@ namespace __sanitizer {
 uptr internal_mmap(void *addr, uptr length, int prot, int flags,
 int fd, u64 offset) {
 #if SANITIZER_LINUX_USES_64BIT_SYSCALLS
-  return internal_syscall(__NR_mmap, addr, length, prot, flags, fd, offset);
+  return internal_syscall(__NR_mmap, (uptr)addr, length, prot, flags, fd, 
offset);
 #else
   return internal_syscall(__NR_mmap2, addr, length, prot, flags, fd, offset);
 #endif
 }
 
 uptr internal_munmap(void *addr, uptr length) {
-  return internal_syscall(__NR_munmap, addr, length);
+  return internal_syscall(__NR_munmap, (uptr)addr, length);
 }
 
 uptr internal_close(fd_t fd) {
@@ -92,11 +92,11 @@ uptr internal_close(fd_t fd) {
 }
 
 uptr internal_open(const char *filename, int flags) {
-  return internal_syscall(__NR_open, filename, flags);
+  return internal_syscall(__NR_open, (uptr)filename, flags);
 }
 
 uptr internal_open(const char *filename, int flags, u32 mode) {
-  return internal_syscall(__NR_open, filename, flags, mode);
+  return internal_syscall(__NR_open, (uptr)filename, flags, mode);
 }
 
 uptr OpenFile(const char *filename, bool write) {
@@ -106,13 +106,13 @@ uptr OpenFile(const char *filename, bool write) {
 
 uptr internal_read(fd_t fd, void *buf, uptr count) {
   sptr res;
-  HANDLE_EINTR(res, (sptr)internal_syscall(__NR_read, fd, buf, count));
+  HANDLE_EINTR(res, (sptr)internal_syscall(__NR_read, fd, (uptr)buf, count));
   return res;
 }
 
 uptr internal_write(fd_t fd, const void *buf, uptr count) {
   sptr res;
-  HANDLE_EINTR(res, (sptr)internal_syscall(__NR_write, fd, buf, count));
+  HANDLE_EINTR(res, (sptr)internal_syscall(__NR_write, fd, (uptr)buf, count));
   return res;
 }
 
@@ -138,7 +138,7 @@ static void stat64_to_stat(struct stat64 *in, struct stat 
*out) {
 
 uptr internal_stat(const char *path, void *buf) {
 #if SANITIZER_LINUX_USES_64BIT_SYSCALLS
-  return internal_syscall(__NR_stat, path, buf);
+  return internal_syscall(__NR_stat, (uptr)path, (uptr)buf);
 #else
   struct stat64 buf64;
   int res = internal_syscall(__NR_stat64, path, buf64);
@@ -149,7 +149,7 @@ uptr internal_stat(const char *path, void *buf) {
 
 uptr internal_lstat(const char *path, void *buf) {
 #if SANITIZER_LINUX_USES_64BIT_SYSCALLS
-  return internal_syscall(__NR_lstat, path, buf);
+  return internal_syscall(__NR_lstat, (uptr)path, (uptr)buf);
 #else
   struct stat64 buf64;
   int res = internal_syscall(__NR_lstat64, path, buf64);
@@ -160,7 +160,7 @@ uptr internal_lstat(const char *path, void *buf) {
 
 uptr internal_fstat(fd_t fd, void *buf) {
 #if SANITIZER_LINUX_USES_64BIT_SYSCALLS
-  return internal_syscall(__NR_fstat, fd, buf);
+  return internal_syscall(__NR_fstat, fd, (uptr)buf);
 #else
   struct stat64 buf64;
   int res = internal_syscall(__NR_fstat64, fd, buf64);
@@ -181,11 +181,11 @@ uptr internal_dup2(int oldfd, int newfd) {
 }
 
 uptr internal_readlink(const char *path, char *buf, uptr bufsize) {
-  return internal_syscall(__NR_readlink, path, buf, bufsize);
+  return internal_syscall(__NR_readlink, (uptr)path, (uptr)buf, bufsize);
 }
 
 uptr internal_unlink(const char *path) {
-  return 

Re: [PATCH 0/6] Conversion of gimple types to C++ inheritance (v3)

2013-11-06 Thread Jeff Law

On 11/05/13 16:23, Ian Lance Taylor wrote:

On Tue, Nov 5, 2013 at 2:12 PM, Jeff Law l...@redhat.com wrote:


I can't speak for Andrew, but my experience with this kind of object type
casting in a large C++ project is that it's a red flag for a design problem.


I'm going to chime in to say that I think it's a pretty typical way to
represent a compiler IR in C++.  There is a base type that a lot of
code uses, but there is also a real type, and the way to get to that
real type is to use a cast.  We could do it all with virtual
functions, but those carry a different cost.  In effect, using virtual
functions increases the size of the code field from 16 bits to 64
bits.  It adds up.

Also this seems to be a pretty direct version of the data structures
we already have.
Certainly and to some degree these casts in one form or another are a 
necessity.  My concern is we use them as a crutch (as I've seen in other 
projects) and we end up not getting the level of compile time static 
type checking that we want.


If the bulk are temporary in nature and we as a project agree to look at 
the refactoring necessary to minimize the is-a/as-a uses to a few low 
level idioms, then I'll hesitatingly go along.  If that's the choice we 
make, then folks should know I'll probably look at each is-a/as-a in 
patches I review and ask for a justification for their use.



jeff


  1   2   >