Re: RFC Asan instrumentation control

2013-12-24 Thread Maxim Ostapenko


On 12/19/2013 04:27 PM, Jakub Jelinek wrote:

On Thu, Dec 19, 2013 at 04:02:47PM +0400, Maxim Ostapenko wrote:

Sorry, ChangeLog and patch, of course.
2013-12-19  Max Ostapenko  

* cfgexpand.c (expand_stack_vars): Optionally disable asan stack 
protection.

Too long lines in ChangeLog, wrap to 80 columns.


Thanks, fixed.


(expand_used_vars): Likewise.
(partition_stack_vars): Likewise.
* asan.c (asan_emit_stack_protection): Optionally disable after return 
stack usage.

Ditto.


Likewise.


(instrument_derefs): Optionally disable memory access instrumentation.
(instrument_builtin_call): Likewise.
(instrument_strlen_call): Likewise.
(asan_protect_global): Optionally disable global variables protection.
* doc/invoke.texi: Added doc for new options.
* params.def: Added new options.
* params.h: Likewise.

2013-12-19  Max Ostapenko  
* c-c++-common/asan/global-overflow-2.c: New test.

Missing vertical space between date/name/mail and first entry.


Done.


@@ -1003,7 +1004,7 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned 
int alignb,
str_cst = asan_pp_string (&asan_pp);
  
/* Emit the prologue sequence.  */

-  if (asan_frame_size > 32 && asan_frame_size <= 65536 && pbase)
+  if (asan_frame_size > 32 && asan_frame_size <= 65536 && pbase && 
ASAN_USE_AFTER_RETURN)
  {
use_after_return_class = floor_log2 (asan_frame_size - 1) - 5;
/* __asan_stack_malloc_N guarantees alignment

Please wrap this.


Done.


--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -798,7 +798,7 @@ partition_stack_vars (void)
 sizes, as the shorter vars wouldn't be adequately protected.
 Don't do that for "large" (unsupported) alignment objects,
 those aren't protected anyway.  */
- if ((flag_sanitize & SANITIZE_ADDRESS) && isize != jsize
+ if ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK  && isize != 
jsize

Replace the two spaces with just one.


Done.


--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10037,6 +10037,36 @@ The default choice depends on the target.
  Set the maximum number of existing candidates that will be considered when
  seeking a basis for a new straight-line strength reduction candidate.
  
+@item asan-globals

+Enable buffer overflow detection for global objects. This kind of protection
+is enabled by default if you are using @option{-fsanitize=address} option.
+To disable global objects protection use @option{--param asan-globals=0} 
option.

Too long lines (several times).


Done.


+To disable memory reads instructions protection use @option{--param 
asan-instrument-reads=0} option.
+
+@item asan-instrument-writes
+Enable buffer overflow detection for memory writes instructions. This kind of 
protection
+is enabled by default if you are using @option{-fsanitize=address} option.
+To disable memory writes instructions protection use @option{--param 
asan-instrument-writes=0} option.
+
+@item asan-memintrin
+Enable detection for builtin functions. This kind of protection

I think for docs it should be "built-in functions".


Done.


As for the tests, I'm afraid I don't like them at all.
If anything, it ought to be dg-do compile tests where you say
scan assembly or some dump, but having runtime testcases that
trigger undefined behavior that isn't detected by the instrumentation
library at all and expect them to "pass" is simply wrong.

Jakub

Got it, converted all tests except no-asan-stack.c, because I failed to
discover how to grep for stack instrumentation. Perhaps memcmp of random
data is fine?

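A compile-only test of the kind Jakub asks for could look roughly like the sketch below. It is only an illustration, not the test that was posted: the function name read_global is made up, and it assumes the asan.exp harness already supplies -fsanitize=address. The test never runs, so it needs no out-of-bounds access; it only checks that no call to __asan_register_globals shows up in the assembly when global protection is turned off.

```c
/* Sketch of a compile-only test; assumes the harness adds
   -fsanitize=address for this directory.  */
/* { dg-do compile } */
/* { dg-options "--param asan-globals=0" } */

/* A global object that would normally be registered with the runtime.  */
static volatile char buf[10];

/* Hypothetical helper, present only so that buf is actually used.  */
int
read_global (int i)
{
  return buf[i];
}

/* With global protection disabled, no registration call is expected.  */
/* { dg-final { scan-assembler-not "__asan_register_globals" } } */
```

Because nothing is executed, the test cannot "pass" by silently triggering undefined behavior, which was the objection to the runtime testcases.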
2013-12-24  Max Ostapenko  

	* cfgexpand.c (expand_stack_vars): Optionally disable 
	asan stack protection.
	(expand_used_vars): Likewise.
	(partition_stack_vars): Likewise.
	* asan.c (asan_emit_stack_protection): Optionally disable 
	after return stack usage.
	(instrument_derefs): Optionally disable memory 
	access instrumentation.
	(instrument_builtin_call): Likewise.
	(instrument_strlen_call): Likewise.
	(asan_protect_global): Optionally disable 
	global variables protection.
	* doc/invoke.texi: Added doc for new options.
	* params.def: Added new options.
	* params.h: Likewise.

2013-12-24  Max Ostapenko  

	* c-c++-common/asan/no-asan-globals.c: New test.
	* c-c++-common/asan/no-asan-stack.c: Likewise.
	* c-c++-common/asan/no-instrument-reads.c: Likewise.
	* c-c++-common/asan/no-instrument-writes.c: Likewise.
	* c-c++-common/asan/use-after-return-1.c: Likewise.
	* c-c++-common/asan/no-use-after-return.c: Likewise.


diff --git a/gcc/asan.c b/gcc/asan.c
index d4059d6..1d9d8ae 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-builder.h"
 #include "ubsan.h"
 #include "predict.h"
+#include "params.h"
 
 /* AddressSanitizer finds out-of-bounds and use-after-free bugs
with <2x slowdown on average.
@@ -1003,7 +1004,8 @@ asan_emit_stack_

Re: [committed] Fix MASK_{LOAD,STORE} caused ICE (PR tree-optimization/59523)

2013-12-24 Thread H.J. Lu
On Tue, Dec 17, 2013 at 1:37 PM, Jakub Jelinek  wrote:
> Hi!
>
> I forgot to update_stmt stmts I've changed.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> committed to trunk as obvious.
>
> 2013-12-17  Jakub Jelinek  
>
> PR tree-optimization/59523
> * tree-vectorizer.c (fold_loop_vectorized_call): Call update_stmt
> on updated stmts.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59591


-- 
H.J.


PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread H.J. Lu
Hi,

cpu_names in i386.c is only used by ix86_function_specific_print which
accesses it with enum processor_type index. But cpu_names is defined as
array with enum target_cpu_default index.  This patch adds processor
names to processor_target_table and uses processor_target_table instead
of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
Linux/x86-64.  OK to install?

Thanks.


H.J.
--
2013-12-24   H.J. Lu  

PR target/59587
* config/i386/i386.c (struct ptt): Add a field for processor
name.
(processor_target_table): Add processor names.
(cpu_names): Removed.
(ix86_option_override_internal): Default x_ix86_tune_string
to processor_target_table[TARGET_CPU_DEFAULT].name.
(ix86_function_specific_print): Use processor_target_table
to print arch and tune names.
* config/i386/i386.h (TARGET_CPU_DEFAULT): Default to
PROCESSOR_GENERIC.
(target_cpu_default): Removed.
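The bug class described above can be reproduced in a few lines. The sketch below uses heavily shortened, hypothetical versions of the two enums and tables (the enumerator names and the two lookup functions are made up); it only shows why indexing a name array with the wrong enum yields the wrong string, and why keeping the name in the table that is already indexed by processor type avoids it.

```c
/* Hypothetical, shortened stand-ins for the two GCC enums.  Note the
   different ordering: "generic" is first in one and last in the other.  */
enum target_cpu_default
{
  CPU_DEFAULT_generic, CPU_DEFAULT_i386, CPU_DEFAULT_i486,
  CPU_DEFAULT_pentium, CPU_DEFAULT_max
};

enum processor_type
{
  PROC_I386, PROC_I486, PROC_PENTIUM, PROC_GENERIC, PROC_max
};

/* Names ordered like enum target_cpu_default.  */
static const char *const cpu_names[CPU_DEFAULT_max] =
{
  "generic", "i386", "i486", "pentium"
};

/* Table ordered like enum processor_type, now carrying the name.  */
struct ptt
{
  const char *name;
  int align_loop;
};

static const struct ptt processor_target_table[PROC_max] =
{
  {"i386", 4}, {"i486", 16}, {"pentium", 16}, {"generic", 16}
};

/* The mismatch: a processor_type value indexes a table that is
   ordered by target_cpu_default.  */
const char *
name_via_cpu_names (enum processor_type p)
{
  return cpu_names[p];
}

/* The approach of the patch: index the table that matches the enum.  */
const char *
name_via_table (enum processor_type p)
{
  return processor_target_table[p].name;
}
```

With these toy tables, name_via_cpu_names (PROC_I386) returns "generic" while name_via_table (PROC_I386) returns "i386".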

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ced6618..b9f0a63 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2375,6 +2375,7 @@ static tree ix86_veclibabi_acml (enum built_in_function, tree, tree);
 /* Processor target table, indexed by processor number */
 struct ptt
 {
+  const char *const name;  /* processor name  */
   const struct processor_costs *cost;  /* Processor costs */
   const int align_loop; /* Default alignments.  */
   const int align_loop_max_skip;
@@ -2385,81 +2386,30 @@ struct ptt
 
 static const struct ptt processor_target_table[PROCESSOR_max] =
 {
-  {&i386_cost, 4, 3, 4, 3, 4},
-  {&i486_cost, 16, 15, 16, 15, 16},
-  {&pentium_cost, 16, 7, 16, 7, 16},
-  {&pentiumpro_cost, 16, 15, 16, 10, 16},
-  {&geode_cost, 0, 0, 0, 0, 0},
-  {&k6_cost, 32, 7, 32, 7, 32},
-  {&athlon_cost, 16, 7, 16, 7, 16},
-  {&pentium4_cost, 0, 0, 0, 0, 0},
-  {&k8_cost, 16, 7, 16, 7, 16},
-  {&nocona_cost, 0, 0, 0, 0, 0},
-  /* Core 2  */
-  {&core_cost, 16, 10, 16, 10, 16},
-  /* Nehalem  */
-  {&core_cost, 16, 10, 16, 10, 16},
-  /* Sandy Bridge  */
-  {&core_cost, 16, 10, 16, 10, 16},
-  /* Haswell  */
-  {&core_cost, 16, 10, 16, 10, 16},
-  /* Bonnell  */
-  {&atom_cost, 16, 15, 16, 7, 16},
-  /* Silvermont  */
-  {&slm_cost, 16, 15, 16, 7, 16},
-  {&generic_cost, 16, 10, 16, 10, 16},
-  {&amdfam10_cost, 32, 24, 32, 7, 32},
-  {&bdver1_cost, 16, 10, 16, 7, 11},
-  {&bdver2_cost, 16, 10, 16, 7, 11},
-  {&bdver3_cost, 16, 10, 16, 7, 11},
-  {&bdver4_cost, 16, 10, 16, 7, 11},
-  {&btver1_cost, 16, 10, 16, 7, 11},
-  {&btver2_cost, 16, 10, 16, 7, 11}
-};
-
-static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
-{
-  "generic",
-  "i386",
-  "i486",
-  "pentium",
-  "pentium-mmx",
-  "pentiumpro",
-  "pentium2",
-  "pentium3",
-  "pentium4",
-  "pentium-m",
-  "prescott",
-  "nocona",
-  "core2",
-  "corei7",
-  "corei7-avx",
-  "core-avx2",
-  "atom",
-  "slm",
-  "nehalem",
-  "westmere",
-  "sandybridge",
-  "ivybridge",
-  "haswell",
-  "broadwell",
-  "bonnell",
-  "silvermont",
-  "intel",
-  "geode",
-  "k6",
-  "k6-2",
-  "k6-3",
-  "athlon",
-  "athlon-4",
-  "k8",
-  "amdfam10",
-  "bdver1",
-  "bdver2",
-  "bdver3",
-  "bdver4",
-  "btver1",
-  "btver2"
+  {"i386", &i386_cost, 4, 3, 4, 3, 4},
+  {"i486", &i486_cost, 16, 15, 16, 15, 16},
+  {"pentium", &pentium_cost, 16, 7, 16, 7, 16},
+  {"pentiumpro", &pentiumpro_cost, 16, 15, 16, 10, 16},
+  {"geode", &geode_cost, 0, 0, 0, 0, 0},
+  {"k6", &k6_cost, 32, 7, 32, 7, 32},
+  {"athlon", &athlon_cost, 16, 7, 16, 7, 16},
+  {"pentium4", &pentium4_cost, 0, 0, 0, 0, 0},
+  {"k8", &k8_cost, 16, 7, 16, 7, 16},
+  {"nocona", &nocona_cost, 0, 0, 0, 0, 0},
+  {"core2", &core_cost, 16, 10, 16, 10, 16},
+  {"nehalem", &core_cost, 16, 10, 16, 10, 16},
+  {"sandybridge", &core_cost, 16, 10, 16, 10, 16},
+  {"haswell", &core_cost, 16, 10, 16, 10, 16},
+  {"bonnell", &atom_cost, 16, 15, 16, 7, 16},
+  {"silvermont", &slm_cost, 16, 15, 16, 7, 16},
+  {"generic", &generic_cost, 16, 10, 16, 10, 16},
+  {"amdfam10", &amdfam10_cost, 32, 24, 32, 7, 32},
+  {"bdver1", &bdver1_cost, 16, 10, 16, 7, 11},
+  {"bdver2", &bdver2_cost, 16, 10, 16, 7, 11},
+  {"bdver3", &bdver3_cost, 16, 10, 16, 7, 11},
+  {"bdver4", &bdver4_cost, 16, 10, 16, 7, 11},
+  {"btver1", &btver1_cost, 16, 10, 16, 7, 11},
+  {"btver2", &btver2_cost, 16, 10, 16, 7, 11}
 };
 
 static bool
@@ -3360,7 +3310,8 @@ ix86_option_override_internal (bool main_args_p,
opts->x_ix86_tune_string = opts->x_ix86_arch_string;
   if (!opts->x_ix86_tune_string)
{
- opts->x_ix86_tune_string = cpu_names[TARGET_CPU_DEFAULT];
+ opts->x_ix86_tune_string
+   = processor_target_table[TARGET_CPU_DEFAULT].name;
  ix86_tune_defaulted = 1;
}
 
@@ -4413,17 +4364,11 @@ ix86_function_specific_print (FILE *file, int indent,
 
   fprintf (file, "%*sarch = %d (%s)\n",
   indent, "",
-  ptr->a

Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread Uros Bizjak
On Tue, Dec 24, 2013 at 2:08 PM, H.J. Lu  wrote:

> cpu_names in i386.c is only used by ix86_function_specific_print which
> accesses it with enum processor_type index. But cpu_names is defined as
> array with enum target_cpu_default index.  This patch adds processor
> names to processor_target_table and uses processor_target_table instead
> of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
> Linux/x86-64.  OK to install?
>
> Thanks.
>
>
> H.J.
> --
> 2013-12-24   H.J. Lu  
>
> PR target/59587
> * config/i386/i386.c (struct ptt): Add a field for processor
> name.
> (processor_target_table): Add processor names.
> (cpu_names): Removed.
> (ix86_option_override_internal): Default x_ix86_tune_string
> to processor_target_table[TARGET_CPU_DEFAULT].name.
> (ix86_function_specific_print): Use processor_target_table
> to print arch and tune names.
> * config/i386/i386.h (TARGET_CPU_DEFAULT): Default to
> PROCESSOR_GENERIC.
> (target_cpu_default): Removed.

Those two tables were hopelessly out of sync... so, great to get rid
of one, especially since the precision of target_cpu_default enum was
never needed.

The patch is OK (with some additional cleanups, as suggested below)
for mainline and 4.8 after a couple of days in mainline.

Thanks,
Uros.

> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index ced6618..b9f0a63 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2375,6 +2375,7 @@ static tree ix86_veclibabi_acml (enum 
> built_in_function, tree, tree);
>  /* Processor target table, indexed by processor number */
>  struct ptt
>  {
> +  const char *const name;  /* processor name  */
>const struct processor_costs *cost;  /* Processor costs */
>const int align_loop;/* Default 
> alignments.  */
>const int align_loop_max_skip;
> @@ -2385,81 +2386,30 @@ struct ptt
>
>  static const struct ptt processor_target_table[PROCESSOR_max] =

Please put "generic" entry at the top, and sort the table (and
corresponding enum processor_type in i386.h) in the same way as enum
target_cpu_default was. Please also add a comment in both places that
these tables need to be in sync.
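The review above asks for a comment in both places. A different technique, which is not what was requested and is shown only as an alternative sketch, is to generate the enum and the table from a single list so that they cannot drift apart. The PROCESSOR_LIST macro, the reduced struct ptt, and processor_name are all hypothetical; _Static_assert assumes a C11 compiler.

```c
/* Single source of truth: one line per processor (hypothetical subset).  */
#define PROCESSOR_LIST(X)       \
  X (GENERIC, "generic", 16)    \
  X (I386, "i386", 4)           \
  X (I486, "i486", 16)

/* The enum is generated from the list...  */
enum processor_type
{
#define X(id, str, align) PROCESSOR_##id,
  PROCESSOR_LIST (X)
#undef X
  PROCESSOR_max
};

struct ptt
{
  const char *name;
  int align_loop;
};

/* ...and so is the table, in exactly the same order.  */
static const struct ptt processor_target_table[] =
{
#define X(id, str, align) { str, align },
  PROCESSOR_LIST (X)
#undef X
};

/* Compile-time check that both have the same number of entries.  */
_Static_assert (sizeof processor_target_table
                / sizeof processor_target_table[0] == PROCESSOR_max,
                "table and enum out of sync");

const char *
processor_name (enum processor_type p)
{
  return processor_target_table[p].name;
}
```

Whether this is worth the loss of readability in i386.c is a judgment call; a comment, as requested, is the lighter-weight option.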

>  {
> -  {&i386_cost, 4, 3, 4, 3, 4},
> -  {&i486_cost, 16, 15, 16, 15, 16},
> -  {&pentium_cost, 16, 7, 16, 7, 16},
> -  {&pentiumpro_cost, 16, 15, 16, 10, 16},
> -  {&geode_cost, 0, 0, 0, 0, 0},
> -  {&k6_cost, 32, 7, 32, 7, 32},
> -  {&athlon_cost, 16, 7, 16, 7, 16},
> -  {&pentium4_cost, 0, 0, 0, 0, 0},
> -  {&k8_cost, 16, 7, 16, 7, 16},
> -  {&nocona_cost, 0, 0, 0, 0, 0},
> -  /* Core 2  */
> -  {&core_cost, 16, 10, 16, 10, 16},
> -  /* Nehalem  */
> -  {&core_cost, 16, 10, 16, 10, 16},
> -  /* Sandy Bridge  */
> -  {&core_cost, 16, 10, 16, 10, 16},
> -  /* Haswell  */
> -  {&core_cost, 16, 10, 16, 10, 16},
> -  /* Bonnell  */
> -  {&atom_cost, 16, 15, 16, 7, 16},
> -  /* Silvermont  */
> -  {&slm_cost, 16, 15, 16, 7, 16},
> -  {&generic_cost, 16, 10, 16, 10, 16},
> -  {&amdfam10_cost, 32, 24, 32, 7, 32},
> -  {&bdver1_cost, 16, 10, 16, 7, 11},
> -  {&bdver2_cost, 16, 10, 16, 7, 11},
> -  {&bdver3_cost, 16, 10, 16, 7, 11},
> -  {&bdver4_cost, 16, 10, 16, 7, 11},
> -  {&btver1_cost, 16, 10, 16, 7, 11},
> -  {&btver2_cost, 16, 10, 16, 7, 11}
> -};
> -
> -static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
> -{
> -  "generic",
> -  "i386",
> -  "i486",
> -  "pentium",
> -  "pentium-mmx",
> -  "pentiumpro",
> -  "pentium2",
> -  "pentium3",
> -  "pentium4",
> -  "pentium-m",
> -  "prescott",
> -  "nocona",
> -  "core2",
> -  "corei7",
> -  "corei7-avx",
> -  "core-avx2",
> -  "atom",
> -  "slm",
> -  "nehalem",
> -  "westmere",
> -  "sandybridge",
> -  "ivybridge",
> -  "haswell",
> -  "broadwell",
> -  "bonnell",
> -  "silvermont",
> -  "intel",
> -  "geode",
> -  "k6",
> -  "k6-2",
> -  "k6-3",
> -  "athlon",
> -  "athlon-4",
> -  "k8",
> -  "amdfam10",
> -  "bdver1",
> -  "bdver2",
> -  "bdver3",
> -  "bdver4",
> -  "btver1",
> -  "btver2"
> +  {"i386", &i386_cost, 4, 3, 4, 3, 4},
> +  {"i486", &i486_cost, 16, 15, 16, 15, 16},
> +  {"pentium", &pentium_cost, 16, 7, 16, 7, 16},
> +  {"pentiumpro", &pentiumpro_cost, 16, 15, 16, 10, 16},
> +  {"geode", &geode_cost, 0, 0, 0, 0, 0},
> +  {"k6", &k6_cost, 32, 7, 32, 7, 32},
> +  {"athlon", &athlon_cost, 16, 7, 16, 7, 16},
> +  {"pentium4", &pentium4_cost, 0, 0, 0, 0, 0},
> +  {"k8", &k8_cost, 16, 7, 16, 7, 16},
> +  {"nocona", &nocona_cost, 0, 0, 0, 0, 0},
> +  {"core2", &core_cost, 16, 10, 16, 10, 16},
> +  {"nehalem", &core_cost, 16, 10, 16, 10, 16},
> +  {"sandybridge", &core_cost, 16, 10, 16, 10, 16},
> +  {"haswell", &core_cost, 16, 10, 16, 10, 16},
> +  {"bonnell", &atom_cost, 16, 15, 16, 7, 16},
> +  {"silvermont", &slm_cost, 16, 15, 16, 7, 16},
> +  {"generic", &generic_cost, 16, 10, 16, 10, 16},
> +  {"amdfam10", &amdfam10_cost, 32, 24, 32, 7, 32},
> +  {"bdver1", 

Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread Uros Bizjak
On Tue, Dec 24, 2013 at 2:08 PM, H.J. Lu  wrote:

> cpu_names in i386.c is only used by ix86_function_specific_print which
> accesses it with enum processor_type index. But cpu_names is defined as
> array with enum target_cpu_default index.  This patch adds processor
> names to processor_target_table and uses processor_target_table instead
> of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
> Linux/x86-64.  OK to install?

Wait a moment,

it looks to me that TARGET_CPU_DEFAULT has to be synchronized with
const processor_alias_table, so we are able to define various ISA
extensions by selecting TARGET_CPU_*. The TARGET_CPU_DEFAULT can then
be used to select extensions in the same way as PROCESSOR_* selects
tuning for certain processor.

Uros.


Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread H.J. Lu
On Tue, Dec 24, 2013 at 6:12 AM, Uros Bizjak  wrote:
> On Tue, Dec 24, 2013 at 2:08 PM, H.J. Lu  wrote:
>
>> cpu_names in i386.c is only used by ix86_function_specific_print which
>> accesses it with enum processor_type index. But cpu_names is defined as
>> array with enum target_cpu_default index.  This patch adds processor
>> names to processor_target_table and uses processor_target_table instead
>> of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
>> Linux/x86-64.  OK to install?
>
> Wait a moment,
>
> it looks to me that TARGET_CPU_DEFAULT has to be synchronized with
> const processor_alias_table, so we are able to define various ISA
> extensions by selecting TARGET_CPU_*. The TARGET_CPU_DEFAULT can then

TARGET_CPU_DEFAULT sets the default -mtune=, not -march=.

> be used to select extensions in the same way as PROCESSOR_* selects
> tuning for certain processor.

It has been like this for a long time.  For x86, TARGET_CPU_DEFAULT
isn't defined no matter which configure options are used.  We can
change config.gcc to set TARGET_CPU_DEFAULT to proper PROCESSOR_XXX or
set it to a string "xxx" for processor "xxx".
But GCC driver always passes -march=/-mtune= to toplev.c so that
TARGET_CPU_DEFAULT is normally used.


-- 
H.J.


Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-24 Thread Allan Sandfeld Jensen
On Monday 23 December 2013, H.J. Lu wrote:
> 
> If you use
> 
> {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
> {"core-avx2", M_INTEL_COREI7_HASWELL},
> 
> will it cause any problems?  When there are both
> 
Actually it seems I don't need these definitions any more after your clean-up 
of Intel architecture names. I have attached a patch with them removed (and 
renamed the haswell enums back to corei7_haswell).

If both target("arch=corei7-avx") and target("arch=sandybridge") are present, 
the dispatcher appears to choose "sandybridge". If you want a warning for 
duplicates in this case, I suggest adding it in a later patch.

`Allan
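The patch below extends the feature_priority enum, whose ordering decides which version wins when several are eligible. As a rough illustration of priority-based selection only (this is not GCC's dispatcher, it ignores the runtime CPU check entirely, and struct version and pick_version are hypothetical), a higher enumerator simply beats a lower one:

```c
#include <stddef.h>

/* Shortened priority list; later enumerators have higher priority,
   following the ordering idea of the patch.  */
enum feature_priority
{
  P_ZERO = 0,
  P_PROC_SSE4_2,
  P_PROC_AVX,
  P_PROC_AVX2
};

/* Hypothetical description of one function version.  */
struct version
{
  const char *arch;
  enum feature_priority priority;
};

/* Return the version with the highest priority, or NULL if there is
   none.  On a tie the earlier entry is kept.  */
const struct version *
pick_version (const struct version *versions, size_t n)
{
  const struct version *best = NULL;

  for (size_t i = 0; i < n; i++)
    if (best == NULL || versions[i].priority > best->priority)
      best = &versions[i];

  return best;
}
```

Given versions for "nehalem" (P_PROC_SSE4_2), "haswell" (P_PROC_AVX2) and "sandybridge" (P_PROC_AVX), this picks the "haswell" entry.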
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 206179)
+++ gcc/config/i386/i386.c	(working copy)
@@ -29970,16 +29970,21 @@
 P_SSE3,
 P_SSSE3,
 P_PROC_SSSE3,
-P_SSE4_a,
-P_PROC_SSE4_a,
+P_SSE4_A,
+P_PROC_SSE4_A,
 P_SSE4_1,
 P_SSE4_2,
 P_PROC_SSE4_2,
 P_POPCNT,
 P_AVX,
+P_PROC_AVX,
+P_FMA4,
+P_XOP,
+P_PROC_XOP,
+P_FMA,
+P_PROC_FMA,
 P_AVX2,
-P_FMA,
-P_PROC_FMA
+P_PROC_AVX2
   };
 
  enum feature_priority priority = P_ZERO;
@@ -29998,11 +30003,15 @@
   {"sse", P_SSE},
   {"sse2", P_SSE2},
   {"sse3", P_SSE3},
+  {"sse4a", P_SSE4_A},
   {"ssse3", P_SSSE3},
   {"sse4.1", P_SSE4_1},
   {"sse4.2", P_SSE4_2},
   {"popcnt", P_POPCNT},
   {"avx", P_AVX},
+  {"fma4", P_FMA4},
+  {"xop", P_XOP},
+  {"fma", P_FMA},
   {"avx2", P_AVX2}
 };
 
@@ -30054,26 +30063,50 @@
 	  arg_str = "nehalem";
 	  priority = P_PROC_SSE4_2;
 	  break;
-case PROCESSOR_SANDYBRIDGE:
-  arg_str = "sandybridge";
-  priority = P_PROC_SSE4_2;
-  break;
+	case PROCESSOR_SANDYBRIDGE:
+	  arg_str = "sandybridge";
+	  priority = P_PROC_AVX;
+	  break;
+	case PROCESSOR_HASWELL:
+	  arg_str = "haswell";
+	  priority = P_PROC_AVX2;
+	  break;
 	case PROCESSOR_BONNELL:
 	  arg_str = "bonnell";
 	  priority = P_PROC_SSSE3;
 	  break;
+	case PROCESSOR_SILVERMONT:
+	  arg_str = "silvermont";
+	  priority = P_PROC_SSE4_2;
+	  break;
 	case PROCESSOR_AMDFAM10:
 	  arg_str = "amdfam10h";
-	  priority = P_PROC_SSE4_a;
+	  priority = P_PROC_SSE4_A;
 	  break;
+	case PROCESSOR_BTVER1:
+	  arg_str = "bobcat";
+	  priority = P_PROC_SSE4_A;
+	  break;
+	case PROCESSOR_BTVER2:
+	  arg_str = "jaguar";
+	  priority = P_PROC_AVX;
+	  break;
 	case PROCESSOR_BDVER1:
 	  arg_str = "bdver1";
-	  priority = P_PROC_FMA;
+	  priority = P_PROC_XOP;
 	  break;
 	case PROCESSOR_BDVER2:
 	  arg_str = "bdver2";
 	  priority = P_PROC_FMA;
 	  break;
+	case PROCESSOR_BDVER3:
+	  arg_str = "bdver3";
+	  priority = P_PROC_FMA;
+	  break;
+	case PROCESSOR_BDVER4:
+	  arg_str = "bdver4";
+	  priority = P_PROC_AVX2;
+	  break;
 	}  
 	}
 
@@ -30938,6 +30971,10 @@
 F_SSE4_2,
 F_AVX,
 F_AVX2,
+F_SSE4_A,
+F_FMA4,
+F_XOP,
+F_FMA,
 F_MAX
   };
 
@@ -30955,6 +30992,8 @@
 M_AMDFAM10H,
 M_AMDFAM15H,
 M_INTEL_SILVERMONT,
+M_AMD_BOBCAT,
+M_AMD_JAGUAR,
 M_CPU_SUBTYPE_START,
 M_INTEL_COREI7_NEHALEM,
 M_INTEL_COREI7_WESTMERE,
@@ -30965,7 +31004,9 @@
 M_AMDFAM15H_BDVER1,
 M_AMDFAM15H_BDVER2,
 M_AMDFAM15H_BDVER3,
-M_AMDFAM15H_BDVER4
+M_AMDFAM15H_BDVER4,
+M_INTEL_COREI7_IVYBRIDGE,
+M_INTEL_COREI7_HASWELL
   };
 
   static struct _arch_names_table
@@ -30984,15 +31025,21 @@
   {"nehalem", M_INTEL_COREI7_NEHALEM},
   {"westmere", M_INTEL_COREI7_WESTMERE},
   {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
+  {"ivybridge", M_INTEL_COREI7_IVYBRIDGE},
+  {"haswell", M_INTEL_COREI7_HASWELL},
+  {"bonnell", M_INTEL_BONNELL},
+  {"silvermont", M_INTEL_SILVERMONT},
   {"amdfam10h", M_AMDFAM10H},
   {"barcelona", M_AMDFAM10H_BARCELONA},
   {"shanghai", M_AMDFAM10H_SHANGHAI},
   {"istanbul", M_AMDFAM10H_ISTANBUL},
+  {"bobcat", M_AMD_BOBCAT},  
   {"amdfam15h", M_AMDFAM15H},
   {"bdver1", M_AMDFAM15H_BDVER1},
   {"bdver2", M_AMDFAM15H_BDVER2},
   {"bdver3", M_AMDFAM15H_BDVER3},
   {"bdver4", M_AMDFAM15H_BDVER4},
+  {"jaguar", M_AMD_JAGUAR},  
 };
 
   static struct _isa_names_table
@@ -31009,9 +31056,13 @@
   {"sse2",   F_SSE2},
   {"sse3",   F_SSE3},
   {"ssse3",  F_SSSE3},
+  {"sse4a",  F_SSE4_A},
   {"sse4.1", F_SSE4_1},
   {"sse4.2", F_SSE4_2},
   {"avx",F_AVX},
+  {"fma4",   F_FMA4},
+  {"xop",F_XOP},
+  {"fma",F_FMA},
   {"avx2",   F_AVX2}
 };
 
Index: gcc/testsuite/gcc.target/i386/funcspec-5.c
===
--- gcc/testsuite/

Re: [PATCH][ARM]Use of vcvt for float to fixed point conversions.

2013-12-24 Thread Renlin Li

Hi,

I just updated my patch according to your suggestion.
Thank you for committing it for me!

All you guys have a nice Xmas break!

Kind regards,
Renlin Li

On 04/12/13 11:23, Ramana Radhakrishnan wrote:

Sorry about the slow response. Been on holiday.

On 20/11/13 16:27, Renlin Li wrote:

Hi all,

This patch will make the arm back-end use vcvt for float to fixed point
conversions when applicable.

Test on arm-none-linux-gnueabi has been done on the model.
Okay for trunk?

+ (define_insn "*combine_vcvtf2i"
+   [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (fix:SI (fix:SF (mult:SF (match_operand:SF 1 "s_register_operand" "t")
+(match_operand 2
+"const_double_vcvt_power_of_two" "Dp")]
+   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP3 && !flag_rounding_math"
+   "vcvt%?.s32.f32\\t%1, %1, %v2\;vmov%?\\t%0, %1"
+   [(set_attr "predicable" "yes")
+(set_attr "predicable_short_it" "no")
+(set_attr "ce_count" "2")
+(set_attr "type" "f_cvtf2i")]
+ )
+

You need to set length to 8.


--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fixed_float_conversion.c
@@ -0,0 +1,15 @@
+/* Check that vcvt is used for fixed and float data conversions.  */
+/* { dg-do compile } */
+/* { dg-options "-O1 -mfpu=vfp3" } */
+/* { dg-require-effective-target arm_vfp_ok } */
+float fixed_to_float(int i)
+{
+return ((float)i / (1 << 16));
+}
+
+int float_to_fixed(float f)
+{
+return ((int)(f*(1 << 16)));
+}
+/* { dg-final { scan-assembler "vcvt.f32.s32" } } */
+/* { dg-final { scan-assembler "vcvt.s32.f32" } } */


GNU coding style for functions.

Ok with those changes.
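For reference, the two test functions reformatted along the lines of the GNU coding style (return type on its own line, a space before the parameter list, two-space indentation) might look like the sketch below. This is only an illustration of the style comment, not necessarily what was committed.

```c
/* Convert a Q16.16 fixed-point value to float.  */
float
fixed_to_float (int i)
{
  return ((float) i / (1 << 16));
}

/* Convert a float to Q16.16 fixed point, truncating toward zero.  */
int
float_to_fixed (float f)
{
  return ((int) (f * (1 << 16)));
}
```

The multiplication and division by 1 << 16 are what the new pattern is meant to fold into the vcvt instruction's fraction-bits operand.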




regards
Ramana



Kind regards,
Renlin Li


gcc/ChangeLog:

2013-11-20  Renlin Li  

   * config/arm/arm-protos.h (vfp3_const_double_for_bits): Declare.
   * config/arm/constraints.md (Dp): Define new constraint.
   * config/arm/predicates.md (const_double_vcvt_power_of_two): Define
   new predicate.
   * config/arm/arm.c (arm_print_operand): Add print for new function.
   (vfp3_const_double_for_bits): New function.
   * config/arm/vfp.md (combine_vcvtf2i): Define new instruction.

gcc/testsuite/ChangeLog:

2013-11-20  Renlin Li  

   * gcc.target/arm/fixed_float_conversion.c: New test case.
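The core of the new vfp3_const_double_for_bits function in the patch below is a power-of-two test followed by a base-2 logarithm. A stand-alone sketch of that idea on a plain 32-bit integer is given here; power_of_two_exponent is a hypothetical name, and the loop stands in for whatever log2 helper the back end uses.

```c
#include <stdint.h>

/* Return log2 (value) if value is a power of two, otherwise 0.
   Note that 1 also yields 0, so the result alone cannot tell
   "value is 1" apart from "not a power of two".  */
int
power_of_two_exponent (uint32_t value)
{
  int exponent = 0;

  /* A power of two has exactly one bit set, so clearing the lowest
     set bit with value & (value - 1) must give zero.  */
  if (value == 0 || (value & (value - 1)) != 0)
    return 0;

  /* Count how many times the single set bit can be shifted right
     before it reaches bit 0.  */
  while (value >>= 1)
    exponent++;

  return exponent;
}
```

For the multiplier 1 << 16 used in the test case this gives 16, which is the fraction-bits immediate a vcvt instruction would need.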

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 944cf10..f2f8272 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -275,6 +275,8 @@ struct tune_params
 
 extern const struct tune_params *current_tune;
 extern int vfp3_const_double_for_fract_bits (rtx);
+/* return power of two from operand, otherwise 0.  */
+extern int vfp3_const_double_for_bits (rtx);
 
 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx,
 	   rtx);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 78554e8..72c4204 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21175,7 +21175,11 @@ arm_print_operand (FILE *stream, rtx x, int code)
 
 case 'v':
 	gcc_assert (CONST_DOUBLE_P (x));
-	fprintf (stream, "#%d", vfp3_const_double_for_fract_bits (x));
+	int result;
+	result = vfp3_const_double_for_fract_bits (x);
+	if (result == 0)
+	  result = vfp3_const_double_for_bits (x);
+	fprintf (stream, "#%d", result);
 	return;
 
 /* Register specifier for vld1.16/vst1.16.  Translate the S register
@@ -28958,6 +28973,26 @@ vfp3_const_double_for_fract_bits (rtx operand)
 }
   return 0;
 }
+
+int
+vfp3_const_double_for_bits (rtx operand)
+{
+  REAL_VALUE_TYPE r0;
+
+  if (!CONST_DOUBLE_P (operand))
+return 0;
+
+  REAL_VALUE_FROM_CONST_DOUBLE (r0, operand);
+  if (exact_real_truncate (DFmode, &r0))
+{
+  HOST_WIDE_INT value = real_to_integer (&r0);
+  value = value & 0x;
+  if ((value != 0) && ( (value & (value - 1)) == 0))
+	return int_log2 (value);
+}
+
+  return 0;
+}
 
 /* Emit a memory barrier around an atomic sequence according to MODEL.  */
 
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index e2a3099..59ca4b6 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,7 +31,7 @@
 ;; 'H' was previously used for FPA.
 
 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
 ;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
 
@@ -328,12 +328,18 @@
  (and (match_code "const_double")
   (match_test "TARGET_32BIT && TARGET_VFP_DOUBLE && vfp3_const_double_rtx (op)")))
 
-(define_constraint "Dt" 
+(define_constraint "Dt"
  "@internal
   In ARM/ Thumb2 a const_double which can be used with a vcvt.f32.s32 with fract bits operation"
   (and (match_code "const_double")
(match_test "TARGET_32BIT && TARGET_VFP && vfp3_const_double_for_fract_bits (op)")))
 
+(define_c

Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread Uros Bizjak
On Tue, Dec 24, 2013 at 3:23 PM, H.J. Lu  wrote:
> On Tue, Dec 24, 2013 at 6:12 AM, Uros Bizjak  wrote:
>> On Tue, Dec 24, 2013 at 2:08 PM, H.J. Lu  wrote:
>>
>>> cpu_names in i386.c is only used by ix86_function_specific_print which
>>> accesses it with enum processor_type index. But cpu_names is defined as
>>> array with enum target_cpu_default index.  This patch adds processor
>>> names to processor_target_table and uses processor_target_table instead
>>> of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
>>> Linux/x86-64.  OK to install?
>>
>> Wait a moment,
>>
>> it looks to me that TARGET_CPU_DEFAULT has to be synchronized with
>> const processor_alias_table, so we are able to define various ISA
>> extensions by selecting TARGET_CPU_*. The TARGET_CPU_DEFAULT can then
>
> TARGET_CPU_DEFAULT sets the default -mtune=, not -march=.
>
>> be used to select extensions in the same way as PROCESSOR_* selects
>> tuning for certain processor.
>
> It has been like this for a long time.  For x86, TARGET_CPU_DEFAULT
> isn't defined no matter which configure options are used.  We can
> change config.gcc to set TARGET_CPU_DEFAULT to proper PROCESSOR_XXX or
> set it to a string "xxx" for processor "xxx".
> But GCC driver always passes -march=/-mtune= to toplev.c so that
> TARGET_CPU_DEFAULT is normally used.

Let me rethink this a bit, please do not commit the patch.

Uros.


Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-24 Thread H.J. Lu
On Tue, Dec 24, 2013 at 6:38 AM, Allan Sandfeld Jensen
 wrote:
> On Monday 23 December 2013, H.J. Lu wrote:
>>
>> If you use
>>
>> {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
>> {"core-avx2", M_INTEL_COREI7_HASWELL},
>> will it cause any problems?  When there are both
>>
> Actually it seems I don't need these definitions any more after your clean-up
> of Intel architecture names. I have attached patch with them removed (and
> named haswell enums back to corei7_haswell).

It looks good to me.  Thanks.

> If both target("arch=corei7-avx") and target("arch=sandybridge") are present
> the dispatcher appears to choose "sandybridge". If you want a warning for

This is OK with me.

> duplicates in this case, I suggest adding it in a later patch.

Will libgcc/config/i386/cpuinfo.c update be a separate patch?
Should we use a single definition for both i386.c and libgcc?

-- 
H.J.


Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread H.J. Lu
On Tue, Dec 24, 2013 at 6:50 AM, Uros Bizjak  wrote:
> On Tue, Dec 24, 2013 at 3:23 PM, H.J. Lu  wrote:
>> On Tue, Dec 24, 2013 at 6:12 AM, Uros Bizjak  wrote:
>>> On Tue, Dec 24, 2013 at 2:08 PM, H.J. Lu  wrote:
>>>
>>>> cpu_names in i386.c is only used by ix86_function_specific_print which
>>>> accesses it with enum processor_type index. But cpu_names is defined as
>>>> array with enum target_cpu_default index.  This patch adds processor
>>>> names to processor_target_table and uses processor_target_table instead
>>>> of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
>>>> Linux/x86-64.  OK to install?
>>>
>>> Wait a moment,
>>>
>>> it looks to me that TARGET_CPU_DEFAULT has to be synchronized with
>>> const processor_alias_table, so we are able to define various ISA
>>> extensions by selecting TARGET_CPU_*. The TARGET_CPU_DEFAULT can then
>>
>> TARGET_CPU_DEFAULT sets the default -mtune=, not -march=.
>>
>>> be used to select extensions in the same way as PROCESSOR_* selects
>>> tuning for certain processor.
>>
>> It has been like this for a long time.  For x86, TARGET_CPU_DEFAULT
>> isn't defined no matter which configure options are used.  We can
>> change config.gcc to set TARGET_CPU_DEFAULT to proper PROCESSOR_XXX or
>> set it to a string "xxx" for processor "xxx".
>> But GCC driver always passes -march=/-mtune= to toplev.c so that
>> TARGET_CPU_DEFAULT is normally used.

I meant to say "TARGET_CPU_DEFAULT isn't normally used."

>
> Let me rethink this a bit, please do not commit the patch.
>
> Uros.

Sure.

-- 
H.J.


Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-24 Thread Allan Sandfeld Jensen
On Tuesday 24 December 2013, H.J. Lu wrote:
> 
> Will libgcc/config/i386/cpuinfo.c update be a separate patch?
> Should we use a single definition for both i386.c and libgcc?

Currently they need to be in the same patch. But yes, moving the definition 
out to a common header would probably be a good idea to reduce potential 
mismatches in future.

How does the patch get committed after being accepted? It has been many years 
since I last contributed to gcc, and I cannot remember the rest of the 
process, and doubt it is still the same.

Regards
`Allan


Re: PATCH: PR target/59587: cpu_names in i386.c is accessed with wrong index

2013-12-24 Thread H.J. Lu
On Tue, Dec 24, 2013 at 6:55 AM, H.J. Lu  wrote:
> On Tue, Dec 24, 2013 at 6:50 AM, Uros Bizjak  wrote:
>> On Tue, Dec 24, 2013 at 3:23 PM, H.J. Lu  wrote:
>>> On Tue, Dec 24, 2013 at 6:12 AM, Uros Bizjak  wrote:
>>>> On Tue, Dec 24, 2013 at 2:08 PM, H.J. Lu  wrote:
>>>>
>>>>> cpu_names in i386.c is only used by ix86_function_specific_print which
>>>>> accesses it with enum processor_type index. But cpu_names is defined as
>>>>> array with enum target_cpu_default index.  This patch adds processor
>>>>> names to processor_target_table and uses processor_target_table instead
>>>>> of cpu_names.  It removes cpu_names and target_cpu_default.  Tested on
>>>>> Linux/x86-64.  OK to install?
>>>>
>>>> Wait a moment,
>>>>
>>>> it looks to me that TARGET_CPU_DEFAULT has to be synchronized with
>>>> const processor_alias_table, so we are able to define various ISA
>>>> extensions by selecting TARGET_CPU_*. The TARGET_CPU_DEFAULT can then
>>>
>>> TARGET_CPU_DEFAULT sets the default -mtune=, not -march=.
>>>
 be used to select extensions in the same way as PROCESSOR_* selects
 tuning for certain processor.
>>>
>>> It has been like this for a long time.  For x86, TARGET_CPU_DEFAULT
>>> isn't defined no matter which configure options are used.  We can
>>> change config.gcc to set TARGET_CPU_DEFAULT to proper PROCESSOR_XXX or
>>> set it to a string "xxx" for processor "xxx".
>>> But GCC driver always passes -march=/-mtune= to toplev.c so that
>>> TARGET_CPU_DEFAULT is normally used.
>
> I meant to say "TARGET_CPU_DEFAULT isn't normally used."
>
>>
>> Let me rethink this a bit, please do not commit the patch.
>>

TARGET_CPU_DEFAULT is a leftover for 32-bit targets from before
--with-arch= and --with-cpu= were added.  Today, -mtune=xxx -march=xxx
are always passed to cc1 by the GCC driver.  If cc1 is run by hand and
-mtune=xxx -march=xxx aren't passed to cc1, we should do the following:

1. For 64-bit, it should be the same as if -mtune=generic -march=x86-64
were passed.
2. For 32-bit, it should be the same as if -mtune=cpu -march=cpu were
passed, where "cpu" is the target cpu used to configure GCC,
like i386 in i386-linux or i486 in i486-linux.  But there is no i786
cpu, so i786 is treated as i686.  If SUBTARGET32_DEFAULT_CPU
is defined, it should be the same as -mtune=SUBTARGET32_DEFAULT_CPU
-march=SUBTARGET32_DEFAULT_CPU.

Here is the patch to implement this.
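
For readers following the thread, the two fallback rules above can be
sketched as a small standalone function.  This is only an illustration
and not code from the patch: struct cpu_defaults, canonical_cpu32 and
default_cpu are hypothetical names, the 64-bit arch uses the option
spelling "x86-64", and letting the subtarget default take priority over
the configured cpu is one reading of rule 2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: the -mtune=/-march= values cc1 would
   assume when neither option is passed.  */
struct cpu_defaults
{
  const char *tune;
  const char *arch;
};

/* Map the cpu part of a 32-bit configure triplet to a processor name.
   There is no i786 processor, so i786 is treated as i686.  */
static const char *
canonical_cpu32 (const char *configured_cpu)
{
  if (strcmp (configured_cpu, "i786") == 0)
    return "i686";
  return configured_cpu;
}

/* Pick the defaults.  SUBTARGET_CPU stands in for
   SUBTARGET32_DEFAULT_CPU and may be NULL when it is not defined.  */
static struct cpu_defaults
default_cpu (int is_64bit, const char *configured_cpu,
	     const char *subtarget_cpu)
{
  struct cpu_defaults d;

  if (is_64bit)
    {
      /* Rule 1: same as -mtune=generic -march=x86-64.  */
      d.tune = "generic";
      d.arch = "x86-64";
    }
  else
    {
      /* Rule 2: same as -mtune=cpu -march=cpu.  */
      const char *cpu = (subtarget_cpu != NULL
			 ? subtarget_cpu
			 : canonical_cpu32 (configured_cpu));
      d.tune = cpu;
      d.arch = cpu;
    }
  return d;
}
```

With this sketch, default_cpu (0, "i786", NULL) yields i686 for both
fields, and any 64-bit call yields generic/x86-64 regardless of the
other arguments.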


-- 
H.J.
--
2013-12-24   H.J. Lu  

PR target/59587
* configure.ac (target_cpu_default): Defined to PROCESSOR_XXX
for i[34567]86 targets.
* configure: Regenerated.
* config/i386/i386.c (SUBTARGET32_DEFAULT_CPU): Use
TARGET_CPU_DEFAULT if it is defined.
(struct ptt): Add a field for processor name.
(processor_target_table): Sync with processor_type.  Add processor
names.
(cpu_names): Removed.
(ix86_option_override_internal): Default x_ix86_tune_string
to processor_target_table[TARGET_CPU_DEFAULT].name for 32-bit
if it is defined.  Otherwise, default to "generic".
(ix86_function_specific_print): Use processor_target_table
to print arch and tune names.
* config/i386/i386.h (TARGET_CPU_DEFAULT): Removed.
(target_cpu_default): Likewise.
(processor_type): Reordered.
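
As a rough illustration of the ChangeLog items "Add a field for
processor name" and "Sync with processor_type", the sketch below keeps
the name inside the table that is indexed by the processor enum, so a
lookup cannot use an index from a different enum's ordering.  It is a
standalone C11 sketch, not the i386.c code: the enum is shortened,
ptt_sketch and processor_name are hypothetical, and the align_loop
values are placeholders.

```c
#include <stddef.h>

/* Hypothetical, shortened processor enumeration.  */
enum processor_type
{
  PROCESSOR_GENERIC,
  PROCESSOR_I386,
  PROCESSOR_I486,
  PROCESSOR_PENTIUM,
  PROCESSOR_max
};

/* One table row: the name lives next to the tuning data.  */
struct ptt_sketch
{
  const char *name;	/* processor name */
  int align_loop;	/* placeholder tuning value */
};

/* Designated initializers tie each row to its enum value, so
   reordering the enum cannot silently shift the names.  */
static const struct ptt_sketch processor_table[] =
{
  [PROCESSOR_GENERIC] = { "generic", 16 },
  [PROCESSOR_I386]    = { "i386", 4 },
  [PROCESSOR_I486]    = { "i486", 16 },
  [PROCESSOR_PENTIUM] = { "pentium", 16 }
};

/* Fail the build if the table and the enum fall out of sync.  */
_Static_assert (sizeof processor_table / sizeof processor_table[0]
		== PROCESSOR_max,
		"processor_table must have one row per processor_type");

/* Look the name up with the same enum that indexes the table.  */
static const char *
processor_name (enum processor_type p)
{
  return processor_table[p].name;
}
```

The designated initializers and the static assertion are choices made
for this sketch only; the patch itself orders the table rows by hand.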

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ced6618..8d9059d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2359,7 +2359,11 @@ static enum calling_abi ix86_function_abi (const_tree);
 
 
 #ifndef SUBTARGET32_DEFAULT_CPU
-#define SUBTARGET32_DEFAULT_CPU "i386"
+# ifdef TARGET_CPU_DEFAULT
+#  define SUBTARGET32_DEFAULT_CPU processor_target_table[TARGET_CPU_DEFAULT].name
+# else
+#  define SUBTARGET32_DEFAULT_CPU "i386"
+# endif
 #endif
 
 /* Whether -mtune= or -march= were specified */
@@ -2375,6 +2379,7 @@ static tree ix86_veclibabi_acml (enum built_in_function, tree, tree);
 /* Processor target table, indexed by processor number */
 struct ptt
 {
+  const char *const name;  /* processor name  */
   const struct processor_costs *cost;  /* Processor costs */
   const int align_loop; 

Re: [PATCH] Fix PR58626, compute proper partition dependences in loop distribution

2013-12-24 Thread H.J. Lu
On Thu, Oct 24, 2013 at 7:33 AM, Richard Biener  wrote:
>
> This finally computes a valid partition ordering (or if not possible
> merge partitions again) in loop distribution.  This is the only
> place where data dependences are relevant, so we delay computing them
> (and also do not compute all N^2 dependences).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu with BOOT_CFLAGS
> -O2 -ftree-loop-distribution, BOOT_CFLAGS -O3 still running.
>
> Richard.
>
> 2013-10-24  Richard Biener  
>
> PR tree-optimization/58626
> * tree-loop-distribution.c (enum rdg_dep_type): Remove
> anti_dd, output_dd and input_dd.
> (struct rdg_edge): Remove level and relation members.
> (RDGE_LEVEL, RDGE_RELATION): Remove.
> (dot_rdg_1): Adjust.
> (create_rdg_edge_for_ddr): Remove.
> (create_rdg_edges_for_scalar): Adjust.
> (create_edge_for_control_dependence): Likewise.
> (create_rdg_edges): Split into ...
> (create_rdg_flow_edges): ... this
> (create_rdg_cd_edges): ... and this.
> (free_rdg): Adjust.
> (build_rdg): Likewise, do not compute data dependences or
> add edges for them.
> (pg_add_dependence_edges): New function.
> (pgcmp): Likewise.
> (distribute_loop): First apply all non-dependence based
> partition mergings.  Then compute dependences between partitions
> and merge and order partitions according to them.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

-- 
H.J.


Re: [PATCH] Add -mtune=ia support

2013-12-24 Thread H.J. Lu
On Fri, Dec 6, 2013 at 9:38 AM, H.J. Lu  wrote:
> On Fri, Dec 6, 2013 at 2:44 AM, Uros Bizjak  wrote:
>> On Fri, Dec 6, 2013 at 10:38 AM, Richard Biener
>>  wrote:
>>> On Thu, Dec 5, 2013 at 10:05 PM, H.J. Lu  wrote:
>>>> On Thu, Dec 5, 2013 at 1:02 PM, Patrick Marlier
>>>>  wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>> On 12/05/2013 07:22 PM, H.J. Lu wrote:
>>>>>>
>>>>>> We'd like to add a new -mtune=ia option for x86 to optimize for both
>>>>>> Haswell and Silvermont.  Currently, -mtune=ia is aliased to -mtune=slm.
>>>>>> We will improve it further for Haswell and Silvermont.  Later, we will
>>>>>> update it to future Intel processors.
>>>>>
>>>>>
>>>>> At first, 'ia' means to me Itanium, ie IA-64. I would personally prefer
>>>>> another name but maybe I am the only one to think that.
>>>>>
>>>>
>>>> "ia" stands for Intel Architecture.  It is the natural name for
>>>> this option.
>>>
>>> I think "ia" and the natural "aa" are too obfuscated.  Why didn't you
>>> choose simply "intel" here?  (will the next patch add -mtune=a as
>>> that's natural for "AMD"?)
>>
>> -mtune=intel indeed sounds better.
>>
>
> This is the patch I checked in.
>
> Thanks.
>
> --
> H.J.
> --
> 2013-12-06  H.J. Lu  
>
> * config.gcc: Change --with-cpu=ia to --with-cpu=intel.
>
> * config/i386/i386.c (cpu_names): Replace "ia" with "intel".
> (processor_alias_table): Likewise.
> (ix86_option_override_internal): Likewise.
> * config/i386/i386.h (target_cpu_default): Replace
> TARGET_CPU_DEFAULT_ia with TARGET_CPU_DEFAULT_intel.
>
> * doc/invoke.texi: Replace -mtune=ia with -mtune=intel.

> @@ -3632,8 +3632,8 @@ ix86_option_override_internal (bool main_args_p,
>if (!strcmp (opts->x_ix86_arch_string, "generic"))
>  error ("generic CPU can be used only for %stune=%s %s",
> prefix, suffix, sw);
> -  else if (!strcmp (ix86_arch_string, "ia"))
> -error ("ia CPU can be used only for %stune=%s %s",
> +  else if (!strcmp (ix86_arch_string, "intel"))
> +error ("intel CPU can be used only for %stune=%s %s",
> prefix, suffix, sw);
>else if (!strncmp (opts->x_ix86_arch_string, "generic", 7) || i == pta_size)
>  error ("bad value (%s) for %sarch=%s %s",

There is a typo.  I should check opts->x_ix86_arch_string, not
ix86_arch_string.  I am checking in this patch as obvious fix.
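
To show why the distinction matters, here is a minimal standalone
sketch.  It assumes the function can be handed an options structure
whose arch string differs from the global one; opts_sketch,
global_arch_string and both helper functions are hypothetical names,
not GCC code.

```c
#include <string.h>

/* Hypothetical stand-in for the options structure handed to the
   override function.  */
struct opts_sketch
{
  const char *arch_string;
};

/* Hypothetical global holding the command-line value.  */
static const char *global_arch_string = "i686";

/* Buggy variant: ignores OPTS and consults the global, so a value
   that exists only in OPTS is never seen.  */
static int
arch_is_intel_buggy (const struct opts_sketch *opts)
{
  (void) opts;
  return strcmp (global_arch_string, "intel") == 0;
}

/* Fixed variant: checks the structure that was passed in.  */
static int
arch_is_intel_fixed (const struct opts_sketch *opts)
{
  return strcmp (opts->arch_string, "intel") == 0;
}
```

With an opts_sketch whose arch_string is "intel" and the global still
set to "i686", only the fixed variant reports a match.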

-- 
H.J.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3e9214a..8c756a6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-12-24   H.J. Lu  
+
+* config/i386/i386.c (ix86_option_override_internal): Check
+opts->x_ix86_arch_string instead of ix86_arch_string.
+
 2013-12-24  Renlin Li  

 * config/arm/arm-protos.h (vfp_const_double_for_bits): Declare.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ced6618..f5d9ce5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3645,7 +3645,7 @@ ix86_option_override_internal (bool main_args_p,
   if (!strcmp (opts->x_ix86_arch_string, "generic"))
 error ("generic CPU can be used only for %stune=%s %s",
prefix, suffix, sw);
-  else if (!strcmp (ix86_arch_string, "intel"))
+  else if (!strcmp (opts->x_ix86_arch_string, "intel"))
 error ("intel CPU can be used only for %stune=%s %s",
prefix, suffix, sw);
   else if (!strncmp (opts->x_ix86_arch_string, "generic", 7) || i == pta_size)


Re: [RFA][PATCH][PR middle-end/59285] BARRIERS and merged blocks

2013-12-24 Thread Steven Bosscher
On Fri, Dec 20, 2013 at 7:30 PM, Jeff Law wrote:
>
> So here's an alternate approach to fixing 59285.  I still think attacking
> this in rtl_merge_blocks is better, but with nobody else chiming in to break
> the deadlock Steven and myself are in, I'll go with Steven's preferred
> solution (fix the callers in ifcvt.c).

I didn't intend to cause a deadlock, I only really want us to respect
the rules of the CFG, one of which is that you can't merge two basic
blocks that are not connected by an edge. I think this is a really
important invariant because it avoids accidental basic block merges
that are not correct.
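
The invariant can be stated as a predicate.  The sketch below uses a
toy CFG, not GCC's basic_block or edge types: bb_sketch, has_edge and
may_merge_p are hypothetical, and only the one rule named above is
checked.  A real merge routine has further preconditions.

```c
#include <stddef.h>

#define MAX_SUCCS 4

/* Toy basic block: an index plus a small array of successor blocks.  */
struct bb_sketch
{
  int index;
  size_t n_succs;
  struct bb_sketch *succs[MAX_SUCCS];
};

/* Return nonzero if there is an edge from A to B.  */
static int
has_edge (const struct bb_sketch *a, const struct bb_sketch *b)
{
  for (size_t i = 0; i < a->n_succs; i++)
    if (a->succs[i] == b)
      return 1;
  return 0;
}

/* The rule from the message: two blocks that are not connected by an
   edge must not be merged.  */
static int
may_merge_p (const struct bb_sketch *a, const struct bb_sketch *b)
{
  return has_edge (a, b);
}
```

A caller would test may_merge_p (a, b) before merging and give up on
the transformation when it returns zero.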


> If we were to return to a "fix rtl_merge_blocks" approach, I would revamp
> that patch to utilize the ideas in this one.  Namely that it's not just
> barriers between the merged blocks that are a problem.  In fact, that's a
> symptom of the problem.  Things have already gone wrong by that point.

What has gone wrong at that point, is that we'd be trying to merge two
basic blocks that have no control flow connection. The case of
builtin_unreachable (the only legitimate case for an empty basic block
without successors) is a special case. (This is the reason why I would
like us to have a special instruction or some kind of other marker for
builtin_unreachable...)
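
For context, a source-level example of where such a block can come
from.  The function classify is a hypothetical example, not taken from
the PR testcase; whether the compiler really leaves an empty block
behind depends on the target and the optimization passes that run.

```c
/* Return a code for the two values the caller promises to pass.  */
int
classify (int kind)
{
  if (kind == 0)
    return 10;
  if (kind == 1)
    return 20;

  /* The caller guarantees KIND is 0 or 1.  Control never gets here,
     so this path has no instructions and no successor.  */
  __builtin_unreachable ();
}
```

Calling classify with any value other than 0 or 1 is undefined, which
is why the path after the second test needs neither code nor an
outgoing edge.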


> Given blocks A & B that will be merged.  If A has > 1 successor and B has no
> successors, the combined block will always have at least 1 successor.
> However, the combined block will be followed by a BARRIER that must be
> removed.

Note this would happen automatically if there was an edge connecting
the blocks and a JUMP_INSN ending block B.

I propose we just punt on optimizing this case for now. For GCC 4.10
we should define what behavior should result from builtin_unreachable
(Should it trap? Can it be optimized away after a while to avoid these
unnecessary conditional jumps? ...) but for the moment it seems wrong
IMHO to only optimize this in the cond_exec case and to do so against
the rules of the control flow graph.

Something like the patch below, tested with a cross-compiler for arm-eabi.
What do you think of this approach?


PR middle-end/59285
* ifcvt.c (cond_exec_find_if_block): Do not try to if-convert an empty
basic block without successors due to builtin_unreachable.

Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 206195)
+++ ifcvt.c (working copy)
@@ -3495,6 +3495,13 @@ cond_exec_find_if_block (struct ce_if_block * ce_i
  Check for the last insn of the THEN block being an indirect jump, which
  is listed as not having any successors, but confuses the rest of the CE
  code processing.  ??? we should fix this in the future.  */
+  /* To make things worse: A block that ended in builtin_unreachable is
+ usually empty.  Perhaps we should optimize these away, but the semantics
+ of builtin_unreachable are not really clear about this, and if we do
+ optimize builtin_unreachable here (i.e. in the cond_exec path) we have
+ a strange difference of semantics of builtin_unreachable on cond_exec
+ and non-cond_exec targets.  Therefore, at least for now, don't merge
+ away a builtin_unreachable block.  */
   if (EDGE_COUNT (then_bb->succs) == 0)
 {
   if (single_pred_p (else_bb) && else_bb != EXIT_BLOCK_PTR_FOR_FN (cfun))
@@ -3511,6 +3518,12 @@ cond_exec_find_if_block (struct ce_if_block * ce_i
  && ! simplejump_p (last_insn))
return FALSE;

+ /* Empty block (no non-note insns anyway) only happens with
+builtin_unreachable.  merge_if_blocks isn't prepared for
+that.  See PR59285.  */
+ if (last_insn == BB_HEAD (then_bb))
+   return FALSE;
+
  join_bb = else_bb;
  else_bb = NULL_BLOCK;
}


PATCH: PR target/59588: Don't check/change generic/i686 tuning

2013-12-24 Thread H.J. Lu
Hi Honza,

We have combined generic32 and generic64 into generic.  There is no need
to check "generic" anymore.  Also we shouldn't change -mtune=i686 into
-mtune=generic.  OK to install?
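
A simplified reading of the intended behavior after this patch,
written as a standalone function.  resolve_tune is a hypothetical
helper, the deprecation warning for -mtune=x86-64 is left out, and the
"no -mtune=" path assumes tuning simply follows a non-NULL arch value.

```c
#include <stddef.h>
#include <string.h>

/* Return the tuning name that ends up in effect.  TUNE is the
   -mtune= value or NULL when it was not given; ARCH is the -march=
   value and must not be NULL in this sketch.  */
static const char *
resolve_tune (const char *tune, const char *arch)
{
  if (tune != NULL)
    {
      /* Cross compilers read -mtune=native as -mtune=generic.  */
      if (strcmp (tune, "native") == 0)
	return "generic";

      /* Everything else, including "i686" and "generic", is kept.  */
      return tune;
    }

  /* No -mtune=: tuning follows the arch, except that "x86-64" is
     mapped to a sensible tuning.  */
  if (strcmp (arch, "x86-64") == 0)
    return "generic";
  return arch;
}
```

Both new tests correspond to calls that return "i686" here:
resolve_tune (NULL, "i686") for -march=i686 and
resolve_tune ("i686", "i686") for -mtune=i686.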

Thanks.

H.J.
---
gcc/

2013-12-24   H.J. Lu  

PR target/59588
* config/i386/i386.c (ix86_option_override_internal): Don't
check generic tuning.  Don't change i686 tuning.

gcc/testsuite/

2013-12-24   H.J. Lu  

PR target/59588
* gcc.target/i386/pr59588-1.c: New file.
* gcc.target/i386/pr59588-2.c: Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f5d9ce5..b95a620 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3332,23 +3332,13 @@ ix86_option_override_internal (bool main_args_p,
   /* Need to check -mtune=generic first.  */
   if (opts->x_ix86_tune_string)
 {
-  if (!strcmp (opts->x_ix86_tune_string, "generic")
- || !strcmp (opts->x_ix86_tune_string, "i686")
- /* As special support for cross compilers we read -mtune=native
+  /* As special support for cross compilers we read -mtune=native
 as -mtune=generic.  With native compilers we won't see the
 -mtune=native, as it was changed by the driver.  */
- || !strcmp (opts->x_ix86_tune_string, "native"))
+  if (!strcmp (opts->x_ix86_tune_string, "native"))
{
  opts->x_ix86_tune_string = "generic";
}
-  /* If this call is for setting the option attribute, allow the
-generic that was previously set.  */
-  else if (!main_args_p
-  && !strcmp (opts->x_ix86_tune_string, "generic"))
-   ;
-  else if (!strncmp (opts->x_ix86_tune_string, "generic", 7))
-error ("bad value (%s) for %stune=%s %s",
-  opts->x_ix86_tune_string, prefix, suffix, sw);
   else if (!strcmp (opts->x_ix86_tune_string, "x86-64"))
 warning (OPT_Wdeprecated, "%stune=x86-64%s is deprecated; use "
  "%stune=k8%s or %stune=generic%s instead as appropriate",
@@ -3366,9 +3356,7 @@ ix86_option_override_internal (bool main_args_p,
 
   /* opts->x_ix86_tune_string is set to opts->x_ix86_arch_string
 or defaulted.  We need to use a sensible tune option.  */
-  if (!strcmp (opts->x_ix86_tune_string, "generic")
- || !strcmp (opts->x_ix86_tune_string, "x86-64")
- || !strcmp (opts->x_ix86_tune_string, "i686"))
+  if (!strcmp (opts->x_ix86_tune_string, "x86-64"))
{
  opts->x_ix86_tune_string = "generic";
}
@@ -3648,7 +3636,7 @@ ix86_option_override_internal (bool main_args_p,
   else if (!strcmp (opts->x_ix86_arch_string, "intel"))
 error ("intel CPU can be used only for %stune=%s %s",
   prefix, suffix, sw);
-  else if (!strncmp (opts->x_ix86_arch_string, "generic", 7) || i == pta_size)
+  else if (i == pta_size)
 error ("bad value (%s) for %sarch=%s %s",
   opts->x_ix86_arch_string, prefix, suffix, sw);
 
diff --git a/gcc/testsuite/gcc.target/i386/pr59588-1.c b/gcc/testsuite/gcc.target/i386/pr59588-1.c
new file mode 100644
index 0000000..391f2aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr59588-1.c
@@ -0,0 +1,7 @@
+/* { dg-do preprocess } */
+/* { dg-require-effective-target ia32 } */
+/* { dg-options "-march=i686" } */
+
+#ifndef __tune_i686__
+#error "__tune_i686__ should be defined for this test"
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr59588-2.c b/gcc/testsuite/gcc.target/i386/pr59588-2.c
new file mode 100644
index 0000000..bb5f12a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr59588-2.c
@@ -0,0 +1,7 @@
+/* { dg-do preprocess } */
+/* { dg-require-effective-target ia32 } */
+/* { dg-options "-mtune=i686" } */
+
+#ifndef __tune_i686__
+#error "__tune_i686__ should be defined for this test"
+#endif