Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-29 Thread Jakub Jelinek
On Wed, Jun 29, 2011 at 09:49:52AM +0200, Jan Hubicka wrote:
> > * config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.
> > 
> > * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
> > (TARGET_AVX128_OPTIMAL): New definition.
> > 
> > * config/i386/i386.c (initial_ix86_tune_features): Initialize
> > X86_TUNE_AVX128_OPTIMAL entry.
> > (ix86_option_override_internal): Enable the generation
> > of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
> > (ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
> > (ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
> 
> OK for mainline.  For 4.6 it is RM's call.

For 4.6 it is fine as well.

Jakub


Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-29 Thread Jan Hubicka
>   * config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.
> 
>   * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
>   (TARGET_AVX128_OPTIMAL): New definition.
> 
>   * config/i386/i386.c (initial_ix86_tune_features): Initialize
>   X86_TUNE_AVX128_OPTIMAL entry.
>   (ix86_option_override_internal): Enable the generation
>   of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
>   (ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
>   (ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.

OK for mainline.  For 4.6 it is RM's call.

Thanks,
Honza
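
The ChangeLog entries quoted above for ix86_option_override_internal and
ix86_preferred_simd_mode can be sketched in isolation. What follows is a
minimal, self-contained C model, not the GCC source: struct options,
apply_tuning_default, preferred_vector_bytes and the bit positions of the
two masks are made up for illustration. It shows the tuning default being
applied only when the user has not set the option explicitly, and the
vector width then following the flag.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the real masks are generated from i386.opt.  */
#define MASK_AVX            (1u << 0)
#define MASK_PREFER_AVX128  (1u << 1)

/* Simplified stand-in for the option state kept by the compiler.  */
struct options
{
  unsigned int target_flags;          /* Flags currently in effect.  */
  unsigned int target_flags_explicit; /* Bits the user set or cleared.  */
};

/* Turn on the 128-bit preference when the tuning target asks for it,
   unless the user already made an explicit choice for that bit.  */
static void
apply_tuning_default (struct options *opts, bool avx128_optimal)
{
  if (avx128_optimal
      && !(opts->target_flags_explicit & MASK_PREFER_AVX128))
    opts->target_flags |= MASK_PREFER_AVX128;
}

/* Vector width in bytes tried first for float data in this model:
   32 with AVX and no 128-bit preference, 16 otherwise.  */
static unsigned int
preferred_vector_bytes (const struct options *opts)
{
  bool avx = (opts->target_flags & MASK_AVX) != 0;
  bool prefer128 = (opts->target_flags & MASK_PREFER_AVX128) != 0;
  return (avx && !prefer128) ? 32 : 16;
}
```

In this model, a target with the tuning bit set ends up with 16-byte
vectors by default, while an explicit user choice recorded in
target_flags_explicit is left alone.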


RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-28 Thread Fang, Changpeng
Hi, 

 I re-attached the patch here. Can someone review it?

We would like to commit to trunk as well as the 4.6 branch.

Thanks,

Changpeng




From: Fang, Changpeng
Sent: Monday, June 27, 2011 5:42 PM
To: Fang, Changpeng; Jan Hubicka
Cc: Uros Bizjak; gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

Is this patch OK to commit to trunk?

Also, I would like to backport this patch to the GCC 4.6 branch. Do I have to send a
separate request, or can I use this one?

Thanks,

Changpeng





From: Fang, Changpeng
Sent: Friday, June 24, 2011 7:12 PM
To: Jan Hubicka
Cc: Uros Bizjak; gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

Hi,

I have no preference on how the tune feature is coded, but I agree with you that it is
better to put similar things together. I modified the code following your suggestion.

Is it OK to commit this modified patch?

Thanks,

Changpeng




From: Jan Hubicka [hubi...@ucw.cz]
Sent: Thursday, June 23, 2011 6:20 PM
To: Fang, Changpeng
Cc: Uros Bizjak; gcc-patches@gcc.gnu.org; hubi...@ucw.cz; rguent...@suse.de
Subject: Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

Hi,
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2128,6 +2128,9 @@ static const unsigned int x86_avx256_split_unaligned_load
>  static const unsigned int x86_avx256_split_unaligned_store
>    = m_COREI7 | m_BDVER1 | m_GENERIC;
>
> +static const unsigned int x86_prefer_avx128
> +  = m_BDVER1;

What is the reason for stuff like this not to go into initial_ix86_tune_features?
I sort of liked them better when they were individual flags, but having the target
tuning flags spread across multiple places seems unnecessary.

Honza

From a325395439a314f87b3c79a5b9ce79a6a976a710 Mon Sep 17 00:00:00 2001
From: Changpeng Fang 
Date: Wed, 22 Jun 2011 15:03:05 -0700
Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1

	* config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.

	* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
	(TARGET_AVX128_OPTIMAL): New definition.

	* config/i386/i386.c (initial_ix86_tune_features): Initialize
	X86_TUNE_AVX128_OPTIMAL entry.
	(ix86_option_override_internal): Enable the generation
	of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
	(ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
	(ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
---
 gcc/config/i386/i386.c   |   16
 gcc/config/i386/i386.h   |    4 +++-
 gcc/config/i386/i386.opt |    2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 014401b..b3434dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2089,7 +2089,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
   /* X86_SOFTARE_PREFETCHING_BENEFICIAL: Enable software prefetching
  at -O3.  For the moment, the prefetching seems badly tuned for Intel
  chips.  */
-  m_K6_GEODE | m_AMD_MULTIPLE
+  m_K6_GEODE | m_AMD_MULTIPLE,
+
+  /* X86_TUNE_AVX128_OPTIMAL: Enable 128-bit AVX instruction generation for
+ the auto-vectorizer.  */
+  m_BDVER1
 };
 
 /* Feature tests against the various architecture variations.  */
@@ -2623,6 +2627,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
 { "-mvzeroupper",			MASK_VZEROUPPER },
 { "-mavx256-split-unaligned-load",	MASK_AVX256_SPLIT_UNALIGNED_LOAD},
 { "-mavx256-split-unaligned-store",	MASK_AVX256_SPLIT_UNALIGNED_STORE},
+{ "-mprefer-avx128",		MASK_PREFER_AVX128},
   };
 
   const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2];
@@ -3672,6 +3677,9 @@ ix86_option_override_internal (bool main_args_p)
 	  if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
 	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
+	  /* Enable 128-bit AVX instruction generation for the auto-vectorizer.  */
+	  if (TARGET_AVX128_OPTIMAL && !(target_flags_explicit & MASK_PREFER_AVX128))
+	    target_flags |= MASK_PREFER_AVX128;
 	}
 }
   else 
@@ -34614,7 +34622,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
   return V2DImode;
 
 case SFmode:
-  if (TARGET_AVX && !flag_prefer_avx128)
+  if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V8SFmode;
   else
 	return V4SFmode;
@@ -34622,7 +34630,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
 case DFmode:
   if (!TARGET_VECTORIZE_DOUBLE)
 	return word_mode;
-  else if (TARGET_AVX && !flag_prefer_avx128)
+  else if (

RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-27 Thread Fang, Changpeng
Is this patch OK to commit to trunk?

Also, I would like to backport this patch to the GCC 4.6 branch. Do I have to send a
separate request, or can I use this one?

Thanks,

Changpeng









RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-24 Thread Fang, Changpeng
Hi,

I have no preference on how the tune feature is coded, but I agree with you that it is
better to put similar things together. I modified the code following your suggestion.

Is it OK to commit this modified patch?

Thanks,

Changpeng





From a325395439a314f87b3c79a5b9ce79a6a976a710 Mon Sep 17 00:00:00 2001
From: Changpeng Fang 
Date: Wed, 22 Jun 2011 15:03:05 -0700
Subject: [PATCH] Auto-vectorizer generates 128-bit AVX insns by default for bdver1

	* config/i386/i386.opt (mprefer-avx128): Redefine the flag as a Mask option.

	* config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_AVX128_OPTIMAL entry.
	(TARGET_AVX128_OPTIMAL): New definition.

	* config/i386/i386.c (initial_ix86_tune_features): Initialize
	X86_TUNE_AVX128_OPTIMAL entry.
	(ix86_option_override_internal): Enable the generation
	of the 128-bit instructions when TARGET_AVX128_OPTIMAL is set.
	(ix86_preferred_simd_mode): Use TARGET_PREFER_AVX128.
	(ix86_autovectorize_vector_sizes): Use TARGET_PREFER_AVX128.
---
 gcc/config/i386/i386.c   |   16
 gcc/config/i386/i386.h   |    4 +++-
 gcc/config/i386/i386.opt |    2 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 014401b..b3434dd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2089,7 +2089,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
   /* X86_SOFTARE_PREFETCHING_BENEFICIAL: Enable software prefetching
  at -O3.  For the moment, the prefetching seems badly tuned for Intel
  chips.  */
-  m_K6_GEODE | m_AMD_MULTIPLE
+  m_K6_GEODE | m_AMD_MULTIPLE,
+
+  /* X86_TUNE_AVX128_OPTIMAL: Enable 128-bit AVX instruction generation for
+ the auto-vectorizer.  */
+  m_BDVER1
 };
 
 /* Feature tests against the various architecture variations.  */
@@ -2623,6 +2627,7 @@ ix86_target_string (int isa, int flags, const char *arch, const char *tune,
 { "-mvzeroupper",			MASK_VZEROUPPER },
 { "-mavx256-split-unaligned-load",	MASK_AVX256_SPLIT_UNALIGNED_LOAD},
 { "-mavx256-split-unaligned-store",	MASK_AVX256_SPLIT_UNALIGNED_STORE},
+{ "-mprefer-avx128",		MASK_PREFER_AVX128},
   };
 
   const char *opts[ARRAY_SIZE (isa_opts) + ARRAY_SIZE (flag_opts) + 6][2];
@@ -3672,6 +3677,9 @@ ix86_option_override_internal (bool main_args_p)
 	  if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
 	      && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
 	    target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
+	  /* Enable 128-bit AVX instruction generation for the auto-vectorizer.  */
+	  if (TARGET_AVX128_OPTIMAL && !(target_flags_explicit & MASK_PREFER_AVX128))
+	    target_flags |= MASK_PREFER_AVX128;
 	}
 }
   else 
@@ -34614,7 +34622,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
   return V2DImode;
 
 case SFmode:
-  if (TARGET_AVX && !flag_prefer_avx128)
+  if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V8SFmode;
   else
 	return V4SFmode;
@@ -34622,7 +34630,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
 case DFmode:
   if (!TARGET_VECTORIZE_DOUBLE)
 	return word_mode;
-  else if (TARGET_AVX && !flag_prefer_avx128)
+  else if (TARGET_AVX && !TARGET_PREFER_AVX128)
 	return V4DFmode;
   else if (TARGET_SSE2)
 	return V2DFmode;
@@ -34639,7 +34647,7 @@ ix86_preferred_simd_mode (enum machine_mode mode)
 static unsigned int
 ix86_autovectorize_vector_sizes (void)
 {
-  return (TARGET_AVX && !flag_prefer_avx128) ? 32 | 16 : 0;
+  return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
 }
 
 /* Initialize the GCC target structure.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8badcbb..d9317ed 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -312,6 +312,7 @@ enum ix86_tune_indices {
   X86_TUNE_OPT_AGU,
   X86_TUNE_VECTORIZE_DOUBLE,
   X86_TUNE_SOFTWARE_PREFETCHING_BENEFICIAL,
+  X86_TUNE_AVX128_OPTIMAL,
 
   X86_TUNE_LAST
 };
@@ -410,7 +411,8 @@ extern unsigned char ix86_

Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-23 Thread Jan Hubicka
Hi,
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2128,6 +2128,9 @@ static const unsigned int x86_avx256_split_unaligned_load
>  static const unsigned int x86_avx256_split_unaligned_store
>    = m_COREI7 | m_BDVER1 | m_GENERIC;
>  
> +static const unsigned int x86_prefer_avx128
> +  = m_BDVER1;

What is the reason for stuff like this not to go into initial_ix86_tune_features?
I sort of liked them better when they were individual flags, but having the target
tuning flags spread across multiple places seems unnecessary.

Honza
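
As an illustration of the layout being asked for here, the following is a
simplified model of a tune-feature table: an enum supplies the index, and
each table entry is a mask of the processors for which that feature is on.
All names are hypothetical stand-ins (the reduced processor list, the
TUNE_* indices and tune_feature_p are not taken from GCC); only the idea
of keeping every tuning decision in one array comes from the thread.

```c
#include <stdbool.h>

/* Hypothetical, much shortened processor list.  */
enum cpu { CPU_GENERIC, CPU_COREI7, CPU_BDVER1, CPU_LAST };

/* One bit per processor, in the style of the m_* masks.  */
#define m_GENERIC (1u << CPU_GENERIC)
#define m_COREI7  (1u << CPU_COREI7)
#define m_BDVER1  (1u << CPU_BDVER1)

/* One index per tuning decision; TUNE_LAST sizes the table.  */
enum tune_index
{
  TUNE_SPLIT_UNALIGNED_STORE,
  TUNE_AVX128_OPTIMAL,
  TUNE_LAST
};

/* Each entry lists the processors for which the feature is enabled.
   The order must match enum tune_index.  */
static const unsigned int initial_tune_features[TUNE_LAST] = {
  /* TUNE_SPLIT_UNALIGNED_STORE  */
  m_COREI7 | m_BDVER1 | m_GENERIC,

  /* TUNE_AVX128_OPTIMAL  */
  m_BDVER1
};

/* Return true if FEATURE is enabled when tuning for CPU.  */
static bool
tune_feature_p (enum tune_index feature, enum cpu cpu)
{
  return (initial_tune_features[feature] & (1u << cpu)) != 0;
}
```

With a single table, adding a tuning decision means adding one enum value
and one initializer, rather than another standalone static variable.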


Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-23 Thread Jakub Jelinek
On Thu, Jun 23, 2011 at 03:41:01PM -0500, Fang, Changpeng wrote:
> This patch enables 128-bit AVX instruction generation for the auto-vectorizer on
> AMD Bulldozer machines. It gives an additional ~3% improvement on the Polyhedron 2005
> and CPU2006 floating-point programs.
> 
> The patch passed bootstrapping on an x86_64-unknown-linux-gnu system with
> Bulldozer cores.
> 
> Is it OK to commit to trunk and backport to 4.6 branch?

For the 4.6 branch, if it is approved for trunk, please wait until after 4.6.1 is
released.

Jakub
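
To see the effect on a small case, a loop of the following shape is the
kind of code the auto-vectorizer targets. This is only a sketch: the
function name saxpy and the file name below are made up, and the command
line assumes a GCC with AVX support, invoked roughly as
gcc -std=c99 -O3 -mavx -mprefer-avx128 -S saxpy.c.

```c
#include <stddef.h>

/* Compute y[i] = a * x[i] + y[i] for N elements.  The restrict
   qualifiers promise the compiler that X and Y do not overlap, which
   removes one common obstacle to vectorizing the loop.  */
void
saxpy (size_t n, float a, const float *restrict x, float *restrict y)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

With -mprefer-avx128 in effect, one would expect the vectorized loop to use
128-bit xmm registers, and 256-bit ymm registers without it. That is an
expectation rather than a measured result; inspecting the generated
assembly is the way to confirm it for a given compiler version.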