Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-27 Thread Martin Liška

On 10/27/22 11:09, Mayshao-oc wrote:

Hi Martin:
    Thanks for your patch. I have commented on the questions below.

Hi.

:)

Hello.

I noticed this patch set which is kind of related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.



And I have a couple of questions:



1) I noticed you drop AVX and F16C features for the newly added "lujiazui". Why 
do you need it?
  I would expect these features would be properly detected by cpuid?


Yes, these features can be detected by cpuid, and functionally they are fine,
but their performance needs further improvement, so we decided to drop them
for now and add them back when performance meets our expectations.



 I see. So theoretically you can increase costs of the corresponding insns and 
that could be dropped now?
 But I'm not a costing expert.


Hi.

One note: please try to send plain-text emails to GCC's mailing lists and not 
HTML version. Thanks!



I am new to GCC and have a lot to learn. About LTO and PGO, I have read some
of the material you and hubicka shared, and it has helped me a lot. Since this
is a performance issue, solving it with the cost model is a good idea, and
disabling AVX entirely seems like overkill. But a cost model needs appropriate
cost values; it is challenging to pick the numbers and more challenging to
justify them. Our current approach has a pitfall: it does not accommodate AVX
intrinsic functions (e.g. _mm256_loadu_pd). We could specify -mavx explicitly
to overcome this.


Sure, makes sense.

Martin






2) If you really need it, can you please test for me the attached patch? It 
should come up
  with a new function.


I have tested the patch; it's OK.



 Good, I'm going to install it.





3) Have question about:



else if (vendor == signature_CENTAUR_ebx && family < 0x07)
    cpu_model->__cpu_vendor = VENDOR_CENTAUR;
else if (vendor == signature_SHANGHAI_ebx
               || vendor == signature_CENTAUR_ebx)



Are there any signature_CENTAUR_ebx models with family == 0x7 ?
Similarly, are there any signature_SHANGHAI_ebx models with family < 0x7 ?


Yes, both cases exist in our products.



 Good. Then we are missing CPU feature detection for
 (vendor == signature_CENTAUR_ebx && family < 0x07),
 aka https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. But it's not worth
 it as it's legacy hardware, right?


Yes, for legacy hardware we need to keep it working correctly, but we don't
spend much time on performance tuning.


 Cheers,
 Martin





Thanks,

Martin

BR 
Mayshao






Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-27 Thread Mayshao-oc




>>
>> Hi Martin:
>> Thanks for your patch,  I comment the questions below.

>Hi.

>:)

>>
>>> Hello.
>>
>>> I noticed this patch set which is kind of related to 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.
>>
>>> And I have a couple of questions:
>>
>>>1) I noticed you drop AVX and F16C features for the newly added "lujiazui". 
>>>Why do you need it?
>>>  I would expect these features would be properly detected by cpuid?
>>
>> Yes, these features could be detected by cpuid, and in respect of 
>> functionality, these features are ok, but in respect of performance, these 
>> features need further improvement, so we decide to drop it now, and add 
>> these features back when performance meet our expectation.

> I see. So theoretically you can increase costs of the corresponding insns and 
> that could be dropped now?
> But I'm not a costing expert.

I am new to GCC and have a lot to learn. About LTO and PGO, I have read some
of the material you and hubicka shared, and it has helped me a lot. Since this
is a performance issue, solving it with the cost model is a good idea, and
disabling AVX entirely seems like overkill. But a cost model needs appropriate
cost values; it is challenging to pick the numbers and more challenging to
justify them. Our current approach has a pitfall: it does not accommodate AVX
intrinsic functions (e.g. _mm256_loadu_pd). We could specify -mavx explicitly
to overcome this.
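
For illustration, here is a minimal sketch of one way AVX intrinsics can still
be used when the selected -march does not enable AVX. It is an alternative to
passing -mavx for the whole translation unit, not the approach taken in the
patch: only one function is compiled with AVX via GCC's target attribute, and
it is guarded at run time with __builtin_cpu_supports. The names demo_sum4 and
demo_sum4_avx are hypothetical. Since the patch clears the AVX feature bit for
lujiazui, the run-time check would presumably select the scalar path there.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Only this function is compiled with AVX enabled, so the rest of the
   file can be built without -mavx.  */
__attribute__ ((target ("avx")))
static double
demo_sum4_avx (const double *p)
{
  double lanes[4];
  __m256d v = _mm256_loadu_pd (p);	/* unaligned load of 4 doubles */
  _mm256_storeu_pd (lanes, v);
  return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
#endif

/* Sum four doubles, using the AVX path only when the running CPU
   reports AVX support; otherwise fall back to plain scalar code.  */
static double
demo_sum4 (const double *p)
{
  assert (p != NULL);
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports ("avx"))
    return demo_sum4_avx (p);
#endif
  return p[0] + p[1] + p[2] + p[3];
}
```

Calling demo_sum4 on an array of four doubles returns their sum on either
path; the trade-off against -mavx is that the compiler will not use AVX
outside the annotated function.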

>>
>>> 2) If you really need it, can you please test for me the attached patch? It 
>>> should come up
>>>  with a new function.
>>
>> I have tested the patch, It's ok.

> Good, I'm going to install it.

>>
>>> 3) Have question about:
>>
>>> else if (vendor == signature_CENTAUR_ebx && family < 0x07)
>>>cpu_model->__cpu_vendor = VENDOR_CENTAUR;
>>> else if (vendor == signature_SHANGHAI_ebx
>>>   || vendor == signature_CENTAUR_ebx)
>>
>>> Are there any signature_CENTAUR_ebx models with family == 0x7 ?
>>> Similarly, are there any signature_SHANGHAI_ebx modes with family < 0x7 ?
>>
>> Yes, both cases exist in our products.

> Good. Then we miss a CPU features detection for (vendor == 
> signature_CENTAUR_ebx && family < 0x07)
> aka https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. But it's not worth 
> it as it's a legacy hardware,
> right?

Yes, for legacy hardware we need to keep it working correctly, but we don't
spend much time on performance tuning.

> Cheers,
> Martin

>>
>>> Thanks,
>> Martin
>>
>> BR
>> Mayshao



Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-26 Thread Martin Liška
On 10/26/22 11:06, Mayshao-oc wrote:
> 
> Hi Martin:
>     Thanks for your patch,  I comment the questions below.

Hi.

:)

> 
>> Hello.
> 
>> I noticed this patch set which is kind of related to 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.
> 
>> And I have a couple of questions:
> 
>>1) I noticed you drop AVX and F16C features for the newly added "lujiazui". 
>>Why do you need it?
>>  I would expect these features would be properly detected by cpuid?
> 
> Yes, these features could be detected by cpuid, and in respect of 
> functionality, these features are ok, but in respect of performance, these 
> features need further improvement, so we decide to drop it now, and add these 
> features back when performance meet our expectation.

I see. So theoretically you can increase costs of the corresponding insns and 
that could be dropped now?
But I'm not a costing expert.

> 
>> 2) If you really need it, can you please test for me the attached patch? It 
>> should come up
>>  with a new function.
> 
> I have tested the patch, It's ok. 

Good, I'm going to install it.

> 
>> 3) Have question about:
> 
>> else if (vendor == signature_CENTAUR_ebx && family < 0x07)
>>    cpu_model->__cpu_vendor = VENDOR_CENTAUR;
>> else if (vendor == signature_SHANGHAI_ebx
>>               || vendor == signature_CENTAUR_ebx)
> 
>> Are there any signature_CENTAUR_ebx models with family == 0x7 ?
>> Similarly, are there any signature_SHANGHAI_ebx modes with family < 0x7 ?
> 
> Yes, both cases exist in our products.

Good. Then we are missing CPU feature detection for
(vendor == signature_CENTAUR_ebx && family < 0x07),
aka https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. But it's not worth it
as it's legacy hardware, right?

Cheers,
Martin

> 
>> Thanks,
>> Martin
> 
> BR 
> Mayshao



Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-26 Thread Mayshao-oc


Hi Martin:
Thanks for your patch. I have commented on the questions below.

> Hello.

> I noticed this patch set which is kind of related to 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.

> And I have a couple of questions:

>1) I noticed you drop AVX and F16C features for the newly added "lujiazui". 
>Why do you need it?
>  I would expect these features would be properly detected by cpuid?

Yes, these features can be detected by cpuid, and functionally they are fine,
but their performance needs further improvement, so we decided to drop them
for now and add them back when performance meets our expectations.

> 2) If you really need it, can you please test for me the attached patch? It 
> should come up
>  with a new function.

I have tested the patch; it's OK.

> 3) Have question about:

> else if (vendor == signature_CENTAUR_ebx && family < 0x07)
>cpu_model->__cpu_vendor = VENDOR_CENTAUR;
> else if (vendor == signature_SHANGHAI_ebx
>   || vendor == signature_CENTAUR_ebx)

> Are there any signature_CENTAUR_ebx models with family == 0x7 ?
> Similarly, are there any signature_SHANGHAI_ebx modes with family < 0x7 ?

Yes, both cases exist in our products.

> Thanks,
> Martin

BR
Mayshao
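
As an illustration of the else-if chain quoted in question 3, here is a
minimal sketch of how the vendor word and the family combine. It assumes that
the second branch selects the Zhaoxin vendor (the quoted snippet does not show
its body) and that the "Shanghai" vendor ID is reported padded with two spaces
on each side; demo_pack4, demo_classify and the DEMO_VENDOR_* names are
hypothetical and are not GCC identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vendor tags for this sketch; GCC's enum processor_vendor
   uses its own names and values.  */
enum demo_vendor
{
  DEMO_VENDOR_OTHER,
  DEMO_VENDOR_CENTAUR,
  DEMO_VENDOR_ZHAOXIN
};

/* Pack the first four characters of S the way CPUID leaf 0 reports
   them in EBX: first character in the lowest byte.  */
static uint32_t
demo_pack4 (const char *s)
{
  assert (strlen (s) >= 4);
  return (uint32_t) (unsigned char) s[0]
	 | (uint32_t) (unsigned char) s[1] << 8
	 | (uint32_t) (unsigned char) s[2] << 16
	 | (uint32_t) (unsigned char) s[3] << 24;
}

/* Mirror the order of the checks in the quoted snippet: CentaurHauls
   parts below family 7 stay with the Centaur vendor, while Shanghai
   parts of any family and CentaurHauls family 7 and later take the
   second branch (assumed here to mean Zhaoxin).  */
static enum demo_vendor
demo_classify (uint32_t ebx, unsigned int family)
{
  const uint32_t centaur_ebx = demo_pack4 ("CentaurHauls");
  const uint32_t shanghai_ebx = demo_pack4 ("  Shanghai  ");

  if (ebx == centaur_ebx && family < 0x07)
    return DEMO_VENDOR_CENTAUR;
  else if (ebx == shanghai_ebx || ebx == centaur_ebx)
    return DEMO_VENDOR_ZHAOXIN;
  return DEMO_VENDOR_OTHER;
}
```

Under these assumptions, both cases raised in the question reach the second
branch: a CentaurHauls part with family 0x7 and a Shanghai part with a family
below 0x7.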


Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-24 Thread Martin Liška
Hello.

I noticed this patch set which is kind of related to 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.

And I have a couple of questions:

1) I noticed you drop AVX and F16C features for the newly added "lujiazui". Why 
do you need it?
   I would expect these features would be properly detected by cpuid?

2) If you really need it, can you please test for me the attached patch? It 
should come up
   with a new function.

3) I have a question about:

  else if (vendor == signature_CENTAUR_ebx && family < 0x07)
    cpu_model->__cpu_vendor = VENDOR_CENTAUR;
  else if (vendor == signature_SHANGHAI_ebx
           || vendor == signature_CENTAUR_ebx)

Are there any signature_CENTAUR_ebx models with family == 0x7 ?
Similarly, are there any signature_SHANGHAI_ebx models with family < 0x7 ?

Thanks,
Martin

From fa0bd99da8fd92b15a2cee55737a5962657da212 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 25 Oct 2022 06:28:44 +0200
Subject: [PATCH] i386: add reset_cpu_feature

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (has_cpu_feature): Add comment.
	(reset_cpu_feature): New.
	(get_zhaoxin_cpu): Use reset_cpu_feature.
---
 gcc/common/config/i386/cpuinfo.h | 38 +++-
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index d45451c5704..19ea7132fd5 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -76,6 +76,8 @@ has_cpu_feature (struct __processor_model *cpu_model,
     }
 }
 
+/* Save FEATURE to either CPU_MODEL or CPU_FEATURES2.  */
+
 static inline void
 set_cpu_feature (struct __processor_model *cpu_model,
 		 unsigned int *cpu_features2,
@@ -100,6 +102,32 @@ set_cpu_feature (struct __processor_model *cpu_model,
     }
 }
 
+/* Drop FEATURE from either CPU_MODEL or CPU_FEATURES2.  */
+
+static inline void
+reset_cpu_feature (struct __processor_model *cpu_model,
+		   unsigned int *cpu_features2,
+		   enum processor_features feature)
+{
+  unsigned index, offset;
+  unsigned f = feature;
+
+  if (f < 32)
+    {
+      /* The first 32 features.  */
+      cpu_model->__cpu_features[0] &= ~(1U << f);
+    }
+  else
+    {
+      /* The rest of features.  cpu_features2[i] contains features from
+	 (32 + i * 32) to (31 + 32 + i * 32), inclusively.  */
+      f -= 32;
+      index = f / 32;
+      offset = f % 32;
+      cpu_features2[index] &= ~(1U << offset);
+    }
+}
+
 /* Get the specific type of AMD CPU and return AMD CPU name.  Return
    NULL for unknown AMD CPU.  */
 
@@ -565,11 +593,11 @@ get_zhaoxin_cpu (struct __processor_model *cpu_model,
   cpu_model->__cpu_type = ZHAOXIN_FAM7H;
   if (model == 0x3b)
 	{
-	cpu = "lujiazui";
-	CHECK___builtin_cpu_is ("lujiazui");
-	cpu_model->__cpu_features[0] &= ~(1U <<(FEATURE_AVX & 31));
-	cpu_features2[0] &= ~(1U <<((FEATURE_F16C - 32) & 31));
-	cpu_model->__cpu_subtype = ZHAOXIN_FAM7H_LUJIAZUI;
+	  cpu = "lujiazui";
+	  CHECK___builtin_cpu_is ("lujiazui");
+	  reset_cpu_feature (cpu_model, cpu_features2, FEATURE_AVX);
+	  reset_cpu_feature (cpu_model, cpu_features2, FEATURE_F16C);
+	  cpu_model->__cpu_subtype = ZHAOXIN_FAM7H_LUJIAZUI;
 	}
   break;
 default:
-- 
2.38.0
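
To make the indexing in reset_cpu_feature easier to follow, here is a minimal
standalone sketch of the same bit layout: features 0..31 live in the first
word of the model, and every later feature F lives in word (F - 32) / 32 of
the second array at bit (F - 32) % 32. The demo_* types, the array size and
the feature numbers are hypothetical stand-ins, not the definitions from
i386-cpuinfo.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only.  The real enum and
   struct live in GCC's i386-cpuinfo.h; the numbers below are made up.  */
enum { DEMO_FEATURE_WORDS2 = 2 };	/* words in the features2 array */
enum demo_feature { DEMO_FEATURE_AVX = 10, DEMO_FEATURE_F16C = 41 };

struct demo_processor_model
{
  unsigned int cpu_features[1];	/* holds features 0..31 */
};

/* Return the word that stores feature F and set *MASK to its bit,
   following the same split as the patch.  */
static unsigned int *
demo_feature_word (struct demo_processor_model *model,
		   unsigned int *features2, unsigned int f,
		   unsigned int *mask)
{
  assert (f < 32 + 32 * DEMO_FEATURE_WORDS2);
  if (f < 32)
    {
      *mask = 1U << f;
      return &model->cpu_features[0];
    }
  f -= 32;
  *mask = 1U << (f % 32);
  return &features2[f / 32];
}

static void
demo_set_cpu_feature (struct demo_processor_model *model,
		      unsigned int *features2, enum demo_feature feature)
{
  unsigned int mask;
  unsigned int *word = demo_feature_word (model, features2, feature, &mask);
  *word |= mask;
}

/* Clear the bit, which is what the new helper does for AVX and F16C.  */
static void
demo_reset_cpu_feature (struct demo_processor_model *model,
			unsigned int *features2, enum demo_feature feature)
{
  unsigned int mask;
  unsigned int *word = demo_feature_word (model, features2, feature, &mask);
  *word &= ~mask;
}

static bool
demo_has_cpu_feature (struct demo_processor_model *model,
		      unsigned int *features2, enum demo_feature feature)
{
  unsigned int mask;
  unsigned int *word = demo_feature_word (model, features2, feature, &mask);
  return (*word & mask) != 0;
}
```

Compared with the removed lines, which hard-coded __cpu_features[0] and
cpu_features2[0], computing the word index from the feature number keeps the
helper correct for a feature that falls into a later word.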



Re: [PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-22 Thread Uros Bizjak via Gcc-patches
On Tue, May 17, 2022 at 11:34 AM Mayshao-oc  wrote:
>
> > On Tue, May 17, 2022 at 5:15 AM mayshao  wrote:
> >> Hi Uros:
> >> This patch fix Zhaoxin CPU vendor ID detection problem and add 
> >> zhaoxin "lujiazui" processor support.
> >> Currently gcc can't recognize Zhaoxin CPU(vendor ID "CentaurHauls" 
> >> and "Shanghai") if user use -march=native option, which is confusing for 
> >> users.
> >> This patch enables -march=native in zhaoxin family 7th processor 
> >> and -march/-mtune=lujiazui, costs and tunning are set according to the 
> >> characteristics of the processor.We add a new md file to describe lujiazui 
> >> pipeline.
> >> Testing:
> >> Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
> >> Ok for master?
> >> Background:
> >> Related Zhaoxin linux kernel patch can be found at:
> >> https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/
> >> Related Zhaoxin glibc patch can be found at:
> >> https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193
> >> gcc/ChangeLog:
> > The entries below are suspiciously empty - please fill in the details.
>
> Sorry for forgetting this. Update patch. Thanks.
>
> * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
> the specific type of Zhaoxin CPU, and return Zhaoxin CPU name.
> (cpu_indicator_init): Handle Zhaoxin processors.
> * common/config/i386/i386-common.cc: Add lujiazui.
> * common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
> VENDOR_ZHAOXIN.
> (enum processor_types): Add ZHAOXIN_FAM7H.
> (enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
> * config.gcc: Add lujiazui.
> * config/i386/cpuid.h (signature_SHANGHAI_ebx): Add
> Signatures for zhaoxin
> (signature_SHANGHAI_ecx): Ditto.
> (signature_SHANGHAI_edx): Ditto.
> * config/i386/driver-i386.cc (host_detect_local_cpu): Let
> -march=native recognize lujiazui processors.
> * config/i386/i386-c.cc (ix86_target_macros_internal): Add lujiazui.
> * config/i386/i386-options.cc (m_LUJIAZUI): New_definition.
> * config/i386/i386.h (enum processor_type): Ditto.
> * config/i386/i386.md: Add lujiazui.
> * config/i386/x86-tune-costs.h (struct processor_costs): Add
> lujiazui costs.
> * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
> (ix86_adjust_cost): Ditto.
> * config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Add lujiazui tunnings.
> (X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
> (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto.
> (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto.
> (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto.
> (X86_TUNE_MOVX): Ditto.
> (X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
> (X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto.
> (X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto.
> (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto.
> (X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto.
> (X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
> (X86_TUNE_USE_LEAVE): Ditto.
> (X86_TUNE_PUSH_MEMORY): Ditto.
> (X86_TUNE_LCP_STALL): Ditto.
> (X86_TUNE_USE_INCDEC): Ditto.
> (X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
> (X86_TUNE_OPT_AGU): Ditto.
> (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto.
> (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto.
> (X86_TUNE_USE_SAHF): Ditto.
> (X86_TUNE_USE_BT): Ditto.
> (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto.
> (X86_TUNE_ONE_IF_CONV_INSN): Ditto.
> (X86_TUNE_AVOID_MFENCE): Ditto.
> (X86_TUNE_EXPAND_ABS): Ditto.
> (X86_TUNE_USE_SIMODE_FIOP): Ditto.
> (X86_TUNE_USE_FFREEP): Ditto.
> (X86_TUNE_EXT_80387_CONSTANTS): Ditto.
> (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
> (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
> (X86_TUNE_SSE_TYPELESS_STORES): Ditto.
> (X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto.
> * doc/extend.texi: Add details about lujiazui.
> * doc/invoke.texi: Add details about lujiazui.
> * config/i386/lujiazui.md: Introduce lujiazui cpu and include new md file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/funcspec-56.inc: Test -arch=lujiauzi and -tune=lujiazui.
> * g++.target/i386/mv32.C: Ditto.

The patch looks good to me (however, I didn't review the .md file in detail).

BTW: The approval applies to the technical part, the legal part should
be handled by someone else - if it is still needed after the project
was changed from copyright assignment style to the developer's
certificate of origin style.

Also, can you handle the commit by yourself, or should I do it for you?

Thanks,
Uros.

>
> >> * common/config/i386/cpuinfo.h (get_zhaoxin_cpu):
> >> (cpu_indicator_init):
> >> * common/config/i386/i386-common.cc:
> >> * common/config/i386/i386-cpuinfo.h (enum processor_vendor):
> >> (enum processor_types):
> >> (enum processor_subtypes):
> >> * config.gcc:
> >> * config/i386/cpuid.h (signature_SHANGHAI_ebx):
> >> (signature_SHANGHAI_ecx):
> >> (signature_SHANGHAI_edx):
> >> * config/i386/driver-i386.cc (host_detect_local_cpu):
> >> * config/i386/i386-c.cc (ix86_target_

Re: [PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-17 Thread Mayshao-oc
> On Tue, May 17, 2022 at 5:15 AM mayshao  wrote:
>> Hi Uros:
>> This patch fix Zhaoxin CPU vendor ID detection problem and add 
>> zhaoxin "lujiazui" processor support.
>> Currently gcc can't recognize Zhaoxin CPU(vendor ID "CentaurHauls" 
>> and "Shanghai") if user use -march=native option, which is confusing for 
>> users.
>> This patch enables -march=native in zhaoxin family 7th processor and 
>> -march/-mtune=lujiazui, costs and tunning are set according to the 
>> characteristics of the processor.We add a new md file to describe lujiazui 
>> pipeline.
>> Testing:
>> Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
>> Ok for master?
>> Background:
>> Related Zhaoxin linux kernel patch can be found at:
>> https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/
>> Related Zhaoxin glibc patch can be found at:
>> https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193
>> gcc/ChangeLog:
> The entries below are suspiciously empty - please fill in the details.

Sorry for forgetting this. I have updated the patch. Thanks.

* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
the specific type of Zhaoxin CPU, and return Zhaoxin CPU name.
(cpu_indicator_init): Handle Zhaoxin processors.
* common/config/i386/i386-common.cc: Add lujiazui.
* common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
VENDOR_ZHAOXIN.
(enum processor_types): Add ZHAOXIN_FAM7H.
(enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
* config.gcc: Add lujiazui.
* config/i386/cpuid.h (signature_SHANGHAI_ebx): Add
signatures for Zhaoxin.
(signature_SHANGHAI_ecx): Ditto.
(signature_SHANGHAI_edx): Ditto.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
-march=native recognize lujiazui processors.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add lujiazui.
* config/i386/i386-options.cc (m_LUJIAZUI): New definition.
* config/i386/i386.h (enum processor_type): Ditto.
* config/i386/i386.md: Add lujiazui.
* config/i386/x86-tune-costs.h (struct processor_costs): Add
lujiazui costs.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
(ix86_adjust_cost): Ditto.
* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Add lujiazui tunings.
(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto.
(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto.
(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto.
(X86_TUNE_MOVX): Ditto.
(X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto.
(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto.
(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto.
(X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto.
(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
(X86_TUNE_USE_LEAVE): Ditto.
(X86_TUNE_PUSH_MEMORY): Ditto.
(X86_TUNE_LCP_STALL): Ditto.
(X86_TUNE_USE_INCDEC): Ditto.
(X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
(X86_TUNE_OPT_AGU): Ditto.
(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto.
(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto.
(X86_TUNE_USE_SAHF): Ditto.
(X86_TUNE_USE_BT): Ditto.
(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto.
(X86_TUNE_ONE_IF_CONV_INSN): Ditto.
(X86_TUNE_AVOID_MFENCE): Ditto.
(X86_TUNE_EXPAND_ABS): Ditto.
(X86_TUNE_USE_SIMODE_FIOP): Ditto.
(X86_TUNE_USE_FFREEP): Ditto.
(X86_TUNE_EXT_80387_CONSTANTS): Ditto.
(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
(X86_TUNE_SSE_TYPELESS_STORES): Ditto.
(X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto.
* doc/extend.texi: Add details about lujiazui.
* doc/invoke.texi: Add details about lujiazui.
* config/i386/lujiazui.md: Introduce lujiazui cpu and include new md file.

gcc/testsuite/ChangeLog:

* gcc.target/i386/funcspec-56.inc: Test -arch=lujiazui and -tune=lujiazui.
* g++.target/i386/mv32.C: Ditto.

>> * common/config/i386/cpuinfo.h (get_zhaoxin_cpu):
>> (cpu_indicator_init):
>> * common/config/i386/i386-common.cc:
>> * common/config/i386/i386-cpuinfo.h (enum processor_vendor):
>> (enum processor_types):
>> (enum processor_subtypes):
>> * config.gcc:
>> * config/i386/cpuid.h (signature_SHANGHAI_ebx):
>> (signature_SHANGHAI_ecx):
>> (signature_SHANGHAI_edx):
>> * config/i386/driver-i386.cc (host_detect_local_cpu):
>> * config/i386/i386-c.cc (ix86_target_macros_internal):
>> * config/i386/i386-options.cc (m_LUJIAZUI):
>> * config/i386/i386.h (enum processor_type):
>> * config/i386/i386.md:
>> * config/i386/x86-tune-costs.h (struct processor_costs):
>> * config/i386/x86-tune-sched.cc (ix86_issue_rate):
>> (ix86_adjust_cost):
>> * config/i386/x86-tune.def (X86_TUNE_SCHEDULE):
>> (X86_TUNE_PARTIAL_REG_DEPENDENCY):
>> (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY):
>> (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY):
>> (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY):
>> (X86_TUNE_MOVX):
>> (X86_TUNE_MEMOR

Re: [PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-16 Thread Richard Biener via Gcc-patches
On Tue, May 17, 2022 at 5:15 AM mayshao  wrote:
>
> Hi Uros:
> This patch fix Zhaoxin CPU vendor ID detection problem and add 
> zhaoxin "lujiazui" processor support.
> Currently gcc can't recognize Zhaoxin CPU(vendor ID "CentaurHauls" 
> and "Shanghai") if user use -march=native option, which is confusing for 
> users.
> This patch enables -march=native in zhaoxin family 7th processor and 
> -march/-mtune=lujiazui, costs and tunning are set according to the 
> characteristics of the processor.We add a new md file to describe lujiazui 
> pipeline.
>
> Testing:
> Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
>
> Ok for master?
>
> Background:
> Related Zhaoxin linux kernel patch can be found at:
> https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/
>
> Related Zhaoxin glibc patch can be found at:
> https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193
>
> gcc/ChangeLog:

The entries below are suspiciously empty - please fill in the details.

> * common/config/i386/cpuinfo.h (get_zhaoxin_cpu):
> (cpu_indicator_init):
> * common/config/i386/i386-common.cc:
> * common/config/i386/i386-cpuinfo.h (enum processor_vendor):
> (enum processor_types):
> (enum processor_subtypes):
> * config.gcc:
> * config/i386/cpuid.h (signature_SHANGHAI_ebx):
> (signature_SHANGHAI_ecx):
> (signature_SHANGHAI_edx):
> * config/i386/driver-i386.cc (host_detect_local_cpu):
> * config/i386/i386-c.cc (ix86_target_macros_internal):
> * config/i386/i386-options.cc (m_LUJIAZUI):
> * config/i386/i386.h (enum processor_type):
> * config/i386/i386.md:
> * config/i386/x86-tune-costs.h (struct processor_costs):
> * config/i386/x86-tune-sched.cc (ix86_issue_rate):
> (ix86_adjust_cost):
> * config/i386/x86-tune.def (X86_TUNE_SCHEDULE):
> (X86_TUNE_PARTIAL_REG_DEPENDENCY):
> (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY):
> (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY):
> (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY):
> (X86_TUNE_MOVX):
> (X86_TUNE_MEMORY_MISMATCH_STALL):
> (X86_TUNE_FUSE_CMP_AND_BRANCH_32):
> (X86_TUNE_FUSE_CMP_AND_BRANCH_64):
> (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS):
> (X86_TUNE_FUSE_ALU_AND_BRANCH):
> (X86_TUNE_ACCUMULATE_OUTGOING_ARGS):
> (X86_TUNE_USE_LEAVE):
> (X86_TUNE_PUSH_MEMORY):
> (X86_TUNE_LCP_STALL):
> (X86_TUNE_USE_INCDEC):
> (X86_TUNE_INTEGER_DFMODE_MOVES):
> (X86_TUNE_OPT_AGU):
> (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
> (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES):
> (X86_TUNE_USE_SAHF):
> (X86_TUNE_USE_BT):
> (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI):
> (X86_TUNE_ONE_IF_CONV_INSN):
> (X86_TUNE_AVOID_MFENCE):
> (X86_TUNE_EXPAND_ABS):
> (X86_TUNE_USE_SIMODE_FIOP):
> (X86_TUNE_USE_FFREEP):
> (X86_TUNE_EXT_80387_CONSTANTS):
> (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL):
> (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL):
> (X86_TUNE_SSE_TYPELESS_STORES):
> (X86_TUNE_SSE_LOAD0_BY_PXOR):
> * doc/extend.texi:
> * doc/invoke.texi:
> * config/i386/lujiazui.md: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/funcspec-56.inc:
> * g++.target/i386/mv32.C: New test.
> ---
>  gcc/common/config/i386/cpuinfo.h  |  54 +-
>  gcc/common/config/i386/i386-common.cc |   8 +
>  gcc/common/config/i386/i386-cpuinfo.h |   3 +
>  gcc/config.gcc|  10 +-
>  gcc/config/i386/cpuid.h   |   4 +
>  gcc/config/i386/driver-i386.cc|  20 +-
>  gcc/config/i386/i386-c.cc |   7 +
>  gcc/config/i386/i386-options.cc   |   3 +
>  gcc/config/i386/i386.h|   1 +
>  gcc/config/i386/i386.md   |   5 +-
>  gcc/config/i386/lujiazui.md   | 844 ++
>  gcc/config/i386/x86-tune-costs.h  | 115 +++
>  gcc/config/i386/x86-tune-sched.cc |   2 +
>  gcc/config/i386/x86-tune.def  |  89 +-
>  gcc/doc/extend.texi   |   3 +
>  gcc/doc/invoke.texi   |   5 +
>  gcc/testsuite/g++.target/i386/mv32.C  |  31 +
>  gcc/testsuite/gcc.target/i386/funcspec-56.inc |   2 +
>  18 files changed, 1159 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/config/i386/lujiazui.md
>  create mode 100644 gcc/testsuite/g++.target/i386/mv32.C
>
> diff --git a/gcc/common/config/i386/cpuinfo.h 
> b/gcc/common/config/i386/cpuinfo.h
> index 6d6171f4555..adc02bc3d98 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -526,6 +526,39 @@ get_i

[PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-16 Thread mayshao
Hi Uros:
This patch fixes the Zhaoxin CPU vendor ID detection problem and adds Zhaoxin
"lujiazui" processor support.
Currently GCC can't recognize Zhaoxin CPUs (vendor ID "CentaurHauls" and
"Shanghai") if the user passes -march=native, which is confusing for users.
This patch enables -march=native on Zhaoxin family 7 processors and adds
-march/-mtune=lujiazui. Costs and tunings are set according to the
characteristics of the processor. We add a new md file to describe the
lujiazui pipeline.

Testing:
Bootstrap is ok, and no regressions for i386/x86-64 testsuite.

Ok for master?

Background:
Related Zhaoxin linux kernel patch can be found at:
https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/

Related Zhaoxin glibc patch can be found at:
https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_zhaoxin_cpu):
(cpu_indicator_init):
* common/config/i386/i386-common.cc:
* common/config/i386/i386-cpuinfo.h (enum processor_vendor):
(enum processor_types):
(enum processor_subtypes):
* config.gcc:
* config/i386/cpuid.h (signature_SHANGHAI_ebx):
(signature_SHANGHAI_ecx):
(signature_SHANGHAI_edx):
* config/i386/driver-i386.cc (host_detect_local_cpu):
* config/i386/i386-c.cc (ix86_target_macros_internal):
* config/i386/i386-options.cc (m_LUJIAZUI):
* config/i386/i386.h (enum processor_type):
* config/i386/i386.md:
* config/i386/x86-tune-costs.h (struct processor_costs):
* config/i386/x86-tune-sched.cc (ix86_issue_rate):
(ix86_adjust_cost):
* config/i386/x86-tune.def (X86_TUNE_SCHEDULE):
(X86_TUNE_PARTIAL_REG_DEPENDENCY):
(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY):
(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY):
(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY):
(X86_TUNE_MOVX):
(X86_TUNE_MEMORY_MISMATCH_STALL):
(X86_TUNE_FUSE_CMP_AND_BRANCH_32):
(X86_TUNE_FUSE_CMP_AND_BRANCH_64):
(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS):
(X86_TUNE_FUSE_ALU_AND_BRANCH):
(X86_TUNE_ACCUMULATE_OUTGOING_ARGS):
(X86_TUNE_USE_LEAVE):
(X86_TUNE_PUSH_MEMORY):
(X86_TUNE_LCP_STALL):
(X86_TUNE_USE_INCDEC):
(X86_TUNE_INTEGER_DFMODE_MOVES):
(X86_TUNE_OPT_AGU):
(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES):
(X86_TUNE_USE_SAHF):
(X86_TUNE_USE_BT):
(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI):
(X86_TUNE_ONE_IF_CONV_INSN):
(X86_TUNE_AVOID_MFENCE):
(X86_TUNE_EXPAND_ABS):
(X86_TUNE_USE_SIMODE_FIOP):
(X86_TUNE_USE_FFREEP):
(X86_TUNE_EXT_80387_CONSTANTS):
(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL):
(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL):
(X86_TUNE_SSE_TYPELESS_STORES):
(X86_TUNE_SSE_LOAD0_BY_PXOR):
* doc/extend.texi:
* doc/invoke.texi:
* config/i386/lujiazui.md: New file.

gcc/testsuite/ChangeLog:

* gcc.target/i386/funcspec-56.inc:
* g++.target/i386/mv32.C: New test.
---
 gcc/common/config/i386/cpuinfo.h              |  54 +-
 gcc/common/config/i386/i386-common.cc         |   8 +
 gcc/common/config/i386/i386-cpuinfo.h         |   3 +
 gcc/config.gcc                                |  10 +-
 gcc/config/i386/cpuid.h                       |   4 +
 gcc/config/i386/driver-i386.cc                |  20 +-
 gcc/config/i386/i386-c.cc                     |   7 +
 gcc/config/i386/i386-options.cc               |   3 +
 gcc/config/i386/i386.h                        |   1 +
 gcc/config/i386/i386.md                       |   5 +-
 gcc/config/i386/lujiazui.md                   | 844 ++
 gcc/config/i386/x86-tune-costs.h              | 115 +++
 gcc/config/i386/x86-tune-sched.cc             |   2 +
 gcc/config/i386/x86-tune.def                  |  89 +-
 gcc/doc/extend.texi                           |   3 +
 gcc/doc/invoke.texi                           |   5 +
 gcc/testsuite/g++.target/i386/mv32.C          |  31 +
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |   2 +
 18 files changed, 1159 insertions(+), 47 deletions(-)
 create mode 100644 gcc/config/i386/lujiazui.md
 create mode 100644 gcc/testsuite/g++.target/i386/mv32.C

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 6d6171f4555..adc02bc3d98 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -526,6 +526,39 @@ get_intel_cpu (struct __processor_model *cpu_model,
   return cpu;
 }
 
+/* Get the specific type of ZHAOXIN CPU and return ZHAOXIN CPU name.
+   Return NULL for unknown ZHAOXIN CPU.  */
+
+static inline const char *
+get_zhaoxin_cpu (struct __processor_model *cpu_model,
+   struct __processor_model2 *cpu_model2,
+   un

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-28 Thread Mayshao-oc
On Sun, Mar 27, 2022 at 5:15 PM Uros Bizjak  wrote:
> On Fri, Mar 25, 2022 at 3:08 AM MayShao  wrote:
> >
> > Hi Uros,
> >
> > This patch fix Zhaoxin CPU Vendor ID detection problem
> > and add Zhaoxin "lujiazui" processor support and tuning.
> >
> > Currently gcc can't recognize Zhaoxin CPU (Vendor ID "CentaurHauls" and 
> > "Shanghai")
> > and wrongly identify Zhaoxin "lujiazui" as Intel core2 or i386, which is 
> > confusing for users.
> >
> > This patch enables -march/-mtune=lujiazui. Lujiazui is a Zhaoxin family 7
> > processor.
> > Costs and tunings are set according to the characteristics of the processor.
> > We add a new md file to describe lujiazui pipeline.
> >
> > Testing :
> > Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
> >
> > OK for master?
>
> This patch is not a bugfix, so it will have to wait for a next stage 1
> to reopen.
>
> Uros.
>
Yes, thanks for the reminder.
Then please help review this patch again
when the next stage 1 opens.
I have contributed to glibc before; do I need to
re-sign the FSF copyright assignment for this patch?


May
> >
> > Background:
> > Related Zhaoxin linux kernel patch can be found at:
> >  https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/
> >
> > Related Zhaoxin glibc patch can be found at:
> >  
> > https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193
> >
> > gcc/ChangeLog:
> >
> >* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
> >the cpu type of ZHAOXIN processors.
> >(cpu_indicator_init): Handle ZHAOXIN processors.
> >* common/config/i386/i386-common.cc: Add lujiazui.
> >* common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
> >VENDOR_ZHAOXIN.
> >(enum processor_types): Add ZHAOXIN_FAM7H.
> >(enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
> >* config.gcc: Add -march=lujiazui.
> >* config/i386/cpuid.h (signature_SHANGHAI_ebx): New definition
> >for ZHAOXIN.
> >(signature_SHANGHAI_ecx): Likewise.
> >(signature_SHANGHAI_edx): Likewise.
> >* config/i386/driver-i386.cc (host_detect_local_cpu): Let
> >-march=native recognize lujiazui processor.
> >* config/i386/i386-c.cc (ix86_target_macros_internal): Add
> >lujiazui def_or_undef.
> >* config/i386/i386-options.cc (m_LUJIAZUI): New definition.
> >* config/i386/i386.h (enum processor_type): Add PROCESSOR_LUJIAZUI.
> >* config/i386/i386.md: Add lujiazui cpu and include new md file.
> >* config/i386/x86-tune-costs.h (struct processor_costs): Add
> >lujiazui_cost.
> >* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
> >(ix86_adjust_cost): Likewise.
> >* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Enable for lujiazui.
> >(X86_TUNE_PARTIAL_REG_DEPENDENCY): Likewise.
> >(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Likewise.
> >(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Likewise.
> >(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
> >(X86_TUNE_MOVX): Likewise.
> >(X86_TUNE_MEMORY_MISMATCH_STALL): Likewise.
> >(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Likewise.
> >(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Likewise.
> >(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Likewise.
> >(X86_TUNE_FUSE_ALU_AND_BRANCH): Likewise.
> >(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Likewise.
> >(X86_TUNE_USE_LEAVE): Likewise.
> >(X86_TUNE_PUSH_MEMORY): Likewise.
> >(X86_TUNE_LCP_STALL): Likewise.
> >(X86_TUNE_USE_INCDEC): Likewise.
> >(X86_TUNE_INTEGER_DFMODE_MOVES): Likewise.
> >(X86_TUNE_OPT_AGU): Likewise.
> >(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Likewise.
> >(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
> >(X86_TUNE_USE_SAHF): Likewise.
> >(X86_TUNE_USE_BT): Likewise.
> >(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
> >(X86_TUNE_ONE_IF_CONV_INSN): Likewise.
> >(X86_TUNE_AVOID_MFENCE): Likewise.
> >(X86_TUNE_EXPAND_ABS): Likewise.
> >(X86_TUNE_USE_SIMODE_FIOP): Likewise.
> >(X86_TUNE_USE_FFREEP): Likewise.
> >(X86_TUNE_EXT_80387_CONSTANTS): Likewise.
> >(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Likewise.
> >(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
> >(X86_TUNE_SSE_TYPELESS_STORES): Likewise.
> >(X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
> >(X86_TUNE_USE_GATHER): Likewise.
> >* doc/extend.texi: Add lujiazui.
> >* doc/invoke.texi: Add details about lujiazui.
> >* config/i386/lujiazui.md: New file for describing lujiazui pipeline.
> >
> > gcc/testsuite/ChangeLog:
> >
> >* gcc.target/i386/funcspec-56.inc: Handle new march.
> >* g++.target/i386/mv31.C: New test for -march=lujiazui.
> > ---
> >  gcc/common/c

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-27 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 25, 2022 at 3:08 AM MayShao wrote:
>
> Hi Uros,
>
> This patch fixes the Zhaoxin CPU vendor ID detection problem
> and adds support and tuning for the Zhaoxin "lujiazui" processor.
>
> Currently GCC can't recognize Zhaoxin CPUs (vendor IDs "CentaurHauls" and
> "Shanghai") and wrongly identifies Zhaoxin "lujiazui" as Intel core2 or
> i386, which is confusing for users.
>
> This patch enables -march/-mtune=lujiazui. Lujiazui is a Zhaoxin family 7
> processor.
> Costs and tunings are set according to the characteristics of the processor.
> We add a new md file to describe the lujiazui pipeline.
>
> Testing:
> Bootstrap is OK, with no regressions in the i386/x86-64 testsuite.
>
> OK for master?

This patch is not a bugfix, so it will have to wait for the next stage 1
to open.

Uros.

>
> Background:
> Related Zhaoxin linux kernel patch can be found at:
> https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/
>
> Related Zhaoxin glibc patch can be found at:
> https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193
>
> gcc/ChangeLog:
>
>* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
>the cpu type of ZHAOXIN processors.
>(cpu_indicator_init): Handle ZHAOXIN processors.
>* common/config/i386/i386-common.cc: Add lujiazui.
>* common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
>VENDOR_ZHAOXIN.
>(enum processor_types): Add ZHAOXIN_FAM7H.
>(enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
>* config.gcc: Add -march=lujiazui.
>* config/i386/cpuid.h (signature_SHANGHAI_ebx): New definition
>for ZHAOXIN.
>(signature_SHANGHAI_ecx): Likewise.
>(signature_SHANGHAI_edx): Likewise.
>* config/i386/driver-i386.cc (host_detect_local_cpu): Let
>-march=native recognize lujiazui processor.
>* config/i386/i386-c.cc (ix86_target_macros_internal): Add
>lujiazui def_or_undef.
>* config/i386/i386-options.cc (m_LUJIAZUI): New definition.
>* config/i386/i386.h (enum processor_type): Add PROCESSOR_LUJIAZUI.
>* config/i386/i386.md: Add lujiazui cpu and include new md file.
>* config/i386/x86-tune-costs.h (struct processor_costs): Add
>lujiazui_cost.
>* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
>(ix86_adjust_cost): Likewise.
>* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Enable for lujiazui.
>(X86_TUNE_PARTIAL_REG_DEPENDENCY): Likewise.
>(X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Likewise.
>(X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Likewise.
>(X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
>(X86_TUNE_MOVX): Likewise.
>(X86_TUNE_MEMORY_MISMATCH_STALL): Likewise.
>(X86_TUNE_FUSE_CMP_AND_BRANCH_32): Likewise.
>(X86_TUNE_FUSE_CMP_AND_BRANCH_64): Likewise.
>(X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Likewise.
>(X86_TUNE_FUSE_ALU_AND_BRANCH): Likewise.
>(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Likewise.
>(X86_TUNE_USE_LEAVE): Likewise.
>(X86_TUNE_PUSH_MEMORY): Likewise.
>(X86_TUNE_LCP_STALL): Likewise.
>(X86_TUNE_USE_INCDEC): Likewise.
>(X86_TUNE_INTEGER_DFMODE_MOVES): Likewise.
>(X86_TUNE_OPT_AGU): Likewise.
>(X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Likewise.
>(X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
>(X86_TUNE_USE_SAHF): Likewise.
>(X86_TUNE_USE_BT): Likewise.
>(X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
>(X86_TUNE_ONE_IF_CONV_INSN): Likewise.
>(X86_TUNE_AVOID_MFENCE): Likewise.
>(X86_TUNE_EXPAND_ABS): Likewise.
>(X86_TUNE_USE_SIMODE_FIOP): Likewise.
>(X86_TUNE_USE_FFREEP): Likewise.
>(X86_TUNE_EXT_80387_CONSTANTS): Likewise.
>(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Likewise.
>(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
>(X86_TUNE_SSE_TYPELESS_STORES): Likewise.
>(X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
>(X86_TUNE_USE_GATHER): Likewise.
>* doc/extend.texi: Add lujiazui.
>* doc/invoke.texi: Add details about lujiazui.
>* config/i386/lujiazui.md: New file for describing lujiazui pipeline.
>
> gcc/testsuite/ChangeLog:
>
>* gcc.target/i386/funcspec-56.inc: Handle new march.
>* g++.target/i386/mv31.C: New test for -march=lujiazui.
> ---
>  gcc/common/config/i386/cpuinfo.h      |  51 +-
>  gcc/common/config/i386/i386-common.cc |   9 +
>  gcc/common/config/i386/i386-cpuinfo.h |   3 +
>  gcc/config.gcc                        |  10 +-
>  gcc/config/i386/cpuid.h               |   4 +
>  gcc/config/i386/driver-i386.cc        |  20 +-
>  gcc/config/i386/i386-c.cc             |   7 +
>  gcc/config/i386/i386-options.cc       |   3 +
>  gcc/config/i386/i386.h   

[PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-24 Thread MayShao
Hi Uros,

This patch fixes the Zhaoxin CPU vendor ID detection problem
and adds support and tuning for the Zhaoxin "lujiazui" processor.

Currently GCC can't recognize Zhaoxin CPUs (vendor IDs "CentaurHauls" and
"Shanghai") and wrongly identifies Zhaoxin "lujiazui" as Intel core2 or
i386, which is confusing for users.
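
For readers unfamiliar with how a vendor ID is read, here is a minimal sketch, not the patch's code. CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX (in that order, least significant byte first). The helper names `vendor_string` and `may_be_zhaoxin` are hypothetical and used only for illustration. On hardware the "Shanghai" ID is padded with two spaces on each side to fill the 12 bytes. A "CentaurHauls" match alone is only a candidate, since the same string is reported by older Centaur parts, so a real check would also look at the family.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rebuild the 12-byte vendor string from the
   register values returned by CPUID leaf 0.  The bytes are stored
   least significant first, in the order EBX, EDX, ECX.  */
static void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                          char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 4; j++)
            out[4 * i + j] = (char)((regs[i] >> (8 * j)) & 0xff);
    out[12] = '\0';
}

/* Hypothetical helper: returns nonzero when the vendor string is one of
   the two IDs named above.  This is only a first filter; it does not
   look at the CPU family.  */
static int may_be_zhaoxin(const char *vendor)
{
    return strcmp(vendor, "CentaurHauls") == 0
           || strcmp(vendor, "  Shanghai  ") == 0;
}
```

On an x86 host the three register values could be obtained with `__get_cpuid (0, &eax, &ebx, &ecx, &edx)` from GCC's `<cpuid.h>`; the sketch takes them as plain arguments so it stays portable.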

This patch enables -march/-mtune=lujiazui. Lujiazui is a Zhaoxin family 7
processor.
Costs and tunings are set according to the characteristics of the processor.
We add a new md file to describe the lujiazui pipeline.
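
To illustrate what "family 7" refers to, the following sketch decodes the family from the EAX value of CPUID leaf 1, assuming the standard x86 encoding (base family in bits 8-11, extended family in bits 20-27, added only when the base family is 0xf). The name `cpuid_family` is hypothetical, and this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: decode the CPU family from the EAX value
   returned by CPUID leaf 1.  */
static unsigned int cpuid_family(uint32_t eax)
{
    unsigned int family = (eax >> 8) & 0x0f;   /* base family, bits 8-11 */

    /* The extended family (bits 20-27) only counts when the base
       family field is saturated at 0xf.  */
    if (family == 0x0f)
        family += (eax >> 20) & 0xff;
    return family;
}
```

Under this encoding a family 7 part reports 7 directly in the base family field, so the extended field does not come into play.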

Testing:
Bootstrap is OK, with no regressions in the i386/x86-64 testsuite.

OK for master?

Background:
Related Zhaoxin linux kernel patch can be found at:
https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bd...@zhaoxin.com/

Related Zhaoxin glibc patch can be found at:
https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193

gcc/ChangeLog:

   * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect
   the cpu type of ZHAOXIN processors.
   (cpu_indicator_init): Handle ZHAOXIN processors.
   * common/config/i386/i386-common.cc: Add lujiazui.
   * common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add
   VENDOR_ZHAOXIN.
   (enum processor_types): Add ZHAOXIN_FAM7H.
   (enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI.
   * config.gcc: Add -march=lujiazui.
   * config/i386/cpuid.h (signature_SHANGHAI_ebx): New definition
   for ZHAOXIN.
   (signature_SHANGHAI_ecx): Likewise.
   (signature_SHANGHAI_edx): Likewise.
   * config/i386/driver-i386.cc (host_detect_local_cpu): Let
   -march=native recognize lujiazui processor.
   * config/i386/i386-c.cc (ix86_target_macros_internal): Add
   lujiazui def_or_undef.
   * config/i386/i386-options.cc (m_LUJIAZUI): New definition.
   * config/i386/i386.h (enum processor_type): Add PROCESSOR_LUJIAZUI.
   * config/i386/i386.md: Add lujiazui cpu and include new md file.
   * config/i386/x86-tune-costs.h (struct processor_costs): Add
   lujiazui_cost.
   * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui.
   (ix86_adjust_cost): Likewise.
   * config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Enable for lujiazui.
   (X86_TUNE_PARTIAL_REG_DEPENDENCY): Likewise.
   (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Likewise.
   (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Likewise.
   (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Likewise.
   (X86_TUNE_MOVX): Likewise.
   (X86_TUNE_MEMORY_MISMATCH_STALL): Likewise.
   (X86_TUNE_FUSE_CMP_AND_BRANCH_32): Likewise.
   (X86_TUNE_FUSE_CMP_AND_BRANCH_64): Likewise.
   (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Likewise.
   (X86_TUNE_FUSE_ALU_AND_BRANCH): Likewise.
   (X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Likewise.
   (X86_TUNE_USE_LEAVE): Likewise.
   (X86_TUNE_PUSH_MEMORY): Likewise.
   (X86_TUNE_LCP_STALL): Likewise.
   (X86_TUNE_USE_INCDEC): Likewise.
   (X86_TUNE_INTEGER_DFMODE_MOVES): Likewise.
   (X86_TUNE_OPT_AGU): Likewise.
   (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Likewise.
   (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Likewise.
   (X86_TUNE_USE_SAHF): Likewise.
   (X86_TUNE_USE_BT): Likewise.
   (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Likewise.
   (X86_TUNE_ONE_IF_CONV_INSN): Likewise.
   (X86_TUNE_AVOID_MFENCE): Likewise.
   (X86_TUNE_EXPAND_ABS): Likewise.
   (X86_TUNE_USE_SIMODE_FIOP): Likewise.
   (X86_TUNE_USE_FFREEP): Likewise.
   (X86_TUNE_EXT_80387_CONSTANTS): Likewise.
   (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Likewise.
   (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
   (X86_TUNE_SSE_TYPELESS_STORES): Likewise.
   (X86_TUNE_SSE_LOAD0_BY_PXOR): Likewise.
   (X86_TUNE_USE_GATHER): Likewise.
   * doc/extend.texi: Add lujiazui.
   * doc/invoke.texi: Add details about lujiazui.
   * config/i386/lujiazui.md: New file for describing lujiazui pipeline.

gcc/testsuite/ChangeLog:

   * gcc.target/i386/funcspec-56.inc: Handle new march.
   * g++.target/i386/mv31.C: New test for -march=lujiazui.
---
 gcc/common/config/i386/cpuinfo.h      |  51 +-
 gcc/common/config/i386/i386-common.cc |   9 +
 gcc/common/config/i386/i386-cpuinfo.h |   3 +
 gcc/config.gcc                        |  10 +-
 gcc/config/i386/cpuid.h               |   4 +
 gcc/config/i386/driver-i386.cc        |  20 +-
 gcc/config/i386/i386-c.cc             |   7 +
 gcc/config/i386/i386-options.cc       |   3 +
 gcc/config/i386/i386.h                |   1 +
 gcc/config/i386/i386.md               |   5 +-
 gcc/config/i386/lujiazui.md           | 844 ++
 gcc/config/i386/x86-tune-costs.h      | 115 +++
 gcc/config/i386/x86-tune-sched.cc     |   2 +
 gcc/config/i386/x86-tune.def          |  91 +-
 gcc/doc/extend.texi