Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Mayshao-oc
> On Fri, Nov 8, 2024 at 10:21 AM Mayshao-oc wrote: > > > > -Original Message- > > > > From: Xi Ruoyao > > > > Sent: Thursday, November 7, 2024 1:12 PM > > > > To: Liu, Hongtao ; Mayshao-oc > > > o...@zhaoxin.com>; Hongtao

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Mayshao-oc
> > -Original Message- > > From: Xi Ruoyao > > Sent: Thursday, November 7, 2024 1:12 PM > > To: Liu, Hongtao ; Mayshao-oc > o...@zhaoxin.com>; Hongtao Liu > > Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...@gmail.com; > > richard.guent...@

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-06 Thread Mayshao-oc
> > On Thu, Nov 7, 2024 at 10:29 AM MayShao-oc wrote: > > > > Hi all: > >For Zhaoxin, I find no improvement when enabling pass_align_tight_loops, > > and a performance drop in some cases. > >This patch adds a new tunable to bypass pass_align_tight_loops

[PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-06 Thread MayShao-oc
Hi all: For Zhaoxin, I find no improvement when enabling pass_align_tight_loops, and a performance drop in some cases. This patch adds a new tunable to bypass pass_align_tight_loops on Zhaoxin. Bootstrapped on x86_64. OK for trunk? BR Mayshao gcc/ChangeLog: * config/i386/i386-fea

[PATCH] [x86_64] Add flag to control tight loops alignment opt

2024-11-04 Thread MayShao-oc
Hi all: This patch adds the -malign-tight-loops flag to control pass_align_tight_loops. The motivation is that pass_align_tight_loops may cause a performance regression in nested loops. The example code is as follows: #define ITER 2 #define ITER_O 10 int i, j,k; int array[I

[PATCH v2] [libatomic]: Handle AVX+CX16 ZHAOXIN like intel for 16b atomic [PR104688]

2024-07-18 Thread MayShao-oc
From: mayshao Hi Jakub: Thanks for your review. We should just amend this to handle Zhaoxin. Bootstrapped/regtested on x86_64. OK for trunk? BR Mayshao libatomic/ChangeLog: PR target/104688 * config/x86/init.c (__libat_feat1_init): Don't clear bit_AVX on ZHAO

[PATCH] [libatomic]: Handle AVX+CX16 ZHAOXIN like intel for 16b atomic [PR104688]

2024-07-11 Thread MayShao-oc
From: mayshao Hi all: We replied in PR104688 that ZHAOXIN guarantees that a 16-byte VMOVDQA on a 16-byte aligned address is atomic if the memory type of the address is WB. So there is no need to clear bit_AVX on ZHAOXIN CPUs. Bootstrapped/regtested on x86_64. OK for trunk? BR Mayshao libato

Re: [PATCH] [x86_64]: Zhaoxin shijidadao enablement

2024-06-18 Thread mayshao-oc
On 5/28/24 14:15, Uros Bizjak wrote: On Mon, May 27, 2024 at 10:33 AM MayShao wrote: From: mayshao Hi all: This patch enables -march/-mtune=shijidadao, costs and tunings are set according to the characteristics of the processor. Bootstrapped /regtested X86_64. Ok for

Re: [PATCH] invoke.texi: Clarify -march=lujiazui

2024-05-23 Thread mayshao-oc
Hi Jakub: I think the modified lujiazui description is what actually happens, thanks. BR Mayshao [This email is from an external sender; beware of risks] Hi! Yesterday I was searching which exact CPUs are affected by the PR114576 wrong-code issue and went from the PTA_* bitmasks in GCC, so arrived at the goldmont, goldmont-p

Re: [PATCH] [x86_64]: Zhaoxin yongfeng enablement

2023-10-30 Thread Mayshao-oc
* g++.target/i386/mv32.C: Handle new march. >> >> * gcc.target/i386/funcspec-56.inc: Ditto. >> > >> > LGTM. >> > >> > There are a couple of comments that needs to be fixed, please see inline. >> > >> > BTW: A couple of days ago, I hav

Re: [PATCH] i386: correct division modeling in lujiazui.md

2022-12-29 Thread Mayshao-oc
>Ping. If there are any questions or concerns about the patch, please let me >know: I'm interested in continuing this cleanup at least for older AMD models. > Hi Alexander: According to the speccpu2017 benchmark result, the patch looks good in lujiazui. BR Mayshao >I noticed I had an ext

Re: [PATCH] i386: correct division modeling in lujiazui.md

2022-12-20 Thread Mayshao-oc
>Ping. If there are any questions or concerns about the patch, please let me >know: I'm interested in continuing this cleanup at least for older AMD models. > Thanks for your patch. We are running the speccpu2017 benchmark to get the performance numbers; it takes some time. If we get the result, w

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-27 Thread Mayshao-oc
>> >> Hi Martin: >> Thanks for your patch, I comment the questions below. >Hi. >:) >> >>> Hello. >> >>> I noticed this patch set which is kind of related to >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. >> >>> And I have a couple of questions: >> >>>1) I noticed you drop AVX

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-26 Thread Mayshao-oc
Hi Martin: Thanks for your patch, I comment the questions below. > Hello. > I noticed this patch set which is kind of related to > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. > And I have a couple of questions: >1) I noticed you drop AVX and F16C features for the newly added "l

Re: [PATCH] [x86_64]: Zhaoxin lujiazui enablement

2022-05-17 Thread Mayshao-oc
> On Tue, May 17, 2022 at 5:15 AM mayshao wrote: >> Hi Uros: >> This patch fixes the Zhaoxin CPU vendor ID detection problem and adds >> Zhaoxin "lujiazui" processor support. >> Currently gcc can't recognize Zhaoxin CPUs (vendor ID "CentaurHauls" >> and "Shanghai") if the user uses -march=nati
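For readers unfamiliar with the vendor ID check mentioned in the snippet above, here is a minimal sketch in C. It is not the code from the patch. It assumes the standard x86 convention that CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX (in that order), and it assumes the "Shanghai" ID may be padded with spaces to fill the 12 bytes. The helper names decode_vendor and is_zhaoxin_vendor are hypothetical and do not appear in GCC.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rebuild the 12-byte vendor string from the register
   values that CPUID leaf 0 returns (EBX, then EDX, then ECX).
   Bytes are extracted with shifts, so the result does not depend on the
   byte order of the machine running this code.  */
static void decode_vendor(uint32_t ebx, uint32_t edx, uint32_t ecx,
                          char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
    out[12] = '\0';
}

/* Hypothetical helper: return 1 if the vendor string matches one of the
   two IDs named in the message ("CentaurHauls" or "Shanghai").
   Assumption: leading and trailing spaces are padding and are ignored.  */
static int is_zhaoxin_vendor(const char *vendor)
{
    while (*vendor == ' ')
        vendor++;
    size_t len = strlen(vendor);
    while (len > 0 && vendor[len - 1] == ' ')
        len--;
    return (len == 12 && memcmp(vendor, "CentaurHauls", 12) == 0)
        || (len == 8 && memcmp(vendor, "Shanghai", 8) == 0);
}
```

On real hardware the three register values would come from a CPUID query (for example via GCC's <cpuid.h>); they are passed in as plain integers here so the sketch compiles on any host. A vendor string alone may not distinguish processor models, so a real detection path would presumably also check family and model fields.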

Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-03-28 Thread Mayshao-oc
On Sun, Mar 27, 2022 at 5:15 PM Uros Bizjak wrote: > On Fri, Mar 25, 2022 at 3:08 AM MayShao wrote: > > > > Hi Uros, > > > > This patch fixes the Zhaoxin CPU vendor ID detection problem > > and adds Zhaoxin "lujiazui" processor support and tuning. > > > > Currently gcc can't recognize Zhaoxin CPU (Vendo