Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-27 Thread Xi Ruoyao
On Wed, 2023-12-27 at 11:59 +0800, chenglulu wrote:
> +FAIL: gcc.dg/pr86617.c scan-rtl-dump-times final "mem/v" 6
>
> In r14-6818 the issue persists. I kind of chased the code and found that the
> problem is like this:
>   volatile unsigned char u8;
>
>   void test (void)
>   {
> u8 = u8 +

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-26 Thread chenglulu
On 2023/12/23 at 6:44 PM, Xi Ruoyao wrote: On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: The performance drop has nothing to do with this patch. I found that the h264 performance compiled by r14-6787 compared to r14-6421 dropped by 6.4%. Then I guess we should create a bug report... The code

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-25 Thread Xi Ruoyao
On Mon, 2023-12-25 at 10:08 +0800, chenglulu wrote: > > On 2023/12/24 at 8:59 PM, Xi Ruoyao wrote: > > On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote: > > > On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: > > > > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: > > > > > > The performance drop

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-24 Thread chenglulu
On 2023/12/24 at 8:59 PM, Xi Ruoyao wrote: On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote: On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: The performance drop has nothing to do with this patch. I found that the h264 performance compiled

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-24 Thread Xi Ruoyao
On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote: > On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: > > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: > > > > The performance drop has nothing to do with this patch. I found that > > > > the h264 performance compiled > > > > by r14-6787

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-23 Thread Xi Ruoyao
On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote: > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: > > > The performance drop has nothing to do with this patch. I found that the > > > h264 performance compiled > > > by r14-6787 compared to r14-6421 dropped by 6.4%. > > Then I guess we

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-23 Thread Xi Ruoyao
On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote: > > The performance drop has nothing to do with this patch. I found that the > > h264 performance compiled > > by r14-6787 compared to r14-6421 dropped by 6.4%. Then I guess we should create a bug report... >  But there is a problem. My

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-22 Thread chenglulu
On 2023/12/23 at 10:26 AM, chenglulu wrote: On 2023/12/22 at 3:21 PM, chenglulu wrote: On 2023/12/22 at 3:09 PM, Xi Ruoyao wrote: On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: On 2023/12/21 at 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-22 Thread chenglulu
On 2023/12/22 at 3:21 PM, chenglulu wrote: On 2023/12/22 at 3:09 PM, Xi Ruoyao wrote: On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: On 2023/12/21 at 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test the

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu
On 2023/12/22 at 3:09 PM, Xi Ruoyao wrote: On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: On 2023/12/21 at 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test the spec. :-) Hi, Ruoyao: After

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread Xi Ruoyao
On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote: > > On 2023/12/21 at 8:00 PM, chenglulu wrote: > > Sorry, I've been busy with something else these two days. I don't > > think there's anything wrong with the code, > > > > but I need to test the spec. :-) > > Hi, Ruoyao: > > After applying this

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu
On 2023/12/21 at 8:00 PM, chenglulu wrote: Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test the spec. :-) Hi, Ruoyao: After applying this patch, spec2006 464.h264 ref will have a 6.4% performance drop. So I'm going to

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu
Sorry, I've been busy with something else these two days. I don't think there's anything wrong with the code, but I need to test the spec. :-) On 2023/12/21 at 7:56 PM, Xi Ruoyao wrote: Ping :). On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote: The problem with peephole2 is it uses a naive

Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread Xi Ruoyao
Ping :).

On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote:
> The problem with peephole2 is it uses a naive sliding-window algorithm
> and misses many cases.  For example:
>
>     float a[1];
>     float t() { return a[0] + a[8000]; }
>
> is compiled to:
>
>     la.local    $r13,a
>