https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
Kewen Lin changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #36 from CVS Commits ---
The master branch has been updated by Kewen Lin:
https://gcc.gnu.org/g:f5e18dd9c7dacc9671044fc669bd5c1b26b6bdba
commit r11-4637-gf5e18dd9c7dacc9671044fc669bd5c1b26b6bdba
Author: Kewen Lin
Date: Tue Nov 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
rsandifo at gcc dot gnu.org changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #34 from Hongtao.liu ---
(In reply to Kewen Lin from comment #29)
> (In reply to Hongtao.liu from comment #28)
> > > Probably you can try to tweak it in ix86_add_stmt_cost? when the statement
> >
> > Yes, it's the place.
> >
> > > i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #33 from Richard Biener ---
(In reply to Kewen Lin from comment #32)
> (In reply to Richard Biener from comment #31)
> > (In reply to Kewen Lin from comment #29)
> > > (In reply to Hongtao.liu from comment #28)
> > > > > Probably you
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #32 from Kewen Lin ---
(In reply to Richard Biener from comment #31)
> (In reply to Kewen Lin from comment #29)
> > (In reply to Hongtao.liu from comment #28)
> > > > Probably you can try to tweak it in ix86_add_stmt_cost? when the
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #31 from Richard Biener ---
(In reply to Kewen Lin from comment #29)
> (In reply to Hongtao.liu from comment #28)
> > > Probably you can try to tweak it in ix86_add_stmt_cost? when the statement
> >
> > Yes, it's the place.
> >
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #30 from Richard Biener ---
(In reply to Hongtao.liu from comment #23)
> > _813 = {_437, _448, _459, _470, _490, _501, _512, _523, _543, _554, _565,
> > _576, _125, _143, _161, _179};
>
> The cost of vec_construct in i386 backend i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #29 from Kewen Lin ---
(In reply to Hongtao.liu from comment #28)
> > Probably you can try to tweak it in ix86_add_stmt_cost? when the statement
>
> Yes, it's the place.
>
> > is UB to UH conversion statement, further check if the d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #28 from Hongtao.liu ---
> Probably you can try to tweak it in ix86_add_stmt_cost? when the statement
Yes, it's the place.
> is UB to UH conversion statement, further check if the def of the input UB
> is MEM.
Only if there's no m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #27 from Kewen Lin ---
(In reply to Hongtao.liu from comment #22)
> > One of my workmates found that if we disable vectorization for SPEC2017
> > 525.x264_r function sub4x4_dct in source file x264_src/common/dct.c with
> > explicit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #26 from Kewen Lin ---
> > Following this idea, if I relax the restriction on loop_outer
> > (loop_father) when setting the father_bbs, I can see FRE works as
> > expected. But it actually does the rpo_vn from cfun's entry to it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #25 from Kewen Lin ---
> >
> > Got it! For
> >
> > else if (vect_nop_conversion_p (stmt_info))
> > continue;
> >
> > Is it a good idea to change it to call record_stmt_cost like the others?
> > 1) introduce one ve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #24 from rguenther at suse dot de ---
On September 27, 2020 4:56:43 AM GMT+02:00, crazylht at gmail dot com
wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
>
>--- Comment #22 from Hongtao.liu ---
>>One of my workmates fou
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #23 from Hongtao.liu ---
> _813 = {_437, _448, _459, _470, _490, _501, _512, _523, _543, _554, _565,
> _576, _125, _143, _161, _179};
The cost of vec_construct in i386 backend is 64, calculated as 16 x 4
cut from i386.c
---
/* N e
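
The "16 x 4" arithmetic above can be sketched as follows. This is only an illustration, not the i386.c code (which is cut off in the snippet): the function name and the constant are hypothetical, and the per-element cost of 4 is simply the factor implied by "16 x 4".

```c
#include <assert.h>

/* Hypothetical per-element insert cost; the value 4 is inferred from the
   "16 x 4" in the comment above, not taken from the i386 backend.  */
enum { SKETCH_ELT_INSERT_COST = 4 };

/* Model the cost of building a vector from NELTS scalar elements as one
   insert per element.  */
static int
sketch_vec_construct_cost (int nelts)
{
  assert (nelts > 0);
  return nelts * SKETCH_ELT_INSERT_COST;
}
```

Under that assumption, the 16-element constructor quoted above (_813) costs sketch_vec_construct_cost (16), i.e. 64.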
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #22 from Hongtao.liu ---
> One of my workmates found that if we disable vectorization for SPEC2017
> 525.x264_r function sub4x4_dct in source file x264_src/common/dct.c with
> explicit function attribute __attribute__((optimize("no-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #21 from Richard Biener ---
(In reply to Kewen Lin from comment #18)
> (In reply to Richard Biener from comment #10)
> > (In reply to Kewen Lin from comment #9)
> > > (In reply to Richard Biener from comment #8)
> > > > (In reply to K
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #20 from Richard Biener ---
(In reply to Kewen Lin from comment #19)
> (In reply to rguent...@suse.de from comment #17)
> > On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:
> >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #19 from Kewen Lin ---
(In reply to rguent...@suse.de from comment #17)
> On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
> >
> > --- Comment #15 from Kewen Lin ---
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #18 from Kewen Lin ---
(In reply to Richard Biener from comment #10)
> (In reply to Kewen Lin from comment #9)
> > (In reply to Richard Biener from comment #8)
> > > (In reply to Kewen Lin from comment #7)
> > > > Two questions in min
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #17 from rguenther at suse dot de ---
On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
>
> --- Comment #15 from Kewen Lin ---
> (In reply to rguent...@suse.de from comment #1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #16 from Hongtao.liu ---
I notice
0x5561dc0 _36 * 2 1 times scalar_stmt costs 16 in body
0x5561dc0 _38 * 2 1 times scalar_stmt costs 16 in body
0x5562df0 _36 * 2 1 times vector_stmt costs 16 in body
0x5562df0 _38 * 2 1 times vector_s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #15 from Kewen Lin ---
(In reply to rguent...@suse.de from comment #14)
> On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
> >
> > --- Comment #13 from Kewen Lin ---
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #14 from rguenther at suse dot de ---
On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
>
> --- Comment #13 from Kewen Lin ---
> > 2) on Power, the conversion from unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #13 from Kewen Lin ---
> 2) on Power, the conversion from unsigned char to unsigned short is a nop
> conversion; when we count the scalar cost, it is counted, adding a total
> cost of 32 to the scalar cost. Meanwhile, the conversion from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #12 from Kewen Lin ---
> Thanks for the explanation! I'll look at it after checking 2). IIUC, the
> advantage of eliminating stores here is that it allows whatever feeds the
> stores and the stores' consumers to be bundled, then get mo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #11 from Kewen Lin ---
(In reply to Richard Biener from comment #10)
> (In reply to Kewen Lin from comment #9)
> > (In reply to Richard Biener from comment #8)
> > > (In reply to Kewen Lin from comment #7)
> > > > Two questions in min
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #10 from Richard Biener ---
(In reply to Kewen Lin from comment #9)
> (In reply to Richard Biener from comment #8)
> > (In reply to Kewen Lin from comment #7)
> > > Two questions in mind, need to dig into it further:
> > > 1) from t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #9 from Kewen Lin ---
(In reply to Richard Biener from comment #8)
> (In reply to Kewen Lin from comment #7)
> > Two questions in mind, need to dig into it further:
> > 1) from the assembly of scalar/vector code, I don't see any sto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #8 from Richard Biener ---
(In reply to Kewen Lin from comment #7)
> Two questions in mind, need to dig into it further:
> 1) from the assembly of scalar/vector code, I don't see any stores needed
> into temp array d (array diff in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
Kewen Lin changed:
What|Removed |Added
Last reconfirmed||2020-09-16
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
Kewen Lin changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #5 from Richard Biener ---
testcase from https://github.com/mirror/x264/blob/master/common/dct.c
where FENC_STRIDE is 16 and FDEC_STRIDE 32
pixel is unsigned char, dctcoef is unsigned short
static inline void pixel_sub_wxh( dctcoef
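
For readers without the x264 source at hand, the kind of loop being discussed can be sketched as below. This is not the truncated testcase: the helper name, its signature, and the assignment of FENC_STRIDE to the first input and FDEC_STRIDE to the second are assumptions made for illustration. Only the two typedefs and the stride values come from the comment.

```c
#include <assert.h>

typedef unsigned char pixel;     /* as stated in the comment */
typedef unsigned short dctcoef;  /* as stated in the comment */

#define FENC_STRIDE 16
#define FDEC_STRIDE 32

/* Hypothetical helper, not the x264 function: store the element-wise
   difference of two SIZE x SIZE pixel blocks into DIFF.  Each pixel is
   widened from unsigned char before the subtraction, which is the
   UB-to-UH conversion the thread talks about.  */
static void
sketch_pixel_diff (dctcoef *diff, int size,
                   const pixel *pix1, const pixel *pix2)
{
  assert (size > 0 && size <= FENC_STRIDE);
  for (int y = 0; y < size; y++)
    {
      for (int x = 0; x < size; x++)
        diff[x + y * size] = (dctcoef) (pix1[x] - pix2[x]);
      pix1 += FENC_STRIDE;  /* assumed stride of the first input */
      pix2 += FDEC_STRIDE;  /* assumed stride of the second input */
    }
}
```

Since dctcoef is unsigned here, a negative difference wraps modulo 65536.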
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #4 from Richard Biener ---
This delays some checks to eventually support part of the BB vectorization
which is what succeeds here. I suspect that w/o vectorization we manage
to elide the tmp[] array but with the part vectorization th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #3 from Kewen Lin ---
Bisection shows it started to fail from r11-205.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #2 from Kewen Lin ---
Created attachment 49124
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49124&action=edit
sub4x4_dct SLP dumping
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
Richard Biener changed:
What|Removed |Added
Component|tree-optimization |target
Keywords|
37 matches