https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #16 from luoxhu at gcc dot gnu.org ---
The attached files are all built with -mcpu=power8 and the case also fails on
P8LE.
Also I verified the code produces expected output on P8BE. ('Aborted' is caused
by BE returns 0x41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #15 from luoxhu at gcc dot gnu.org ---
In combine, the vec_select(vec_concat ...) and the following vec_select are
combined into a single extract instruction, which seems reasonable for both LE and BE?
R146: 0 1 2 3
R141: 4 5 6 7
R150: 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #14 from luoxhu at gcc dot gnu.org ---
Created attachment 53354
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53354&action=edit
split2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #13 from luoxhu at gcc dot gnu.org ---
Created attachment 53353
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53353&action=edit
after combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #12 from luoxhu at gcc dot gnu.org ---
Created attachment 53352
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53352&action=edit
combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293
--- Comment #5 from luoxhu at gcc dot gnu.org ---
r12-6086
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Could you try revert (In reply to Richard Biener from comment #2)
> I can reproduce a regression with -Ofast -march=znver2 running on Haswell as
> well. -fopt-info doesn't reve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740
--- Comment #10 from luoxhu at gcc dot gnu.org ---
(In reply to Martin Liška from comment #9)
> (In reply to luoxhu from comment #8)
> > (In reply to rguent...@suse.de from comment #6)
> > > On Tue, 21 Jun 2022, jakub at gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #8 from luoxhu at gcc dot gnu.org ---
init-regs:
(insn 13 8 17 2 (set (reg:V4SI 141)
(vec_select:V4SI (vec_concat:V8SI (reg/v:V4SI 135 [ R2 ])
(reg/v:V4SI 133 [ R0 ]))
(parallel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #5 from luoxhu at gcc dot gnu.org ---
Seems combine wrongly merged two vec_select instructions:
Trying 188 -> 199:
188: r343:V4SI=vec_select(vec_concat(r168:V4SI,r338:V4SI),parallel)
REG_DEAD r338:V4SI
REG_DEAD r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Reduced to:
#include
extern "C" void *memcpy(void *, const void *, unsigned long);
typedef __attribute__((altivec(vector__))) unsigned native_simd_type;
union {
native_s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106126
--- Comment #13 from luoxhu at gcc dot gnu.org ---
Otherwise we need to record first_bb when conditions_in_bbs->is_empty, then check
that in is_beneficial and ordered_remove the info entry if that bb is not the
first "if condition" wit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106126
--- Comment #12 from luoxhu at gcc dot gnu.org ---
conditions_in_bbs->is_empty doesn't mean that the range is at the start of the switch
condition :(, so we can't assume it is safe to ignore the no_side_effect_bb check?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106126
--- Comment #11 from luoxhu at gcc dot gnu.org ---
Sorry for the breakage; my bugzilla account is luo...@gcc.gnu.org.
The patch seems reasonable in folding 65-90 ('A'-'Z') into a switch statement,
4,6c4,6
< ;; Canonical GIMPLE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105903
--- Comment #2 from luoxhu at gcc dot gnu.org ---
diff --git a/gcc/match.pd b/gcc/match.pd
index 4a570894b2e..f6b5415a351 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5718,6 +5718,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(bit_xor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069
--- Comment #2 from luoxhu at gcc dot gnu.org ---
Could you also paste the ASM difference please? (I don't have an environment at
hand so far..)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740
--- Comment #8 from luoxhu at gcc dot gnu.org ---
(In reply to rguent...@suse.de from comment #6)
> On Tue, 21 Jun 2022, jakub at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740
> >
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740
--- Comment #2 from luoxhu at gcc dot gnu.org ---
Running if_to_switch and convert_switch again after copyprop2 could remove the
redundant statement and expose the opportunity for if-to-switch again; is this
reasonable or just move if-to-switch/switch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100034
--- Comment #2 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> Looks related to PR1 - we do an IPA SRA clone but fail to inline it and
> thus we end up with
>
> void d.isra ()
> {
> int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93318
--- Comment #10 from luoxhu at gcc dot gnu.org ---
And the Profile id of that node is streamed to many objects after lto
partition:
grep -- "19598949" **
db_server.ltrans0.000i.cgraph: Profile id: 19598949
db_server.ltrans0.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93318
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105133
--- Comment #2 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> (In reply to luoxhu from comment #0)
> >
> > cat hellow.res
> > 3
> > hello.o 2
> > 192 ccb9165e037
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
Target Milestone: ---
Hi, linker gold supports --start-lib and --end-lib to "mimics the
semantics of static libr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802
--- Comment #6 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> So the point is that P is invariant but we do not hoist it because it's
> computed in a (estimated) cold block? I notice that the c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Or restore the previous recip count check by commenting out the if condition, to
avoid the bb in the loop turning cold?
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/recip-3.c
b/gcc/testsuite/gcc.dg/tree-ssa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103793
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Resolution|--- |FIXED
Status
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94790
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Just noticed they are different cases, scalar vs. vector...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94790
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802
--- Comment #2 from luoxhu at gcc dot gnu.org ---
-funroll-loops could work around this, is this reasonable?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103802
--- Comment #1 from luoxhu at gcc dot gnu.org ---
MOVE_MAX_PIECES is 4 on m32 but 8 on m64, so estimate_move_cost differs
between them: 2 vs. 1 for "((size + MOVE_MAX_PIECES - 1) /
MOVE_MAX_PIECES)".
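To make the arithmetic concrete, here is a minimal C sketch of the quoted ceiling division. The helper name pieces_needed and the 8-byte size used below are assumptions for illustration (the comment does not state the size); this is not GCC's estimate_move_cost itself.

```c
#include <assert.h>

/* Hypothetical helper: number of moves needed to copy SIZE bytes when at
   most MOVE_MAX_PIECES bytes can be moved at once.  This is the ceiling
   division quoted above: (size + MOVE_MAX_PIECES - 1) / MOVE_MAX_PIECES.  */
unsigned
pieces_needed (unsigned size, unsigned move_max_pieces)
{
  assert (move_max_pieces != 0);  /* guard against division by zero */
  return (size + move_max_pieces - 1) / move_max_pieces;
}
```

Under the assumed size of 8 bytes, pieces_needed (8, 4) evaluates to 2 and pieces_needed (8, 8) evaluates to 1, which would match the 2 vs. 1 difference described above.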
recip-3.m32.c.172t.cunroll:
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
Target Milestone: ---
Invoking the compiler as /home/luoxhu/workspace/gcc-master_build/gcc/xgcc
-B/home/luoxhu/workspace/gcc-master_build/gcc/
/home/luoxhu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103793
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |luoxhu at gcc dot
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860
--- Comment #6 from luoxhu at gcc dot gnu.org ---
Fortran's modulo is floor_mod as documented here:
https://gcc.gnu.org/onlinedocs/gfortran/MODULO.html?
Syntax:
RESULT = MODULO(A, P)
Return value:
The type and kind of the result are tho
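As a rough illustration of what "floor_mod" means here, the following C sketch contrasts it with C's truncating % operator. The function name floor_mod is chosen for illustration only and is not taken from the GCC sources; it covers the integer case only.

```c
#include <assert.h>

/* Illustrative floor modulo for integers: the result takes the sign of
   the divisor P, unlike C's %, whose result takes the sign of A.
   Note: like % itself, this is undefined for INT_MIN with p == -1.  */
int
floor_mod (int a, int p)
{
  assert (p != 0);              /* modulo by zero is not defined */
  int r = a % p;                /* truncating remainder, sign follows a */
  if (r != 0 && ((r < 0) != (p < 0)))
    r += p;                     /* shift into the divisor's sign range */
  return r;
}
```

For example, -7 % 3 is -1 in C, while floor_mod (-7, 3) is 2; likewise 7 % -3 is 1, while floor_mod (7, -3) is -2.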
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
--- Comment #11 from luoxhu at gcc dot gnu.org ---
+(define_insn_and_split "*anddi3_insn_dot"
+ [(set (pc)
+(if_then_else (eq (and:DI (match_operand:DI 1 "gpc_reg_operand" "%r,r")
+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
--- Comment #9 from luoxhu at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #8)
> (In reply to luoxhu from comment #6)
> > > > foo:
> > > > .LFB0:
> > > > .cfi_sta
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
--- Comment #7 from luoxhu at gcc dot gnu.org ---
1| Dump of assembler code for function foo:
2|0x15e0 <+0>: rldicr. r3,r3,29,1
3+> 0x15e4 <+4>: beq 0x15f0
4|0x15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
--- Comment #6 from luoxhu at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #5)
> (In reply to luoxhu from comment #4)
> > Simply adjust the sequence of dot instruction could produce expected code,
> > is this c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Simply adjusting the sequence of the dot instruction could produce the expected code; is
this correct?
foo:
.LFB0:
.cfi_startproc
rldicr. 3,3,29,1
beq 0,.L2
#APP
# 10 "pr102
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102239
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #5 from luoxhu at gcc dot gnu.org ---
;; Loop 0
;; header 0, latch 1
;; depth 0, outer -1
;; nodes: 0 1 2 3 4 5 6 11 7 8 10 9
;;
;; Loop 1
;; header 8, latch 7
;; depth 1, outer 0
;; nodes: 8 7 6 10 5 4 11 3
;;
;; Loop 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #4 from luoxhu at gcc dot gnu.org ---
Created attachment 51851
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51851&action=edit
Fix incorrect loop exit edge probability
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #3 from luoxhu at gcc dot gnu.org ---
The profile count is correct but something is wrong with the edge probability, and it
turns out that r12-4526 exposes a long-existing issue in
profile_estimate:predict_extra_loop_exits, when searching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103270
--- Comment #2 from luoxhu at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> So you say this is a problem with loop header copying, that would mean the
> issue is really latent and general, no? Header copyin
Severity: normal
Priority: P3
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
Target Milestone: ---
For the testcase gcc.dg/vect/pr96698.c, the inner loop was hot (preheader count
< loop count), but it is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991
--- Comment #7 from luoxhu at gcc dot gnu.org ---
Fixed, will backport to gcc-11 in a week.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||ro at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103041
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103041
--- Comment #1 from luoxhu at gcc dot gnu.org ---
Could you please verify whether it is caused by r12-4818 instead of r12-4819?
r12-4819 is an NFC patch, which seems less likely, and r12-4818 also ICEs in
PR103029, so it is possibly a duplicate of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991
--- Comment #5 from luoxhu at gcc dot gnu.org ---
P9:
.L149:
lxvx %vs32,%r8,%r10
vadduwm %v12,%v12,%v1
mfvsrd %r5,%vs43
mfvsrld %r4,%vs43
vadduwm %v11,%v11,%v9
stxv %vs44,112(%r1)
xxperm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991
--- Comment #4 from luoxhu at gcc dot gnu.org ---
vect-simd-17.p10.c.335r.final:
3379: %v1:V16QI=unspec[%v1:V16QI,%v1:V16QI,%v9:V16QI] 254
3372: {%v11:V4SI=~%v0:V4SI&%v13:V4SI|%v11:V4SI;clobber %r10:V4SI;} // wrong
code.
REG_DEAD %v0:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029
--- Comment #3 from luoxhu at gcc dot gnu.org ---
This hack could restore the previous phi order to put nondfs phi args before
dfs_edge args. But I am not sure whether this is the correct direction. At
least it proves that the phi order
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103029
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991
--- Comment #3 from luoxhu at gcc dot gnu.org ---
(In reply to Kewen Lin from comment #2)
> (In reply to luoxhu from comment #1)
> > Couldn't reproduce on rain6p1 (P10):
> >
>
> It's weird, I can reproduce thi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102991
--- Comment #1 from luoxhu at gcc dot gnu.org ---
Couldn't reproduce on rain6p1 (P10):
Test run by luoxhu on Fri Oct 29 04:08:49 2021
Native configuration is powerpc64le-unknown-linux-gnu
=== gcc tests ===
Schedu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102868
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Resolution|--- |FIXED
Status
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94613
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102868
--- Comment #1 from luoxhu at gcc dot gnu.org ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582452.html
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
Target Milestone: ---
Similar to PR94680 and PR100165, PPC currently generates inefficient
instructions for the case below:
typedef float V __attribute__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102075
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178
--- Comment #2 from luoxhu at gcc dot gnu.org ---
Verified that 470.lbm doesn't show a regression on Power8 with -Ofast.
Runtime is 141 sec for r12-897; without that patch it is 142 sec.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102008
--- Comment #3 from luoxhu at gcc dot gnu.org ---
phiopt4 and sink2 are doing reverse optimizations:
pr102008.c.200t.phiopt4:
Hoisting adjacent loads from 3 and 4 into 2: _6 = foo_4(D)->a; _5 =
foo_4(D)->b;
pr102008.c.202t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102008
--- Comment #2 from luoxhu at gcc dot gnu.org ---
Confirmed that moving the sink2 pass before phiopt4 restores the previous
instructions for this case:
test:
.LFB0:
.cfi_startproc
cmp w0, 1
ldp w0, w1, [x1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142
--- Comment #15 from luoxhu at gcc dot gnu.org ---
Patch updated:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578740.html
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
Target Milestone: ---
ALWAYS_EXECUTED_IN is not computed completely for nested loops. The current design
will exit if an inner loop doesn't dominate the outer loop's latch or exit aft
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101250
--- Comment #1 from luoxhu at gcc dot gnu.org ---
Patch posted:
[PATCH] ivopts: Don't adjust IV update statement if both operands use the IV in
COND [PR101250]
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573894.html
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: luoxhu at gcc dot gnu.org
Target Milestone: ---
Test case:
unsigned int foo (unsigned char *ip, unsigned char *ref, unsigned int maxlen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #13 from luoxhu at gcc dot gnu.org ---
It is not visible in combine because the constant data is in *.LC0 and
UNSPEC_VPERM. Will shelve this and switch to other high-priority issues.
pr100866.c.277r.combine:
(note 4 0 20 2 [bb 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #8 from luoxhu at gcc dot gnu.org ---
(In reply to Jens Seifert from comment #7)
> Regarding vec_revb for vector unsigned int. I agree that
> revb:
> .LFB0:
> .cfi_startproc
> vspltish %v1,8
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #6 from luoxhu at gcc dot gnu.org ---
For V4SI, it is also better to use vector splat and vector rotate operations.
revb:
.LFB0:
.cfi_startproc
vspltish %v1,8
vspltisw %v0,-16
vrlh %v2,%v2,%v1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93571
--- Comment #3 from luoxhu at gcc dot gnu.org ---
BTW, I didn't see a performance difference between fmr and xxlor within a small
benchmark.
Max Ops Per Cycle   Latency (Min)   Latency (Max)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93571
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #5 from luoxhu at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #4)
> This PR is specifically about the vec_revb builtin. But yes, we should
> look at what is generated for all other code (having only the b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101020
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Resolution|--- |FIXED
Status
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #3 from luoxhu at gcc dot gnu.org ---
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 097a127be07..35b3f1a0e1a 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1932,7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101020
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #10 from luoxhu at gcc dot gnu.org ---
float128 to vector __int128 is fixed by:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f700e4b0ee3ef53b48975cf89be26b9177e3a3f3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #9 from luoxhu at gcc dot gnu.org ---
Patch sent, it could fix the __float128 to vector __int128 issue,
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571689.html
But for __float128 to __int128 mentioned in #c4, need hack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94613
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142
--- Comment #12 from luoxhu at gcc dot gnu.org ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
--- Comment #17 from luoxhu at gcc dot gnu.org ---
If the constant limitation is removed, it could be combined successfully with
my new patch for PR94613.
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569255.html
And what do you mean
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
--- Comment #16 from luoxhu at gcc dot gnu.org ---
> +2016-11-09 Segher Boessenkool
> +
> + * simplify-rtx.c (simplify_binary_operation_1): Simplify
> + (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)) and
>
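The simplification named in the quoted ChangeLog entry, (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)), can be sanity-checked with a small C sketch. The function names below are hypothetical and only illustrate the bitwise identity; they are not part of simplify-rtx.c.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side of the identity: ((A ^ B) & C) ^ B.  */
uint64_t
xor_and_xor (uint64_t a, uint64_t b, uint64_t c)
{
  return ((a ^ b) & c) ^ b;
}

/* Right-hand side of the identity: (A & C) | (B & ~C).  */
uint64_t
ior_of_ands (uint64_t a, uint64_t b, uint64_t c)
{
  return (a & c) | (b & ~c);
}

/* Every bit position is independent, so trying all 8 combinations of
   all-zeros / all-ones operands covers every possible per-bit case.  */
void
check_identity (void)
{
  for (unsigned bits = 0; bits < 8; bits++)
    {
      uint64_t a = (bits & 1) ? ~UINT64_C (0) : 0;
      uint64_t b = (bits & 2) ? ~UINT64_C (0) : 0;
      uint64_t c = (bits & 4) ? ~UINT64_C (0) : 0;
      assert (xor_and_xor (a, b, c) == ior_of_ands (a, b, c));
    }
}
```

Both forms select A where the corresponding bit of C is set and B where it is clear, which is why the two expressions agree.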
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142
--- Comment #10 from luoxhu at gcc dot gnu.org ---
If not built with fast-math, gimple_has_side_effects will return true and cause
expand_call_stmt to fail to expand the "_1 = fmod (x_2(D), y_3(D));" to an
internal function. X86 also pr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
--- Comment #15 from luoxhu at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #14)
> (In reply to luoxhu from comment #12)
> > That code was called by combine pass but fail to match.
>
> >
> > pr new
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
--- Comment #12 from luoxhu at gcc dot gnu.org ---
That code was called by the combine pass but failed to match.
pr newpat
(set (reg:DI 125 [ l ])
(xor:DI (and:DI (xor:DI (reg/v:DI 120 [ l ])
(reg:DI 127))
(const_int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
--- Comment #11 from luoxhu at gcc dot gnu.org ---
I noticed that you added the below optimization with commit
a62436c0a505155fc8becac07a8c0abe2c265bfe. But it doesn't even handle this case,
cse1 pass will call simplify_binary_operation_1,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
--- Comment #9 from luoxhu at gcc dot gnu.org ---
Then we could optimize it in match.pd
diff --git a/gcc/match.pd b/gcc/match.pd
index 036f92fa959..8944312c153 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3711,6 +3711,17
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
CC||luoxhu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718
luoxhu at gcc dot gnu.org changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718
--- Comment #19 from luoxhu at gcc dot gnu.org ---
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567395.html
This patch extends variable vec_insert to all 32bit VSX targets including
Power7{BE} {32,64}, Power8{BE}{32, 64}, Power8{LE}{64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718
--- Comment #15 from luoxhu at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #14)
> You still have:
> if (VECTOR_MEM_VSX_P (mode))
> {
> if (!CONST_INT_P (elt_rtx))
> {
> if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99718
--- Comment #13 from luoxhu at gcc dot gnu.org ---
Performance data in #c11 is for int variable vec_insert of 32bit mode, the
float variable vec_insert of 32-bit is a bit slower but much better than
original(extra stfs+lwz of insn #17 and insn 18