https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921
--- Comment #1 from Siarhei Volkau ---
Also take into account examples like this:
uint32_t high_const_and_compare(uint32_t x)
{
    if ((x & 0x7000) == 0x3000)
        return do_some();
    return do_other();
}
It might be profitable to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115922
Bug ID: 115922
Summary: Missed optimization: MIPS: clear bit 15
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921
Bug ID: 115921
Summary: Missed optimization: and->ashift might be cheaper than
ashift->and on typical RISC targets
Product: gcc
Version: 15.0
Status: UNCONFIRMED
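The testcase for this report is not shown in the listing. As a minimal sketch of the identity the summary refers to, assuming a target whose AND instruction takes only a 16-bit immediate (as MIPS `andi` does), the two orderings below compute the same value but need masks of different sizes. The function names and the constants are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Shift first, then mask. The mask 0x3fc00 does not fit in a 16-bit
 * immediate, so a target like MIPS has to materialize it in a register. */
uint32_t shift_then_and(uint32_t x)
{
    return (x << 10) & 0x3fc00u;
}

/* Mask first, then shift. The mask 0xff fits in a 16-bit immediate,
 * and the result is identical because 0x3fc00 == 0xff << 10. */
uint32_t and_then_shift(uint32_t x)
{
    return (x & 0xffu) << 10;
}
```

Whether the second form is actually cheaper depends on the target's cost model; the sketch only shows that the rewrite is value-preserving.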
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115505
Bug ID: 115505
Summary: missing optimization: thumb1 use ldmia/stmia for load
store DI/DF data when possible
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #16 from Siarhei Volkau ---
Might it be that LoongArch has a register reuse dependency?
I observed similar behavior on XBurst with a load/store/reuse pattern:
e.g. this code
LW $v0, 0($t1) # XBurst load latency is 4 but it has
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #15 from Siarhei Volkau ---
Created attachment 58437
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58437&action=edit
application to test performance of shift
Here is the test application (MIPS32 specific) I wrote.
It allows to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #12 from Siarhei Volkau ---
Most likely it's because of a data dependency, and not the direct cost of shift
operations on LoongArch, although I can't find information to prove that.
So, I guess it still might get performance benefit in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #8 from Siarhei Volkau ---
Created attachment 58377
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58377&action=edit
condmove testcase
Tested with current GCC master branch:
- Work with -Os confirmed.
- Condmove issue present in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #6 from Siarhei Volkau ---
Well, it mostly works well.
However, it still has issues, addressed in my patch:
1) Doesn't work for -Os: most likely a costing issue.
2) Breaks condmoves, as mine does. I have no idea how to avoid
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
--- Comment #3 from Siarhei Volkau ---
I know that the patch breaks condmove cases, that's why it is silly.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112474
--- Comment #1 from Siarhei Volkau ---
Minimal example to showcase the issue:
#include <stdint.h>
uint64_t mthi_example(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint64_t ret;
    ret = (uint64_t)a * b + (uint64_t)c * d + 1u;
    return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112474
Bug ID: 112474
Summary: MIPS: missed optimization for assigning HI reg to zero
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112398
--- Comment #3 from Siarhei Volkau ---
Well, let's rewrite it this way:
void neg8 (uint8_t *restrict dst, const uint8_t *restrict src)
{
    uint8_t work = ~*src; // or *src ^ 0xff;
    dst[0] = (work >> 4) | (work << 4);
}
Wherever upper
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112398
Bug ID: 112398
Summary: Suboptimal code generation for xor pattern on subword
data
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
--- Comment #3 from Siarhei Volkau ---
I don't think that it is a duplicate of bug 104387 because there's only one
store.
And this bug simply disappears if we change the source code a bit.
e.g.
- change (int8_t)*src; to *(int8_t*)src;
or
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384
Siarhei Volkau changed:
What      |Removed |Added
Resolution|---     |DUPLICATE
Status    |NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104387
--- Comment #4 from Siarhei Volkau ---
*** Bug 111384 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835
Bug ID: 111835
Summary: Suboptimal codegen: zero extended load instead of sign
extended one
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111626
Bug ID: 111626
Summary: missed optimization combining offset of array member
in struct with offset inside the array
Product: gcc
Version: unknown
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384
--- Comment #2 from Siarhei Volkau ---
Well, here is what godbolt says with -O2 -fomit-frame-pointer.
ARM:
    uxth r0, r0 @ << zero extend
    strh r0, [r1]
    strh r0, [r2]
    bx lr
ARM64:
and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384
Bug ID: 111384
Summary: missed optimization: GCC adds extra any extend when
storing subreg#0 multiple times
Product: gcc
Version: unknown
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111378
Bug ID: 111378
Summary: Missed optimization for comparing with exact_log2
constants
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
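The body of this report is not included in the listing. As a sketch of the kind of rewrite the summary suggests, assuming the constant is an exact power of two that is costly to materialize on the target, an unsigned comparison against 2^n can be expressed with a shift instead. The function names and the choice of n = 16 are hypothetical.

```c
#include <stdint.h>

/* Direct comparison: needs the constant 0x10000 in a register on
 * targets whose compare immediates are limited to 16 bits. */
int below_pow2_cmp(uint32_t x)
{
    return x < 0x10000u;
}

/* Equivalent form: x < 2^16 holds exactly when no bit at position 16
 * or above is set, i.e. when x >> 16 is zero. */
int below_pow2_shift(uint32_t x)
{
    return (x >> 16) == 0;
}
```

Both functions return 1 for values up to 0xffff and 0 from 0x10000 upward; which one is cheaper is target-dependent.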
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376
Bug ID: 111376
Summary: missed optimization of one bit test on MIPS32r1
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
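The original testcase is not part of this listing. As a hedged sketch of a one-bit test, assuming the tested bit lies above bit 15 so that its mask does not fit a 16-bit AND immediate, the bit can instead be shifted into the top position, where a branch on the sign bit could examine it. The function names and the parameter k are hypothetical.

```c
#include <stdint.h>

/* Mask form: for k > 15 the mask (1 << k) does not fit a 16-bit
 * immediate and must be loaded separately. Requires 0 <= k <= 31. */
int bit_test_mask(uint32_t x, unsigned k)
{
    return (x & (UINT32_C(1) << k)) != 0;
}

/* Shift form: move bit k to bit 31, then extract it. On a target with
 * a branch-on-sign instruction the final shift could be replaced by
 * that branch. Requires 0 <= k <= 31. */
int bit_test_shift(uint32_t x, unsigned k)
{
    return (int)((x << (31u - k)) >> 31);
}
```

The two functions agree for every x and every k in 0..31; the sketch makes no claim about what GCC emits for either.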
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60749
Siarhei Volkau changed:
What|Removed |Added
CC  |        |lis8215 at gmail dot com
--- Comment