from:"luc.vanoostenryck at gmail dot com via Gcc-bugs"

[Bug tree-optimization/102486] __builtin_popcount(y&-y) is not optimized to 1

2021-09-26 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102486

Luc Van Oostenryck changed:

           What    |Removed                     |Added

                 CC|                            |luc.vanoostenryck at gmail dot com

--- Comment #1 from Luc Van Oostenryck  ---
when y != 0
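
The condition can be checked directly: `y & -y` keeps only the lowest set bit of `y`, so the population count is 1 for every non-zero `y` and 0 for `y == 0`. A minimal sketch, assuming GCC or Clang for `__builtin_popcount`; the helper names are hypothetical and unsigned arithmetic is used so that the negation is well defined:

```c
#include <assert.h>

/* Hypothetical helper, not from the report: counts the bits of y & -y.
 * y & -y isolates the lowest set bit of y. */
static int lowest_bit_popcount(unsigned y)
{
    return __builtin_popcount(y & -y);
}

/* Checks the identity on a few sample values. */
static void check_lowest_bit_popcount(void)
{
    assert(lowest_bit_popcount(0u) == 0);          /* no bit to isolate */
    assert(lowest_bit_popcount(1u) == 1);
    assert(lowest_bit_popcount(12u) == 1);         /* 12 & -12 == 4 */
    assert(lowest_bit_popcount(0x80000000u) == 1); /* top bit only */
}
```

So the fold to a constant 1 is only valid when the compiler can prove `y` is non-zero.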

[Bug rtl-optimization/100377] needless stack adjustment when passing struct in register

2021-05-02 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100377

--- Comment #3 from Luc Van Oostenryck  ---
> I thought there was one which I filed which is much older than those but I
> can't find it.

Probably also related to PR36409 and PR49157

[Bug rtl-optimization/100378] New: [Regression 9/10/11/12] arm64: lsl + asr used instead of sxth

2021-05-01 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100378

            Bug ID: 100378
           Summary: [Regression 9/10/11/12] arm64: lsl + asr used instead of sxth
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---

Created attachment 50727
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50727&action=edit
testcase

On arm64, when compiling with optimization, for example with -O2,
the following code:

struct sh {
short a;
short b;
short y[2];
};
int fooh(struct sh s) { return s.a; }

produces the following assembly code since GCC9.x:
fooh:
lsl x0, x0, 16
asr w0, w0, 16
ret

but with GCC8.x and before it produces the shorter:
fooh(sh):
sxth    w0, w0
ret


See https://gcc.godbolt.org/z/YrW7E3cro
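
Both sequences compute the same value: a left shift by 16 followed by an arithmetic right shift by 16 is a sign extension of the low 16 bits, which is what `sxth` does in one instruction. A rough C model of that equivalence, with hypothetical helper names; the shift version assumes GCC's documented behaviour for out-of-range unsigned-to-signed conversion and for `>>` on negative values:

```c
#include <assert.h>
#include <stdint.h>

/* Reference sign extension of the low 16 bits (what sxth computes),
 * written with portable arithmetic. */
static int32_t sext16_ref(uint32_t w)
{
    int32_t low = (int32_t)(w & 0xffffu);
    return (low ^ 0x8000) - 0x8000;
}

/* Model of the lsl #16 / asr #16 pair. Assumes that the conversion to
 * int32_t wraps and that >> on a negative value is an arithmetic shift,
 * as GCC documents. */
static int32_t sext16_shifts(uint32_t w)
{
    return (int32_t)(w << 16) >> 16;
}

/* Compares both models over every possible value of the low half. */
static void check_sext16(void)
{
    for (uint32_t low = 0; low <= 0xffffu; low++) {
        assert(sext16_ref(low) == sext16_shifts(low));
        /* The upper half (s.b in the report) must not influence the result. */
        assert(sext16_ref(low | 0xabcd0000u) == sext16_shifts(low));
    }
}
```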

[Bug rtl-optimization/100377] New: needless stack adjustment when passing struct in register

2021-05-01 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100377

            Bug ID: 100377
           Summary: needless stack adjustment when passing struct in register
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---

Created attachment 50726
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50726&action=edit
testcases

When compiling with optimization (for example -O2), the following code:

struct sb {
signed char a;
char b;
short y[3];
};
struct ub {
unsigned char a;
char b;
short y[3];
};
int fsb(struct sb s) { return s.a; }
int fub(struct ub s) { return s.a; }

produces the following assembly code on arm64:
fsb:
sub sp, sp, #16
sxtb    w0, w0
add sp, sp, 16
ret
fub:
sub sp, sp, #16
and w0, w0, 255
add sp, sp, 16
ret

the following on mips64:
fsb:
daddiu  $sp,$sp,-16
dsll    $2,$4,56
dsra    $2,$2,56
j   $31
daddiu  $sp,$sp,16

fub:
daddiu  $sp,$sp,-16
andi    $2,$4,0xff
j   $31
daddiu  $sp,$sp,16

the following on riscv64:
fsb:
addi    sp,sp,-16
slli    a0,a0,24
srai    a0,a0,24
addi    sp,sp,16
jr  ra
fub:
addi    sp,sp,-16
andi    a0,a0,0xff
addi    sp,sp,16
jr  ra

OTOH, things seem OK on ppc64:
fsb:
extsb 3,3
blr
fub:
rlwinm 3,3,0,0xff
blr

and x86_64:
fsb:
movsx   eax, dil
ret
fub:
movzx   eax, dil
ret


Similar problems happen on 32-bit platforms too.
For example on arm32, the following code:
struct ub32 {
unsigned char a;
char b;
short y[1];
};
int fub32(struct ub32 s) { return s.a; }

produces:
fub32:
sub sp, sp, #8
uxtb    r0, r0
add sp, sp, #8
bx  lr


All these seem to happen on all versions.
See https://gcc.godbolt.org/z/x9zc1EnYn

Note: similar PRs exist but reported for x86_64 only

[Bug target/100075] [9/10 Regression] unneeded sign extension

2021-04-16 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100075

--- Comment #4 from Luc Van Oostenryck  ---
(In reply to Jakub Jelinek from comment #3)
> Fixed on the trunk.  Probably shouldn't be backported.

Works great here. Thanks.

[Bug target/100056] [9/10 Regression] orr + lsl vs. [us]bfiz

2021-04-15 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100056

--- Comment #11 from Luc Van Oostenryck  ---
Works nicely now.
Thank you.

[Bug target/100028] [9/10 Regression] arm64 failure to generate bfxil

2021-04-15 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100028

--- Comment #8 from Luc Van Oostenryck  ---
Works nicely now.
Thanks

[Bug target/100075] New: [9/10/11 Regression] unneeded sign extension

2021-04-13 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100075

            Bug ID: 100075
           Summary: [9/10/11 Regression] unneeded sign extension
           Product: gcc
           Version: 10.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---
            Target: aarch64

Created attachment 50588
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50588&action=edit
test case

Until gcc8, the following code:
struct s {
short x, y;
};
struct s rot90(struct s p)
{
return (struct s) { -p.y, p.x };
}

was translated to:
rot90:
neg w1, w0, asr 16
and w1, w1, 65535
orr w0, w1, w0, lsl 16
ret

but since gcc9 it translates less nicely, with an unneeded sign extension:
rot90:
mov w1, w0
sbfx    x0, x1, 16, 16
neg w0, w0
bfi w0, w1, 16, 16
ret


See another variant in the attachment or at https://gcc.godbolt.org/z/1oW1cEMGc
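
One way to see why the extension is not needed: the result is truncated to 16 bits, and the low 16 bits of a negation depend only on the low 16 bits of the operand. The sketch below models the register image of `struct s` as a 32-bit word with `x` in bits 0-15 and `y` in bits 16-31 (an assumption about the calling convention, inferred from the assembly above); the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Model of the GCC 8 sequence: neg (asr 16), and 0xffff, orr (lsl 16). */
static uint32_t rot90_gcc8(uint32_t w)
{
    uint32_t hi = w >> 16;
    uint32_t y_sext = (hi ^ 0x8000u) - 0x8000u; /* y sign-extended to 32 bits */
    uint32_t neg = (0u - y_sext) & 0xffffu;     /* -p.y, truncated to 16 bits */
    return neg | (w << 16);                     /* p.x goes to the high half */
}

/* Same computation with y left zero-extended: no sign extension at all. */
static uint32_t rot90_no_sext(uint32_t w)
{
    uint32_t neg = (0u - (w >> 16)) & 0xffffu;
    return neg | (w << 16);
}

/* Compares both models for every y and a few values of x. */
static void check_rot90(void)
{
    static const uint32_t xs[] = { 0u, 1u, 0x7fffu, 0x8000u, 0xffffu };
    for (uint32_t y = 0; y <= 0xffffu; y++)
        for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++) {
            uint32_t w = (y << 16) | xs[i];
            assert(rot90_gcc8(w) == rot90_no_sext(w));
        }
}
```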

[Bug target/100072] New: [10/11 Regression] csel vs. csetm + and

2021-04-13 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100072

            Bug ID: 100072
           Summary: [10/11 Regression] csel vs. csetm + and
           Product: gcc
           Version: 10.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---
            Target: aarch64

Created attachment 50587
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50587&action=edit
testcase

The following code:
int sel_andn(int p, int a) { return (p ? ~0 : 0) & a; }
int sel_andr(int p, int a) { return (p ? 0 : ~0) & a; }

was translated to the following with GCC9 and before:
sel_andn:
cmp w0, 0
csel    w0, w1, wzr, ne
ret
sel_andr:
cmp w0, 0
csel    w0, w1, wzr, eq
ret

but since version 10 it translates into:
sel_andn:
cmp w0, 0
csetm   w0, ne
and w0, w0, w1
ret
sel_andr:
cmp w0, 0
csetm   w0, eq
and w0, w0, w1
ret

Same at https://gcc.godbolt.org/z/16fj1EYhx
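
The single `csel` is possible because masking with an all-ones or all-zeros value is just a select between `a` and 0. A small sketch of that equivalence; the `_csel` variants and the checker are hypothetical names added for illustration:

```c
#include <assert.h>

/* The two functions from the report. */
static int sel_andn(int p, int a) { return (p ? ~0 : 0) & a; }
static int sel_andr(int p, int a) { return (p ? 0 : ~0) & a; }

/* Select forms, matching what one cmp + csel computes. */
static int sel_andn_csel(int p, int a) { return p ? a : 0; }
static int sel_andr_csel(int p, int a) { return p ? 0 : a; }

/* Compares mask form and select form over a few sample values. */
static void check_sel(void)
{
    static const int samples[] = { 0, 1, -1, 42, -12345, 0x7fffffff };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++) {
            int p = samples[i], a = samples[j];
            assert(sel_andn(p, a) == sel_andn_csel(p, a));
            assert(sel_andr(p, a) == sel_andr_csel(p, a));
        }
}
```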

[Bug target/100056] [9/10/11 Regression] orr + lsl vs. [us]bfiz

2021-04-13 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100056

--- Comment #7 from Luc Van Oostenryck  ---
Created attachment 50585
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50585&action=edit
newer testcases (with 32 -> 64-bit extensions)

[Bug target/100056] [9/10/11 Regression] orr + lsl vs. [us]bfiz

2021-04-13 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100056

--- Comment #6 from Luc Van Oostenryck  ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 50583 [details]
> gcc11-pr100056.patch
> 
> Untested fix.

OTOH, for the signed case things seem to be OK unless the
sign extension is one of the register sizes (8, 16 & 32).

See the updated testcases in attachment.

[Bug target/100056] [9/10/11 Regression] orr + lsl vs. [us]bfiz

2021-04-13 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100056

--- Comment #5 from Luc Van Oostenryck  ---
Created attachment 50584
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50584&action=edit
updated test cases

[Bug target/100056] [9/10/11 Regression] orr + lsl vs. [us]bfiz

2021-04-13 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100056

--- Comment #4 from Luc Van Oostenryck  ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 50583 [details]
> gcc11-pr100056.patch
> 
> Untested fix.

Mmmm, that's working fine for the cases I had but not in
more general cases. I think that the constraint on the AND
may be too tight. For example, changing things slightly to
have a smaller mask:
int or_lsl_u3(unsigned i) {
i &= 7;
return i | (i << 11);
}

still gives:
or_lsl_u3:
and w1, w0, 7
ubfiz   w0, w0, 11, 3
orr w0, w0, w1
ret

while GCC8 gave the expected:
or_lsl_u3:
and w0, w0, 7
orr w0, w0, w0, lsl 11
ret

In fact, I would tend to think that the AND part should be
removed from your split pattern (some kind of zero-extension
seems to be needed to reproduce the problem but that's all).

[Bug target/100056] New: [9/10/11 Regression]

2021-04-12 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100056

            Bug ID: 100056
           Summary: [9/10/11 Regression]
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---
            Target: aarch64

Created attachment 50573
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50573&action=edit
or-shift vs. [us]bfiz

On arm64, the following code:
unsigned or_shift(unsigned char i)
{
return i | (i << 11);
}

translates to the following assembly:
or_shift:
and w1, w0, 255
ubfiz   w0, w0, 11, 8
orr w0, w0, w1
ret

where the ubfiz instruction is a bit odd, since the source maps directly
onto what gcc 8.x and earlier generated:
or_shift:
and w0, w0, 255
orr w0, w0, w0, lsl 11
ret

Same with a signed argument (see https://gcc.godbolt.org/z/af4zffMYa ).
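
Both sequences produce the same value; the newer one just needs an extra instruction and register. A rough C model, assuming that `ubfiz wd, wn, #lsb, #width` takes the low `width` bits of `wn`, shifts them left by `lsb` and zeroes the rest; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Model of "ubfiz wd, wn, #lsb, #width". */
static uint32_t ubfiz(uint32_t wn, unsigned lsb, unsigned width)
{
    uint32_t mask = (width < 32) ? ((1u << width) - 1u) : ~0u;
    return (wn & mask) << lsb;
}

/* GCC 9+ sequence: and, ubfiz, orr. */
static uint32_t or_shift_gcc9(uint32_t w0)
{
    uint32_t w1 = w0 & 255u;
    w0 = ubfiz(w0, 11, 8);
    return w0 | w1;
}

/* GCC 8 sequence: and, then orr with a shifted operand. */
static uint32_t or_shift_gcc8(uint32_t w0)
{
    w0 &= 255u;
    return w0 | (w0 << 11);
}

/* Compares both sequences for every byte value, with and without
 * garbage in the upper bits of the incoming register. */
static void check_or_shift(void)
{
    for (uint32_t i = 0; i < 256u; i++) {
        assert(or_shift_gcc9(i) == or_shift_gcc8(i));
        assert(or_shift_gcc9(i | 0xdead0000u) == or_shift_gcc8(i));
    }
}
```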

[Bug target/100028] [9/10/11 Regression] arm64 failure to generate bfxil

2021-04-12 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100028

--- Comment #5 from Luc Van Oostenryck  ---
(In reply to Jakub Jelinek from comment #4)
> Created attachment 50571 [details]
> gcc11-pr100028.patch
> 
> Untested fix.

This solves the few cases I had.
Thanks.

[Bug rtl-optimization/100046] New: compare with itself

2021-04-12 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100046

            Bug ID: 100046
           Summary: compare with itself
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---

Created attachment 50569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50569&action=edit
compare with itself

The attached file is reproduced here:

int b3_06(int x, int y, int z) {
  int a = (x | z) ^ (y | z);
  int b = (x ^ y) & ~z;
  return a == b;
}

The generated assembly for arm64 is:
b3_06:
eor w3, w1, w0
bic w3, w3, w2
cmp w3, w3
cset    w0, eq
ret

So, GCC is able to see that both expressions are equivalent. Nice.
But then there is this compare with itself :(

The problem seems to have always existed, on all targets (see
https://gcc.godbolt.org/z/qrYWsznof ).
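
Since `(x | z) ^ (y | z)` and `(x ^ y) & ~z` agree bit by bit (both are 0 where `z` is set and `x ^ y` elsewhere), the whole function should fold to a constant 1. A quick check over a small range of inputs; the checker name is hypothetical:

```c
#include <assert.h>

/* The function from the report, unchanged. */
static int b3_06(int x, int y, int z)
{
    int a = (x | z) ^ (y | z);
    int b = (x ^ y) & ~z;
    return a == b;
}

/* The identity is bitwise, so a small range of positive and negative
 * values already exercises every combination of bits. */
static void check_b3_06(void)
{
    for (int x = -16; x <= 16; x++)
        for (int y = -16; y <= 16; y++)
            for (int z = -16; z <= 16; z++)
                assert(b3_06(x, y, z) == 1);
}
```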

[Bug target/100028] New: arm64 failure to generate bfxil

2021-04-10 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100028

            Bug ID: 100028
           Summary: arm64 failure to generate bfxil
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luc.vanoostenryck at gmail dot com
  Target Milestone: ---
            Target: aarch64

Created attachment 50555
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50555&action=edit
should generate bfxil but doesn't

The attached code is reproduced here:

#define W   3
#define L   11

int bfxil(int d, int s)
{
int wmask = (1 << W) - 1;
return (d & ~wmask) | ((s >> L) & wmask);
}

should compile to:
bfxil:
bfxil   w0, w1, 11, 3
ret

but instead compiles to:
bfxil:
ubfx    x1, x1, 11, 3
and w0, w0, -8
orr w0, w1, w0
ret

The problem is still present in trunk, was also present in 9.3 but wasn't in
GCC 8.2 (see https://gcc.godbolt.org/z/E6z31hr9r ).
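
The C function is a direct description of what `bfxil` does: extract `width` bits of the source starting at `lsb` and insert them into the low bits of the destination, leaving the rest untouched. A sketch comparing the source function with a bit-by-bit model of the instruction; the model, the checker and the renamed macros (`BF_WIDTH` for `W`, `BF_LSB` for `L`) are hypothetical, and `>>` on a negative `int` is assumed to be an arithmetic shift as in GCC:

```c
#include <assert.h>
#include <stdint.h>

#define BF_WIDTH 3   /* W in the report */
#define BF_LSB   11  /* L in the report */

/* The function from the report, with the macros renamed. */
static int bfxil_src(int d, int s)
{
    int wmask = (1 << BF_WIDTH) - 1;
    return (d & ~wmask) | ((s >> BF_LSB) & wmask);
}

/* Bit-by-bit model of "bfxil wd, wn, #lsb, #width": copy bits
 * lsb .. lsb+width-1 of wn into bits 0 .. width-1 of wd. */
static uint32_t bfxil_model(uint32_t wd, uint32_t wn,
                            unsigned lsb, unsigned width)
{
    for (unsigned i = 0; i < width; i++) {
        uint32_t bit = (wn >> (lsb + i)) & 1u;
        wd = (wd & ~(1u << i)) | (bit << i);
    }
    return wd;
}

/* Compares the source function with the model on a few samples. */
static void check_bfxil(void)
{
    static const int samples[] = { 0, 1, -1, 0x3800, 0x12345678, -0x12345678 };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++) {
            int d = samples[i], s = samples[j];
            assert((uint32_t)bfxil_src(d, s) ==
                   bfxil_model((uint32_t)d, (uint32_t)s, BF_LSB, BF_WIDTH));
        }
}
```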

[Bug c/92935] typeof() on an atomic type doesn't always return the corresponding unqualified type

2021-03-16 Thread luc.vanoostenryck at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92935

Luc Van Oostenryck changed:

           What    |Removed                     |Added

         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #5 from Luc Van Oostenryck  ---
The inconsistency is now fixed, thanks to
commit r11-5397-g768ce4f0ceb030e38427e85e483ed44330cd5da7

18 matches
