from:"\"hoganmeier at gmail dot com\""

[Bug tree-optimization/87621] New: auto-vectorization fails for exponentiation code

2018-10-16 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

Bug ID: 87621
   Summary: auto-vectorization fails for exponentiation code
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/bgieBT

template <typename T>
T pow(T x, unsigned int n)
{
if (!n)
return 1;

T y = 1;
while (n > 1)
{
if (n%2)
y *= x;
x = x*x; // unsupported use in stmt
n /= 2;
}
return x*y;
}

void testVec(int* x)
{
// loop nest containing two or more consecutive inner loops cannot be vectorized
for (int i = 0; i < 8; ++i)
x[i] = pow(x[i], 10);
}

[Bug tree-optimization/87621] auto-vectorization fails for exponentiation code

2018-10-16 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

--- Comment #1 from krux  ---
Interestingly it happily unrolls the loop even with -fno-unroll-loops.

[Bug rtl-optimization/84101] [7/8/9 Regression] -O3 and -ftree-vectorize trying too hard for function returning trivial pair-of-uint64_t-structure

2018-10-16 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84101

--- Comment #4 from krux  ---
Also happens with pairs of floats:
https://godbolt.org/z/QrP0VD

[Bug tree-optimization/87621] outer loop auto-vectorization fails for exponentiation code

2018-10-16 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87621

--- Comment #3 from krux  ---
Yes, see the godbolt link.
clang compiles it down to a few vpmulld's.
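
For reference, a minimal sketch of what the exponent-10 case boils down to once the inner while loop is resolved by hand. The names pow10_unrolled and testVecUnrolled are hypothetical and not part of the report, and whether gcc vectorizes this form was not checked here.

```cpp
// Hypothetical helper (not from the report): x^10 as a fixed chain of four
// multiplications, which is what the generic loop computes for n == 10.
int pow10_unrolled(int x)
{
    const int x2 = x * x;    // x^2
    const int x4 = x2 * x2;  // x^4
    const int x8 = x4 * x4;  // x^8
    return x8 * x2;          // x^10
}

// Same outer loop as testVec in the report, but without an inner loop.
void testVecUnrolled(int* x)
{
    for (int i = 0; i < 8; ++i)
        x[i] = pow10_unrolled(x[i]);
}
```

The four multiplications match the "few vpmulld's" mentioned above, one vector multiply per step of the chain.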

[Bug c++/63149] wrong auto deduction from braced-init-list

2019-01-26 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63149

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #3 from krux  ---
Still fails on trunk.

[Bug lto/90369] New: error: could not unlink output file

2019-05-06 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90369

Bug ID: 90369
   Summary: error: could not unlink output file
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Tested this ARM toolchain:
http://www.freddiechopin.info/en/download/category/11-bleeding-edge-toolchain
In a very specific case I get the aforementioned error: could not unlink output
file

yield.cpp:
void yield() {}

main.cpp:
void yield();
int main() { yield(); }

arm-none-eabi-g++ -o obj/main.cpp.o -c -flto -g -nostdlib -O2  main.cpp
arm-none-eabi-g++ -o obj/yield.cpp.o -c -flto -g -nostdlib -O2  yield.cpp
arm-none-eabi-gcc-ar rc obj/libFrameworkArduino.a obj/main.cpp.o obj/yield.cpp.o
arm-none-eabi-g++ -o obj/firmware.elf -T empty.ld -Wl,--gc-sections -O2 -save-temps obj/libFrameworkArduino.a

arm-none-eabi/bin/ld.exe: error: could not unlink output file

If you remove either the -g or the -save-temps flag, merge the code into one
file, or link the object files directly, it works.

[Bug lto/90369] error: could not unlink output file

2019-05-07 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90369

--- Comment #4 from krux  ---
The code was automatically reduced, hence the empty linker script.
Looks promising, seems like you found the cause.

[Bug debug/90441] New: [9 regression] corrupt debug info with LTO

2019-05-12 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

Bug ID: 90441
   Summary: [9 regression] corrupt debug info with LTO
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: needs-bisection
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

Originally occurred with arm-gcc 9.1
Reproduced it with Ubuntu 19.04 gcc 9.0, works with gcc 8.3.
Couldn't reduce it further.

mk20dx128.c:
__attribute__ ((section(".vectors"), used))
_VectorsFlash[100]=
{
};

main.cpp:
void yield();
int main()
{
yield();
}

yield.cpp:
int serial3_available() {}

struct HardwareSerial3 {
int available() { serial3_available(); }
};
HardwareSerial3 Serial3;

void yield()
{
  serial3_available();
}

script.ld:
MEMORY
{
FLASH (rx) : ORIGIN = 0x, LENGTH = 4K
}

SECTIONS
{
.text : {
. = 0;
KEEP(*(.vectors))
*(.text*)
} > FLASH = 0xFF
}

gcc-9 -o mk20dx128.c.o -c -flto -g -ffunction-sections -fdata-sections -nostdlib -O2 teensy3/mk20dx128.c
g++-9 -o main.cpp.o -c -fno-exceptions -fno-rtti -flto -g -ffunction-sections -fdata-sections -nostdlib -O2 teensy3/main.cpp
g++-9 -o yield.cpp.o -c -fno-exceptions -fno-rtti -flto -g -ffunction-sections -fdata-sections -nostdlib -O2 yield.cpp
g++-9 -o firmware.elf -g -T script.ld -Wl,--gc-sections,--relax -O2 main.cpp.o mk20dx128.c.o yield.cpp.o
nm -ClS --radix=d --size-sort firmware.elf

0224 0400 T _VectorsFlashnm: DWARF error: could not
find abbrev number 8

If you remove the 'HardwareSerial3 Serial3;' line the error becomes
DWARF error: info pointer extends beyond end of attributes

[Bug debug/90441] [9 regression] corrupt debug info with LTO

2019-05-12 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #1 from krux  ---
Created attachment 46343
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46343&action=edit
llvm-dwarfdump --verify output

FWIW llvm-dwarfdump --verify shows the same errors for both versions, but for
gcc-9 it can't resolve the actual strings in the DW_AT_abstract_origin lines.

[Bug debug/90441] [9 regression] corrupt debug info with LTO

2019-05-12 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #2 from krux  ---
By the way, with 8.3 there is no DWARF error, but nm -l does not show any file
location for _VectorsFlash either.

[Bug driver/90443] New: -flto=n on Windows results in CreateProcess: No such file or directory

2019-05-12 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90443

Bug ID: 90443
   Summary: -flto=n on Windows results in CreateProcess: No such
file or directory
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

Just using a dummy source:
extern "C"
void _start()
{}

$ arm-none-eabi-g++ -O3 -flto=2 main.cpp -nostdlib -o firmware.elf -v
lto-wrapper.exe: fatal error: CreateProcess: No such file or directory

Very unhelpful. -v lifts the curtain:

make -f Temp\ccwaSVX1.mk -j2 all
lto-wrapper.exe: fatal error: CreateProcess: No such file or directory

There is no make, esp. in arm-gcc distributions.
The error message should be improved.

[Bug debug/90441] [9 regression] corrupt debug info with LTO

2019-05-12 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #3 from krux  ---
Finally tried qemu+gdb on the original code:
gdb-8.2.1/gdb/dwarf2read.c:9715: internal-error: void
dw2_add_symbol_to_list(symbol*, pending**): Assertion `(*listhead) == NULL ||
(SYMBOL_LANGUAGE ((*listhead)->symbol[0]) == SYMBOL_LANGUAGE (symbol))' failed.

[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO

2019-05-14 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #18 from krux  ---
(In reply to Iain Sandoe from comment #14)
> current trunk (27), manual regeneration of the
> firmware.elf.ltrans0.ltrans.o ->
> 
> (it's kinda frustrating that one can't see the link line, more tweaks are
> still needed to help debug LTO

Tell me about it. The first time I tried -save-temps I expected
firmware.elf.ltrans0.o to be compiled from firmware.elf.ltrans0.s of course.
But it's not, -v shows the .s file is compiled to firmware.elf.ltrans0.ltrans.o
and I still don't really know what the other one is.
Some command lines seem to be missing from the verbose output (and the ones
that are there are hard to find, maybe some color could help), and the
temporary files are already gone.

[Bug lto/90523] New: lto1 segfault in arm_parse_cpu_option_name

2019-05-17 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90523

Bug ID: 90523
   Summary: lto1 segfault in arm_parse_cpu_option_name
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Built a bleeding-edge arm-gcc toolchain. It works fine but when I tried newlib
built with -flto I got a crash in lto1.

$ arm-none-eabi-g++ -o main.elf -Wl,--relax -mthumb -mcpu=cortex-m4 -O3

during IPA pass: icf
In function '__retarget_lock_acquire_recursive':
lto1: internal compiler error: Segmentation fault

#0  __strchr_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:57
#1  0x014de71a in strchr (__c=43, __s=0x0) at /usr/include/string.h:220
#2  arm_parse_cpu_option_name (list=0x1ab3400 ,
optname=optname@entry=0x18704ba "-mcpu", target=0x0,
complain=complain@entry=true)
at gcc-10-20190512/gcc/common/config/arm/arm-common.c:349
#3  0x00f8545d in arm_configure_build_target (target=0x1e7b500
, opts=0x7f3e2a00, opts_set=0x1e81100
,
warn_compatible=) at
gcc-10-20190512/gcc/config/arm/arm.c:3147
#4  0x00fa5b68 in arm_set_current_function (fndecl=) at
gcc-10-20190512/gcc/tree.h:3186
#5  0x0097da22 in invoke_set_current_function_hook
(fndecl=0x7f402400)
at gcc-10-20190512/gcc/function.c:4629
#6  0x00984a48 in invoke_set_current_function_hook
(fndecl=0x7f402400)
at gcc-10-20190512/gcc/function.c:4788
#7  allocate_struct_function (fndecl=0x7f402400, abstract_p=) at gcc-10-20190512/gcc/function.c:4742
#8  0x00afc5ed in input_function (ib_cfg=0x7ffed9c0,
ib=0x7ffed9a0, data_in=0x1f8c510, fn_decl=0x7f402400)
at gcc-10-20190512/gcc/lto-streamer-in.c:1066
#9  lto_read_body_or_constructor (file_data=0x7f3ec960, data=, node=, section_type=LTO_section_function_body)
at gcc-10-20190512/gcc/lto-streamer-in.c:1296
#10 0x0083d38b in cgraph_node::get_untransformed_body
(this=0x7f418708)
at gcc-10-20190512/gcc/cgraph.c:3570
#11 0x0144762f in ipa_icf::sem_function::init (this=0x1f61230) at
gcc-10-20190512/gcc/cgraph.h:2008
#12 0x01441d12 in
ipa_icf::sem_item_optimizer::parse_nonsingleton_classes
(this=this@entry=0x1eca870)
at gcc-10-20190512/gcc/ipa-icf.c:2776
#13 0x0144d730 in ipa_icf::sem_item_optimizer::execute (this=0x1eca870)
at gcc-10-20190512/gcc/ipa-icf.c:2577
#14 0x0144e9b7 in ipa_icf::ipa_icf_driver () at
gcc-10-20190512/gcc/ipa-icf.c:3698
#15 ipa_icf::pass_ipa_icf::execute (this=) at
gcc-10-20190512/gcc/ipa-icf.c:3745
#16 0x00b777ea in execute_one_pass (pass=0x1ec0940) at
gcc-10-20190512/gcc/passes.c:2473
#17 0x00b78517 in execute_ipa_pass_list (pass=0x1ec0940) at
gcc-10-20190512/gcc/passes.c:2913
#18 0x007ab461 in do_whole_program_analysis () at
gcc-10-20190512/gcc/context.h:48
#19 lto_main () at gcc-10-20190512/gcc/lto/lto.c:628
#20 0x00c472af in compile_file () at gcc-10-20190512/gcc/toplev.c:456
#21 0x0077b1e6 in do_compile () at gcc-10-20190512/gcc/toplev.c:2205
#22 toplev::main (this=this@entry=0x7ffedd86, argc=,
argc@entry=24, argv=, argv@entry=0x7ffede88)
at gcc-10-20190512/gcc/toplev.c:2340
#23 0x0077d9dc in main (argc=24, argv=0x7ffede88) at
gcc-10-20190512/gcc/main.c:39

I'm not sure how to reduce this.

[Bug lto/90523] lto1 segfault in arm_parse_cpu_option_name

2019-05-17 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90523

--- Comment #1 from krux  ---
So this one must be null:
https://github.com/gcc-mirror/gcc/blob/65af043/gcc/config/arm/arm.c#L3148

[Bug lto/90523] lto1 segfault in arm_parse_cpu_option_name

2019-05-17 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90523

--- Comment #3 from krux  ---
Possible, gcc was built with --disable-multilib --with-arch=armv7e-m
--with-mode=thumb --with-float=soft.
And if I replace -mcpu=cortex-m4 with -march=armv7e-m in my test command
there's no crash.

[Bug target/88013] can't vectorize rgb to grayscale conversion code

2019-05-26 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

--- Comment #9 from krux  ---
(In reply to ktkachov from comment #7)
> I tried current trunk (future GCC 9)
> GCC 9 learned to avoid excessive widening during vectorisation, which is
> what accounts for the large number of instructions you see.

Confirmed, the loop is now as described in comment #5 with trunk gcc.
Still with vshr+vmovn as mentioned by Ramana.

But by the way, the tail is completely unrolled, 15x the following, seems quite
excessive to me:

ldrb    ip, [r1, #1]    @ zero_extendqisi2
movs    r6, #151
ldrb    lr, [r1]        @ zero_extendqisi2
movs    r5, #77
ldrb    r7, [r1, #2]    @ zero_extendqisi2
movs    r4, #28
smulbb  ip, ip, r6
smlabb  lr, r5, lr, ip
add     ip, r3, #1
smlabb  r7, r4, r7, lr
cmp     ip, r2
asr     r7, r7, #8
strb    r7, [r0]
bge     .L1

assert(n >= 16) helps a bit, but n % 16 == 0 doesn't.
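
For anyone wanting to try such hints, here is a sketch of one way to state both conditions to the optimizer. The function name convert_hinted is hypothetical; the weights 77/151/28 and the shift by 8 are read off the assembly above, and the rest of the loop is an assumption because the original source is not reproduced in this thread. Whether gcc actually uses the n % 16 == 0 information is exactly what is in question here.

```cpp
#include <cstdint>

// Assumed reconstruction of the conversion loop; only the constants come
// from the assembly listing in this comment.
void convert_hinted(std::uint8_t* __restrict dest,
                    const std::uint8_t* __restrict src, int n)
{
    // Promise the optimizer that n is a positive multiple of 16.
    // Calling this with any other n is undefined behaviour.
    if (n < 16 || n % 16 != 0)
        __builtin_unreachable();

    for (int i = 0; i < n; ++i)
    {
        const int r = src[3 * i + 0];
        const int g = src[3 * i + 1];
        const int b = src[3 * i + 2];
        dest[i] = static_cast<std::uint8_t>((r * 77 + g * 151 + b * 28) >> 8);
    }
}
```

Unlike assert, the __builtin_unreachable hint stays in place when NDEBUG is defined, so release builds see the same assumption.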

[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO

2019-05-26 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #20 from krux  ---
Thanks, your patch worked!

Just fyi: llvm-dwarfdump doesn't understand call-site info:
https://bugs.llvm.org/show_bug.cgi?id=41846
Not sure if it's relevant.

[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO

2019-05-26 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #22 from krux  ---
I can also reproduce it without any linker script, simplified code:

int serial3_available() {}
struct HardwareSerial3 {
int available() { serial3_available(); }
};
HardwareSerial3 Serial3;

void yield() { serial3_available(); }
int main()
{
yield();
}

$ g++-9 -c -fno-exceptions -fno-rtti -flto -g -O2 main.cpp
$ g++-9 -o firmware.elf -g -O2 main.o
$ nm -ClS --radix=d --size-sort firmware.elf
4496 0001 T __libc_csu_fininm: DWARF error: could not
find abbrev number 8
00016424 0001 b completed.7374
8192 0004 R _IO_stdin_used
4160 0043 T _start
4400 0093 T __libc_csu_init


But other tools are fine in this case:
$ llvm-dwarfdump-8 --verify firmware.elf
No errors.
$ gdb firmware.elf
Reading symbols from firmware.elf...

[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO

2019-05-26 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #23 from krux  ---
But it's so fragile, touch any part of the code and the error disappears.
Like change serial3_available to void and you also get an additional symbol:
4160 0003 T main main.cpp:8

[Bug debug/90441] [9/10 Regression] corrupt debug info with LTO

2019-05-26 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90441

--- Comment #24 from krux  ---
objdump -dCrS also prints these errors.

It definitely fails to find the entry for main which is present according to
objdump --dwarf:
 <1>: Abbrev Number: 8 (DW_TAG_subprogram)
   DW_AT_external: 1
   DW_AT_name: (indirect string, offset: 0x1ab): main
   DW_AT_decl_file   : 1
   DW_AT_decl_line   : 8
   DW_AT_decl_column : 5
   DW_AT_type: <0xc3>

[Bug c/52981] Separate -Wpadded into two options

2019-05-29 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52981

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #6 from krux  ---
(In reply to Manuel López-Ibáñez from comment #4)
> This is quite easy to implement.

It's not as trivial as one might think.
There's some copy-paste code to disable the flag in various places (instead of
handling it inside if possible).


$ find -name '*.c' | xargs fgrep -nC2 warn_padded
./gcc/config/spu/spu.c-3959-  /* We know this is being padded and we want it
too.  It is an internal
./gcc/config/spu/spu.c-3960- type so hide the warnings from the user. */
./gcc/config/spu/spu.c:3961:  owp = warn_padded;
./gcc/config/spu/spu.c:3962:  warn_padded = false;
./gcc/config/spu/spu.c-3963-
./gcc/config/spu/spu.c-3964-  layout_type (record);
./gcc/config/spu/spu.c-3965-
./gcc/config/spu/spu.c:3966:  warn_padded = owp;
./gcc/config/spu/spu.c-3967-
./gcc/config/spu/spu.c-3968-  /* The correct type is an array type of one
element.  */
--
./gcc/config/tilegx/tilegx.c-340-  /* We know this is being padded and we want
it too.  It is an
./gcc/config/tilegx/tilegx.c-341- internal type so hide the warnings from
the user.  */
./gcc/config/tilegx/tilegx.c:342:  owp = warn_padded;
./gcc/config/tilegx/tilegx.c:343:  warn_padded = false;
./gcc/config/tilegx/tilegx.c-344-
./gcc/config/tilegx/tilegx.c-345-  layout_type (record);
./gcc/config/tilegx/tilegx.c-346-
./gcc/config/tilegx/tilegx.c:347:  warn_padded = owp;
./gcc/config/tilegx/tilegx.c-348-
./gcc/config/tilegx/tilegx.c-349-  /* The correct type is an array type of one
element.  */
--
./gcc/config/tilepro/tilepro.c-292-  /* We know this is being padded and we
want it too.  It is an
./gcc/config/tilepro/tilepro.c-293- internal type so hide the warnings from
the user.  */
./gcc/config/tilepro/tilepro.c:294:  owp = warn_padded;
./gcc/config/tilepro/tilepro.c:295:  warn_padded = false;
./gcc/config/tilepro/tilepro.c-296-
./gcc/config/tilepro/tilepro.c-297-  layout_type (record);
./gcc/config/tilepro/tilepro.c-298-
./gcc/config/tilepro/tilepro.c:299:  warn_padded = owp;
./gcc/config/tilepro/tilepro.c-300-
./gcc/config/tilepro/tilepro.c-301-  /* The correct type is an array type of
one element.  */
--
./gcc/fortran/trans-io.c-223-  /* -Wpadded warnings on these artificially
created structures are not
./gcc/fortran/trans-io.c-224- helpful; suppress them. */
./gcc/fortran/trans-io.c:225:  int save_warn_padded = warn_padded;
./gcc/fortran/trans-io.c:226:  warn_padded = 0;
./gcc/fortran/trans-io.c-227-  gfc_finish_type (t);
./gcc/fortran/trans-io.c:228:  warn_padded = save_warn_padded;
./gcc/fortran/trans-io.c-229-  st_parameter[ptype].type = t;
./gcc/fortran/trans-io.c-230-}
./gcc/tree-nested.c-3197-  /* In some cases the frame type will trigger the
-Wpadded warning.
./gcc/tree-nested.c-3198-This is not helpful; suppress it. */
./gcc/tree-nested.c:3199:  int save_warn_padded = warn_padded;
./gcc/tree-nested.c:3200:  warn_padded = 0;
./gcc/tree-nested.c-3201-  layout_type (root->frame_type);
./gcc/tree-nested.c:3202:  warn_padded = save_warn_padded;
./gcc/tree-nested.c-3203-  layout_decl (root->frame_decl, 0);
./gcc/tree-nested.c-3204-
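
To illustrate what the repeated save/clear/restore pattern in the grep output could be folded into, here is a sketch of a scoped guard. Everything in it is hypothetical: warn_padded is a local stand-in for the real global, and suppress_warn_padded and layout_internal_record are made-up names, not existing GCC code.

```cpp
// Stand-in for GCC's global flag; illustrative only.
static bool warn_padded = true;

// Hypothetical RAII helper: clears the flag on construction and restores
// the previous value on destruction.
class suppress_warn_padded
{
public:
    suppress_warn_padded() : saved_(warn_padded) { warn_padded = false; }
    ~suppress_warn_padded() { warn_padded = saved_; }
    suppress_warn_padded(const suppress_warn_padded&) = delete;
    suppress_warn_padded& operator=(const suppress_warn_padded&) = delete;

private:
    bool saved_;
};

// Stand-in for a call site that lays out an internal record.
bool layout_internal_record()
{
    suppress_warn_padded guard;
    return warn_padded;  // false while the guard is alive
}
```

Each of the five call sites above would then need a single declaration instead of three statements, and the flag is restored on every exit path.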

[Bug c++/68901] UBSan triggers false -Wpadded warning

2019-05-29 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68901

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #3 from krux  ---
Yeah the warning is for an internal data structure,
see .Lubsan_data: https://godbolt.org/z/hFo8dZ

[Bug c/52981] Separate -Wpadded into two options

2019-05-29 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52981

--- Comment #7 from krux  ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68901
is an example of missed -Wpadded suppression.

[Bug target/87076] -mcpu/-march not propagated through LTO bytecode (ice/segfault if arch flags do not match)

2019-05-30 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87076

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #5 from krux  ---
*** Bug 90523 has been marked as a duplicate of this bug. ***

[Bug target/90523] lto1 segfault in arm_parse_cpu_option_name

2019-05-30 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90523

krux  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from krux  ---
The callstacks are slightly different but probably it's still a duplicate.

*** This bug has been marked as a duplicate of bug 87076 ***

[Bug middle-end/82853] Optimize x % 3 == 0 without modulo

2019-05-30 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82853

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #34 from krux  ---
Also fixes the duplicate https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12849.
Can't close it though.

[Bug c++/68901] UBSan triggers false -Wpadded warning

2019-05-30 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68901

--- Comment #5 from krux  ---
-Wpadded currently only checks for input_location != BUILTINS_LOCATION
(stor-layout.c).
Maybe something like !DECL_ARTIFICIAL(rli->t) should be added there.

[Bug c++/68901] UBSan triggers false -Wpadded warning

2019-05-30 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68901

--- Comment #6 from krux  ---
Created attachment 46434
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46434&action=edit
proposed patch

[Bug c++/68901] UBSan triggers false -Wpadded warning

2019-05-30 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68901

--- Comment #7 from krux  ---
Created attachment 46435
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46435&action=edit
cleanup

The previous patch should also allow removing these hacks (untested).
Though TYPE_ARTIFICIAL wasn't set in any of these cases. Is that normal?

[Bug target/87650] New: suboptimal codegen for testing low bit

2018-10-18 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87650

Bug ID: 87650
   Summary: suboptimal codegen for testing low bit
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

int pow(int x, unsigned int n)
{
int y = 1;
while (n > 1)
{
auto m = n%2;
n = n/2;
if (m)
y *= x;
x = x*x;
}
return x*y;
}

produces
mov edx, esi
and edx, 1
test edx, edx

instead of just
test sil, 1

while clang chooses a branchless version:
https://godbolt.org/z/L6VUZ1

Interestingly gcc does use test sil,1 if you get rid of m:
godbolt.org/z/9oL1oc

Assembly analysis:
https://stackoverflow.com/a/52877279/594456
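
Since the variant without m is only available behind the link, here is a sketch of both forms side by side. The names pow_with_m and pow_without_m are hypothetical, and the second function is an assumption about what "get rid of m" means; the two compute the same result.

```cpp
// As in the report: the remainder is kept in a temporary m.
int pow_with_m(int x, unsigned int n)
{
    int y = 1;
    while (n > 1)
    {
        auto m = n % 2;
        n = n / 2;
        if (m)
            y *= x;
        x = x * x;
    }
    return x * y;
}

// Assumed variant without m: the low bit is tested directly in the branch.
int pow_without_m(int x, unsigned int n)
{
    int y = 1;
    while (n > 1)
    {
        if (n % 2)
            y *= x;
        x = x * x;
        n /= 2;
    }
    return x * y;
}
```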

[Bug tree-optimization/87913] New: max(n, 1) code generation

2018-11-06 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87913

Bug ID: 87913
   Summary: max(n, 1) code generation
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

unsigned int f(unsigned int num)
{
return num < 1 ? 1 : num;
}

int f2(int num)
{
return num < 1 ? 1 : num;
}

unsigned int g(unsigned int num)
{
return num + !num;
}

$ gcc -O3

f(unsigned int):
    mov     eax, edi
    test    edi, edi
    mov     edx, 1
    cmove   eax, edx
f2(int):
    test    edi, edi
    mov     eax, 1
    cmovg   eax, edi
g(unsigned int):
    xor     eax, eax
    test    edi, edi
    sete    al
    add     eax, edi

f and g could be:
f:
    test    edi, edi
    mov     eax, 1
    cmovne  eax, edi
g:
    cmp     edi, 1
    adc     edi, 0
    mov     eax, edi

https://godbolt.org/z/YJWjsQ
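
A small self-contained check that f and g are the same function for unsigned input, which is why one would hope for the same (short) code for both. The function bodies are copied from above; only the constexpr and the static_asserts are additions.

```cpp
// Same bodies as f and g in the report.
constexpr unsigned int f(unsigned int num)
{
    return num < 1 ? 1 : num;
}

constexpr unsigned int g(unsigned int num)
{
    return num + !num;
}

// Both map 0 to 1 and leave every other value unchanged.
static_assert(f(0) == 1 && g(0) == 1, "zero maps to one");
static_assert(f(7) == 7 && g(7) == 7, "non-zero values are unchanged");
```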

[Bug tree-optimization/87914] New: gcc fails to vectorize bitreverse code

2018-11-06 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87914

Bug ID: 87914
   Summary: gcc fails to vectorize bitreverse code
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

$ gcc -fopenmp-simd -O3 -march=haswell -fopt-info-vec-omp-optimized-missed

template <typename T>
T reverseBits(T x)
{
unsigned int s = sizeof(x) * 8;
T mask = ~T(0);
while ((s >>= 1) > 0)
{
mask ^= (mask << s);
x = ((x >> s) & mask) | ((x << s) & ~mask); // unsupported use in stmt
}
return x;
}

void test_reverseBits(unsigned* x)
{
#pragma omp simd aligned(x:32)
for (int i = 0; i < 16; ++i)
x[i] = reverseBits(x[i]); // couldn't vectorize loop
}

clang and icc vectorize this:
https://godbolt.org/z/ROJZGZ
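
For anyone who wants to check the routine outside godbolt, here is a standalone sketch. The template header is assumed to be a single type parameter T, and reverseBitsNaive is a hypothetical bit-by-bit reference added only for comparison.

```cpp
#include <climits>

// The routine from the report: swap halves, then bytes, nibbles, bit pairs
// and single bits, deriving each mask from the previous one.
template <typename T>
T reverseBits(T x)
{
    unsigned int s = sizeof(x) * CHAR_BIT;
    T mask = ~T(0);
    while ((s >>= 1) > 0)
    {
        mask ^= (mask << s);
        x = ((x >> s) & mask) | ((x << s) & ~mask);
    }
    return x;
}

// Hypothetical reference: move one bit per iteration, lowest bit first.
template <typename T>
T reverseBitsNaive(T x)
{
    T r = 0;
    for (unsigned int i = 0; i < sizeof(T) * CHAR_BIT; ++i)
    {
        r = static_cast<T>((r << 1) | (x & 1));
        x >>= 1;
    }
    return r;
}
```

For a 32-bit unsigned the masks go 0x0000FFFF, 0x00FF00FF, 0x0F0F0F0F, 0x33333333, 0x55555555, so the loop runs exactly five times.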

[Bug tree-optimization/87915] New: emit warning if (explicit) vectorization failed

2018-11-06 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87915

Bug ID: 87915
   Summary: emit warning if (explicit) vectorization failed
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

When using #pragma omp simd for explicit vectorization shouldn't it warn if
vectorization failed?

clang has -Wpass-failed for that:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044226.html

[Bug tree-optimization/87915] emit warning if (explicit) vectorization failed

2018-11-07 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87915

--- Comment #2 from krux  ---
Yeah I'm using -fopt-info for manual performance analysis but that can't be
enabled in the normal build as it's too noisy.
Furthermore a proper warning can be turned into an error to ensure that
developer expectations are met by the compiler.

[Bug target/87913] max(n, 1) code generation

2018-11-07 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87913

--- Comment #2 from krux  ---
The case of function g is quite interesting because of the data dependencies
and adc's latency:
https://godbolt.org/z/0V8Dlx

[Bug middle-end/50481] builtin to reverse the bit order

2018-11-08 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50481

--- Comment #4 from krux  ---
+1
The builtins already produce better code than a generic bitreverse
implementation:
https://godbolt.org/z/Um2Tit

But using special hardware instructions automatically is even more important
imho.
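
A sketch of the builtin-assisted approach, assuming the builtins meant here are the byte-swap ones such as __builtin_bswap32 (the linked example is not reproduced in this message). The function name reverseBits32 is hypothetical.

```cpp
#include <cstdint>

// Reverse the byte order with the builtin, then reverse the bits inside
// each byte with three mask-and-shift steps.
inline std::uint32_t reverseBits32(std::uint32_t x)
{
    x = __builtin_bswap32(x);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  // swap nibbles
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  // swap bit pairs
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  // swap single bits
    return x;
}
```

On targets with a dedicated bit-reverse instruction, a real builtin would still be preferable to this sequence, which is the point of the request.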

[Bug middle-end/12849] testing divisibility by constant

2018-11-08 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12849

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #5 from krux  ---
Should be fixed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82853
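
For context, a sketch of the usual way to test divisibility by 3 without a division or modulo on 32-bit unsigned values. This is the well-known multiplicative-inverse technique; whether the change in bug 82853 emits exactly this sequence is not stated here, and the name divisible_by_3 is hypothetical.

```cpp
#include <cstdint>

// 0xAAAAAAAB is the inverse of 3 modulo 2^32. Multiplying by it maps the
// multiples of 3 onto 0 .. 0x55555555 and everything else above that range.
constexpr bool divisible_by_3(std::uint32_t x)
{
    return static_cast<std::uint32_t>(x * 0xAAAAAAABu) <= 0x55555555u;
}

static_assert(divisible_by_3(0), "0 is a multiple of 3");
static_assert(divisible_by_3(9), "9 is a multiple of 3");
static_assert(!divisible_by_3(10), "10 is not a multiple of 3");
```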

[Bug c++/87656] Useful flags to enable with -Wall or -Wextra

2018-11-13 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87656

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #3 from krux  ---
-Wshadow, at least the local variant, would indeed be really nice in -Wall or
at least -Wextra. The global one is still too noisy because of class
constructors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78147
I always use -Wall -W -Wshadow -Wconversion -Wsign-conversion.

[Bug c++/45615] -Wshadow doesn't report class member shadowing

2018-11-13 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45615

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #2 from krux  ---
Confirmed on trunk: https://godbolt.org/z/jL0ony

[Bug c++/87656] Useful flags to enable with -Wall or -Wextra

2018-11-13 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87656

--- Comment #5 from krux  ---
I meant -Wshadow=local.
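
A minimal sketch of the kind of bug the local variant is meant to catch. The function sum_to is hypothetical; compiled with g++ -Wshadow=local, the inner declaration should be reported as shadowing the outer local.

```cpp
// Intended to sum 0 .. n-1, but the inner declaration hides the accumulator.
int sum_to(int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
    {
        int total = i;  // shadows the outer local 'total'
        (void)total;    // silence the unused-variable warning
    }
    return total;       // still 0: the loop only touched the inner variable
}
```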

[Bug target/88013] New: can't vectorize rgb to grayscale conversion code

2018-11-13 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

Bug ID: 88013
   Summary: can't vectorize rgb to grayscale conversion code
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

#include <stdint.h>

void reference_convert(uint8_t * __restrict dest, uint8_t * __restrict src, int
n)
{
  for (int i=0; ihttps://godbolt.org/z/FPG3k_

[Bug target/88013] can't vectorize rgb to grayscale conversion code

2018-11-13 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

--- Comment #1 from krux  ---
Something like -march=armv8-a -mfpu=neon-fp-armv8 does not work either.
https://godbolt.org/z/MpBQ0I

[Bug target/88013] can't vectorize rgb to grayscale conversion code

2018-11-14 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

--- Comment #3 from krux  ---
A few NEON instructions are sufficient:
https://web.archive.org/web/20170227190422/http://hilbert-space.de/?p=22

clang seems to generate similar code, see the godbolt links.

[Bug target/88013] can't vectorize rgb to grayscale conversion code

2018-11-14 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

--- Comment #4 from krux  ---
On x64 indeed both compilers generate a huge amount of code.
https://godbolt.org/z/TH7mqn

[Bug target/88013] can't vectorize rgb to grayscale conversion code

2018-11-14 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

--- Comment #6 from krux  ---
-mfloat-abi=hard was missing indeed. It's a pity there's no warning like when
trying to use the intrinsics.

Still I see a lot more instructions, maybe that got fixed after v7.2?
https://godbolt.org/z/OWzgXi

  vld3.8 {d16, d18, d20}, [r3]
  add ip, r3, #24
  add lr, lr, #1
  add r3, r3, #48
  cmp lr, r5
  vld3.8 {d17, d19, d21}, [ip]
  vmovl.u8 q5, d16
  vmovl.u8 q15, d18
  vmovl.u8 q11, d17
  vmovl.u8 q4, d19
  vmovl.u8 q0, d20
  vmovl.u8 q1, d21
  vmull.s16 q6, d10, d28
  vmull.s16 q3, d22, d28
  vmull.s16 q2, d30, d26
  vmull.s16 q11, d23, d29
  vmull.s16 q15, d31, d27
  vmull.s16 q5, d11, d29
  vmull.s16 q9, d8, d26
  vmull.s16 q8, d9, d27
  vadd.i32 q2, q6, q2
  vadd.i32 q10, q5, q15
  vadd.i32 q9, q3, q9
  vmull.s16 q15, d0, d24
  vadd.i32 q8, q11, q8
  vmull.s16 q3, d2, d24
  vmull.s16 q0, d1, d25
  vmull.s16 q1, d3, d25
  vadd.i32 q11, q2, q15
  vadd.i32 q9, q9, q3
  vadd.i32 q10, q10, q0
  vadd.i32 q8, q8, q1
  vshr.s32 q11, q11, #8
  vshr.s32 q9, q9, #8
  vshr.s32 q10, q10, #8
  vshr.s32 q8, q8, #8
  vmovn.i32 d30, q11
  vmovn.i32 d31, q10
  vmovn.i32 d20, q9
  vmovn.i32 d21, q8
  vmovn.i16 d16, q15
  vmovn.i16 d17, q10
  vst1.8 {q8}, [r4]

[Bug tree-optimization/88440] New: size optimization of memcpy-like code

2018-12-10 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440

Bug ID: 88440
   Summary: size optimization of memcpy-like code
   Product: gcc
   Version: 8.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/RTji7B

void foo(char* restrict dst, const char* buf) {
for (int i=0; i<8; ++i)
*dst++ = *buf++;
}

$ gcc -Os
$ gcc -O2
.L2:
mov dl, BYTE PTR [rsi+rax]
mov BYTE PTR [rdi+rax], dl
inc rax
cmp rax, 8
jne .L2

$ gcc -O3
mov rax, QWORD PTR [rsi]
mov QWORD PTR [rdi], rax

$ arm-none-eabi-gcc -O3 -mthumb -mcpu=cortex-m4
ldr r3, [r1]  @ unaligned
ldr r2, [r1, #4]  @ unaligned
str r2, [r0, #4]  @ unaligned
str r3, [r0]  @ unaligned

The -O3 code is both faster and smaller for both ARM and x64:
"note: Loop 1 distributed: split to 0 loops and 1 library calls."

Should be considered for -O2 and -Os as well.
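
As a possible workaround until then, the copy can be spelled as a fixed-size memcpy. This is a sketch with a hypothetical name (foo_memcpy); compilers commonly expand small constant-size memcpy calls inline, but the -Os and -O2 output for this form was not checked here.

```cpp
#include <cstring>

// Same effect as foo() in the report: copy exactly 8 bytes from buf to dst.
void foo_memcpy(char* __restrict dst, const char* buf)
{
    std::memcpy(dst, buf, 8);
}
```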

[Bug tree-optimization/88440] size optimization of memcpy-like code

2018-12-11 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440

--- Comment #3 from krux  ---
Adding -ftree-loop-distribute-patterns to -Os does not seem to make a
difference though.

[Bug c++/38658] trivial try/catch statement not eliminated

2018-12-12 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38658

krux  changed:

   What|Removed |Added

 CC||hoganmeier at gmail dot com

--- Comment #5 from krux  ---
https://godbolt.org/z/rnDy8l

[Bug debug/88534] New: internal compiler error: in tree_add_const_value_attribute, at dwarf2out.c:20246

2018-12-17 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88534

Bug ID: 88534
   Summary: internal compiler error: in
tree_add_const_value_attribute, at dwarf2out.c:20246
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

#include 
#include 
#include 

template <typename CharT, size_t N> class basic_fixed_string
{
CharT content[N];
public: 
using char_type = CharT;

template <size_t... I> constexpr basic_fixed_string(const CharT
(&input)[N], std::index_sequence<I...>) noexcept: content{input[I]...} { }

constexpr basic_fixed_string(const CharT (&input)[N]) noexcept:
basic_fixed_string(input, std::make_index_sequence<N>()) { }

constexpr size_t size() const noexcept
{
// string literals are zero terminated
if (content[N-1] == '\0')
return N - 1;
else return N;
}
constexpr CharT operator[](size_t i) const noexcept
{
return content[i];
}
constexpr const CharT * begin() const noexcept
{
return content;
}
constexpr const CharT * end() const noexcept
{
return content + size();
}
};

template <typename CharT, size_t N> basic_fixed_string(const CharT (&)[N]) ->
basic_fixed_string<CharT, N>;

template 
struct F
{
};

auto foo()
{
F<"test"> f;
}


# g++ -O3 -std=c++2a -g -S
:46:1: internal compiler error: in tree_add_const_value_attribute, at
dwarf2out.c:20246

[Bug debug/88534] internal compiler error: in tree_add_const_value_attribute, at dwarf2out.c:20246

2018-12-17 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88534

krux  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code

--- Comment #1 from krux  ---
https://godbolt.org/z/G-9Zqh

[Bug debug/88534] internal compiler error: in tree_add_const_value_attribute, at dwarf2out.c:20246

2018-12-17 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88534

--- Comment #2 from krux  ---
The code is based on
https://github.com/hanickadot/compile-time-regular-expressions/blob/master/include/ctll/fixed_string.hpp

[Bug c/88566] New: -Wconversion not using value range information

2018-12-20 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88566

Bug ID: 88566
   Summary: -Wconversion not using value range information
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hoganmeier at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/p0RMde

unsigned char foo(uint8_t pin)
{
if (pin >= 3 && pin <= 6) return pin - 2;
if (pin >= 9 && pin <= 10) return pin - 4;
if (pin >= 20 && pin <= 23) return pin - 13;
return 0;
}

$ gcc -O3 -Wconversion -S
:5:39: warning: conversion from 'int' to 'uint8_t' {aka 'unsigned char'} may change value [-Wconversion]
5 |  if (pin >= 3 && pin <= 6) return pin - 2;
  |   ^~~

gcc should be aware that the value is well within the uint8_t range.

[Bug c/88566] -Wconversion not using value range information

2018-12-20 Thread hoganmeier at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88566

--- Comment #1 from krux  ---
Even simpler example:
uint8_t foo(uint8_t pin)
{
return pin > 0 ? pin - 1 : 0;
}
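
Until the warning takes value ranges into account, the usual way to silence it is an explicit cast. A sketch with a hypothetical name (foo_cast); the cast is safe here only because the preceding check guarantees pin - 1 fits in uint8_t.

```cpp
#include <cstdint>

// Same result as the example above, with the narrowing made explicit.
std::uint8_t foo_cast(std::uint8_t pin)
{
    if (pin == 0)
        return 0;
    // pin is in 1..255 here, so pin - 1 is in 0..254 and fits in uint8_t.
    return static_cast<std::uint8_t>(pin - 1);
}
```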
