Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Target Milestone: ---
Compile and run the following code
#include
#define __align(n) __attribute__((aligned(n)))
__attribute__((aligned(32))) static struct {
unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71264
--- Comment #17 from Bingfeng Mei ---
OK, I will skip the vectorization check on our port then. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71264
Bingfeng Mei changed:
What|Removed |Added
CC||bmei at broadcom dot com
--- Comment #15
: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Target Milestone: ---
For the following example:
#include
static int a, b;
static void bar()
{
asm volatile ("" : : : "memory");
}
void foo ()
{
a = 0;
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Target Milestone: ---
#include
static int
clamp (int x, int lo, int hi)
{
return (x < lo) ? lo : ((x > hi) ? hi : x);
}
__attribute__((noinline))
short
foo (int N)
{
short value =
clamp (N,
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Compile the following code with gcc 5.0 (
Target: x86_64-unknown-linux-gnu gcc version 5.0.0 20150226 (experimental)
[trunk revision 143368] (GCC))
~/scratch/install
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61868
Bingfeng Mei bmei at broadcom dot com changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61868
Bingfeng Mei bmei at broadcom dot com changed:
What|Removed |Added
Component|driver |lto
--- Comment
: driver
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Compile any simple file with -frandom-seed and -flto option.
#include <stdio.h>
extern int foo (int);
int bar (int a)
{
return a * 5;
}
int main ()
{
printf("%d\n", foo (100));
return 0
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
typedef struct
{
short real;
short imag;
} complex16_t;
void
libvector_AccSquareNorm_ref (unsigned long long *acc
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651
--- Comment #5 from Bingfeng Mei bmei at broadcom dot com ---
Created attachment 31559
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31559&action=edit
initial patch
Hi, Tejas, vect_create_cond_for_alias_checks contains a bug in handling
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651
--- Comment #1 from Bingfeng Mei bmei at broadcom dot com ---
That is interesting. On x86-64, GCC does say it cannot determine the distance
vector between a[3] and a[b] and needs a run-time aliasing test. In the end it
gives up due to too few iterations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651
--- Comment #3 from Bingfeng Mei bmei at broadcom dot com ---
I can reproduce on aarch64. Still trying to understand why. I constructed a
similar test but with a positive loop step.
extern void abort (void);
int a[] = { 6, 0, 0, 0 };
int b;
int
main
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 59544, which changed state.
Bug 59544 Summary: Vectorizing store with negative step
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59544
What|Removed |Added
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59544
Bingfeng Mei bmei at broadcom dot com changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569
--- Comment #8 from Bingfeng Mei bmei at broadcom dot com ---
Sorry for the regression. The assertion triggers when storing a constant value
with a negative step. Permuting a constant is not the best optimization
here. So the easy way to fix
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59569
--- Comment #9 from Bingfeng Mei bmei at broadcom dot com ---
It seems the simple patch is to just bypass the permutation for a constant
operand, as vec_oprnd is a constant vector with identical elements.
Index: tree-vect-stmts.c
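
As a rough illustration of the case described in comments #8 and #9, the C sketch below stores one constant while walking an array backwards. It is not the test case from the bug report; the function name fill_reverse and the value 7 are made up for this example.

```c
#include <stddef.h>

/* Hypothetical example, not the PR 59569 test case: store the same
   constant into every element, iterating from the last index down to
   the first, so the store has a negative step.  */
void
fill_reverse (int *dst, size_t n)
{
  for (size_t i = n; i > 0; i--)
    dst[i - 1] = 7;
}
```

Every lane of a vector built from the constant holds the same value, so reversing the lane order would give back the same vector. That is why skipping the permutation is safe for this kind of operand.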
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Created attachment 31467
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=31467&action=edit
The patch against r206016
I was looking at some loops that can be vectorized by LLVM, but not GCC. One
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59249
--- Comment #4 from Bingfeng Mei bmei at broadcom dot com ---
Even if I split one critical predecessor edge, the predicate of BB6 is still
the ORed result of two conditions from BB4 and BB5. ORing two conditions
results in a sequence of statements that cannot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59249
--- Comment #3 from Bingfeng Mei bmei at broadcom dot com ---
Richard, I am not sure I understand how to split the edge.
      BB4
     /   \
    /     \
  BB5      |
   |\      |
   | \     |
   |  \    |
   |    BB6
   |   /
   |  /
   BB7
Compiler
: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
I am investigating loops that can be vectorized by LLVM but not by GCC.
One example is a loop that contains more than one if-else
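
For illustration only, a loop with more than one if-else in its body might look like the C sketch below. The report's actual loop is cut off above, so the function classify and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical example, not the loop from the report: each iteration
   picks one of three values through a chain of two if-else tests.  */
void
classify (const int *in, int *out, size_t n, int lo, int hi)
{
  for (size_t i = 0; i < n; i++)
    {
      if (in[i] < lo)
        out[i] = -1;
      else if (in[i] > hi)
        out[i] = 1;
      else
        out[i] = 0;
    }
}
```

To vectorize such a loop, the branches have to be turned into straight-line selects first, which is the if-conversion step the report is concerned with.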
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bmei at broadcom dot com
Created attachment 30249
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30249&action=edit
Unvectorized with signed char type.
GCC (I used 4.7.2 x86
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57512
--- Comment #1 from Bingfeng Mei bmei at broadcom dot com ---
Created attachment 30250
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30250&action=edit
Vectorized assembly code with unsigned char type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #7 from Bingfeng Mei bmei at broadcom dot com 2011-12-15 10:18:06
UTC ---
Yes, the patch fixes the bug. Thanks.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49157
Summary: Unnecessary stack save/restore code generated for a
leaf function (arm-elf-gcc)
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45416
--- Comment #8 from Bingfeng Mei bmei at broadcom dot com 2011-04-28 15:22:26
UTC ---
I am currently on vacation until 4/5/2011. I may access my mail irregularly.
Cheers,
Bingfeng Mei
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #5 from Bingfeng Mei bmei at broadcom dot com 2011-01-13 15:49:23
UTC ---
It works. But I have no idea about the debug info issue in your other comment.
(In reply to comment #2)
After trying the patches one by one, I believe
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
Summary: Extra instruction generated in 4.5.2
Product: gcc
Version: 4.5.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #1 from Bingfeng Mei bmei at broadcom dot com 2011-01-11 13:38:13
UTC ---
Created attachment 22944
-- http://gcc.gnu.org/bugzilla/attachment.cgi?id=22944
Preprocessed test case
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47258
--- Comment #2 from Bingfeng Mei bmei at broadcom dot com 2011-01-11 16:16:28
UTC ---
After trying the patches one by one, I believe the misoptimization is down to
the following patch.
Index: tree-ssa-copyrename.c
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45834
Bingfeng Mei bmei at broadcom dot com changed:
What|Removed |Added
CC||richard.guenther
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45834
--- Comment #3 from Bingfeng Mei bmei at broadcom dot com 2010-10-18 12:16:59
UTC ---
I think the standard specifies that a char * may alias any object; that's why
QImode is different here. But I am not sure whether a
restrict
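
The aliasing rule mentioned in the first sentence of this comment can be seen in a small C sketch. The function zero_through_char is made up for illustration, and the sketch does not address the restrict question.

```c
#include <stddef.h>

/* Hypothetical example: a char pointer may legally access the bytes of
   any object, so the compiler has to assume the byte stores below can
   change *ip and must reload it before returning.  */
int
zero_through_char (int *ip)
{
  char *cp = (char *) ip;

  *ip = 1;
  for (size_t i = 0; i < sizeof *ip; i++)
    cp[i] = 0;          /* overwrites every byte of *ip */
  return *ip;           /* reads the zeroed object */
}
```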
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45834
--- Comment #5 from Bingfeng Mei bmei at broadcom dot com 2010-10-18 13:53:37
UTC ---
Sure, but we have other means of dealing with that (MEM_ALIAS_SET == 0).
Do you mean this check is redundant here? I dug out the ancient code (from
1997
: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC host triplet: x86_64-unknown-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45416
--- Comment #2 from bmei at broadcom dot com 2010-08-26 12:47 ---
Sorry, I first observed this on our target. Then I tried to reproduce on x86,
but I forgot to turn on optimization flags. It does work for x86. Please delete
this report. I will figure out what happens with my target
--- Comment #3 from bmei at broadcom dot com 2010-08-26 12:55 ---
I found I can reproduce the bug with ARM
ARM trunk -Os:
foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r2
--- Comment #5 from bmei at broadcom dot com 2010-08-05 13:44 ---
I tried to apply the patches Richard suggested (this one alone is not enough).
It becomes a chain of too many patches in the end. I am not confident any more
about applying them to 4.5.
--
http://gcc.gnu.org/bugzilla
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45176
dot org
ReportedBy: bmei at broadcom dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44365
--- Comment #8 from bmei at broadcom dot com 2010-05-24 09:31 ---
I integrated Dave's patch into LD with some modification (only emit those with
LTO sections) and hacked collect2 to support that. The size gain of LTO, our
main concern, is quite limited for our application. Large amount
--- Comment #10 from bmei at broadcom dot com 2010-05-24 13:29 ---
Annotating functions with externally_visible sounds a bit difficult to
maintain. The programmer needs to know whether a function is used outside of
LTO objects. This can change over time, and extra effort is needed to keep
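
For reference, the annotation under discussion looks like this in C. The function api_entry is hypothetical. The attribute itself is GCC's externally_visible, which keeps a symbol visible outside the current unit even when the compiler is otherwise allowed to treat the program as a whole.

```c
/* Hypothetical entry point that code outside the LTO objects is
   assumed to call; the attribute asks GCC not to localize or drop it.  */
__attribute__ ((externally_visible))
int
api_entry (int x)
{
  return x * 2;
}
```

Each function called from non-LTO code would need the same marking, which is the maintenance burden the comment describes.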
--- Comment #6 from bmei at broadcom dot com 2010-05-04 16:54 ---
So this is a rough first draft of the-kind-of-thing-i-was-thinking-of. We get
collect2 to run a dummy link early, and extract the output from the
--lto-assist flag to get a list of archive members that we need lto
--- Comment #12 from bmei at broadcom dot com 2010-03-09 14:20 ---
It seems that this bug still fails on my build:
~/work/install-x86/bin/gcc
/projects/firepath/tools/work/bmei/gcc-head/src/gcc/testsuite/gcc.dg/pr34668-1.c
--combine -O2
/projects/firepath/tools/work/bmei/gcc-head/src
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43220
at broadcom dot com
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43098
--- Comment #6 from bmei at broadcom dot com 2009-05-21 08:38 ---
I have only submitted small patches before. Adding a pass (which may need a new
command-line option and disabling the old rtl-level unrolling) seems to be a
big step to me. I don't know what the procedure is.
My code also contains my own
--- Comment #4 from bmei at broadcom dot com 2009-05-20 14:17 ---
I implemented a tree-level loop-unrolling pass in our private port, which
takes advantage of the later tree ivopt pass. It produces much better code than
rtl-level loop unrolling in such scenarios. Not sure whether
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bmei at broadcom dot com
GCC target triplet: arm-elf-gcc
http://gcc.gnu.org/bugzilla
48 matches