t subscribed as sell...@marvell.com
but the unsubscribe still failed. Has anyone else had this issue or have
any idea on what is going on?
Steve Ellcey
Ah, I see. I was hoping that using --with-advance-toolchain would give
me a way to build a toolchain without needing any local/non-standard
patches.
Steve Ellcey
sell...@marvell.com
On Thu, 2019-10-10 at 18:41 +0200, Florian Weimer wrote:
>
> * Steve Ellcey:
>
> > I would like these used by default so I took some ideas from
> > --with-advance-toolchain and used that to automatically add these options
> > to LINK_SPEC (see attached patch). I can
On Thu, 2019-10-10 at 10:49 +1030, Alan Modra wrote:
> On Wed, Oct 09, 2019 at 10:29:48PM +0000, Steve Ellcey wrote:
> > I have a question about building a toolchain that uses (at run
> > time) a
> > dynamic linker and system libraries and headers that are in a non-
> >
er (at run time) that are in a non-standard location without needing
to compile or link with special flags.
Steve Ellcey
sell...@marvell.com
Here is the patch I am trying. I use the --with-advance-toolchain option as
an absolute pathname instead of relative to /opt like IBM does, and I set it
to
ps,
>
> Bill
Ah, of course, thank you. I verified that this fixes my mcf failure;
gcc is still running. I already had -fno-strict-aliasing for
perlbench; I should have figured out that it could be affecting other
tests too.
Steve Ellcey
sell...@marvell.com
ns.
Has anyone else seen these failures?
Steve Ellcey
sell...@marvell.com
m ./boost/intrusive/list.hpp:20,
from ./boost/fiber/context.hpp:29,
from libs/fiber/src/algo/algorithm.cpp:9:
Has anyone else run into this? I will try to create a cutdown test
case.
Steve Ellcey
sell...@marvell.com
Before I submit a Bugzilla report or try to cut down a test case, has
anyone seen this problem when compiling the 526.blender_r benchmark from
SPEC 2017:
Compiling with '-Ofast -flto -march=native -fprofile-generate' on Aarch64:
during GIMPLE pass: vect
blender/source/blender/imbuf/intern/inde
still got a segfault). I did update my sources
though and the bug does not happen at ToT so it looks like Martin's
patch did fix my bug.
Steve Ellcey
sell...@marvell.com
I am not sure why I am only running into this with one particular
application on my Aarch64 platform. I am building it with -fopenmp,
which could have something to do with it (though there are no simd functions in
the application). The application is not that large as C++ programs go.
St
'. I see that
'-Ofast -ipo' resulted in everything (except libc functions) getting
inlined into the main program when using ICC. GCC did not do that, but
if I forced it to by using the always_inline attribute, GCC could
inline everything into main the way ICC does. But that did not speed
up the GCC executable.
Steve Ellcey
sell...@marvell.com
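As a rough illustration of the forced inlining mentioned above, this is how GCC's always_inline attribute is attached to a function definition. The function names (scale, sum_scaled) are hypothetical and are not taken from any benchmark source; this is only a sketch of the attribute's syntax, not the actual change that was tried.

```c
#include <stddef.h>

/* Hypothetical helper: always_inline asks GCC to inline it at every
   call site, even where its own heuristics would decline.  */
static inline __attribute__ ((always_inline)) double
scale (double x, double factor)
{
  return x * factor;
}

/* Hypothetical caller; with the attribute above, the call to scale
   is expanded in place rather than emitted as a call.  */
double
sum_scaled (const double *v, size_t n, double factor)
{
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += scale (v[i], factor);
  return total;
}
```

Whether this helps performance depends on the code; as noted above, forcing the inlining did not speed up the GCC executable in this case.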
.deepsjeng_r, and
548.exchange2_r, but none are as dramatic as 519.lbm_r. Anyone have
any idea on what ICC is doing that GCC is missing? Is GCC just not
aggressive enough with its inlining?
Steve Ellcey
sell...@marvell.com
I have a question about PR87763. These are aarch64-specific tests
that are failing after r265398 (combine: Do not combine moves from hard
registers).
These tests are all failing when the assembler scan looks for
specific instructions and these instructions are no longer being
generated. In some c
', it matches and the test passes.
Is this intentional? It seems like if we wanted to check that it was
not tiled we should grep for 'not tiled', not just 'tiled'. If we
want grep to see that it is tiled, then the check for tiling happening
is wrong.
Steve Ellcey
sell...@marvell.com
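The concern can be sketched in plain C with a substring search: a search for "tiled" also succeeds on a line that says "not tiled". The helper name line_matches and the sample lines are made up for illustration; the real test harness does its own matching, and this only mimics an unanchored search.

```c
#include <string.h>

/* Returns nonzero if PATTERN occurs anywhere in LINE, which is how a
   plain (unanchored) search behaves.  */
int
line_matches (const char *line, const char *pattern)
{
  return strstr (line, pattern) != NULL;
}
```

Both line_matches ("loop nest not tiled", "tiled") and line_matches ("loop nest tiled", "tiled") are nonzero, so that pattern alone cannot tell the two outcomes apart.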
On Mon, 2018-11-26 at 22:47 +0100, Andreas Schwab wrote:
> On Nov 26 2018, Steve Ellcey wrote:
>
> > I looked through the patches for the last couple of weeks to see if
> > I could identify
> > what changed here but I haven't found anyth
final GCC then it works.
I looked through the patches for the last couple of weeks to see if I
could identify what changed here but I haven't found anything. Maybe it
was something in glibc that changed.
Steve Ellcey
sell...@cavium.com
On Wed, 2018-11-07 at 17:39 +, Joseph Myers wrote:
> On Wed, 7 Nov 2018, Steve Ellcey wrote:
>
> >
> > I have a question about the C++ library testsuite. I built and
> > installed
> > a complete toolchain with GCC, binutils, and glibc
)
If I rerun by hand and add the --rpath, etc. flags the test works but I
am not sure why the test harness did not add them itself.
Steve Ellcey
sell...@cavium.com
On Wed, 2018-11-07 at 00:16 +0700, Arseny Solokha wrote:
>
> This is probably PR87889, already fixed on trunk.
Yup, that was the problem. I have updated my sources and things are
building now. Thanks for the info.
Steve Ellcey
I was doing some benchmarking with SPEC 2017 fprate on aarch64
(Thunderx2) and I am getting some segfaults from GCC while compiling.
I am working with delta to try and cut down one of the test cases
but I was wondering if anyone else has seen this problem. The
three tests that segfault while comp
h_calls (j, last_call_used_reg_set);
}
Steve Ellcey
sell...@cavium.com
y indexing.
Steve Ellcey
sell...@cavium.com
/* define N as 1000 - gets vectorized */
/* define N as 1 - gets vectorized */
/* define N as 10 - does not get vectorized */
#define N 10
typedef unsigned int TYPE;
void f(int *C, int *A, int val)
{
  TYPE i,j;
tarted showing up on May 20th and I don't see any
bugzilla report on them. Before I try and track down what checkin caused
them and whether or not they were caused by the same checkin I thought I
would see if anyone had already done that.
Steve Ellcey
sell...@cavium.com
't find any mention of it in the gcc or libstdc++ mailing lists
when I looked or find any bugzilla report.
Steve Ellcey
r of bugzilla reports with examples where GCC
does not vectorize a loop. I wonder if this example is related to PR
61247.
Steve Ellcey
get_low_f64. With that change I get the
code I want/expect. I hadn't seen the __GETLOW macro in the neon
header file.
Steve Ellcey
erence,
around 20%. 521.wrf_r was more than twice as slow when compiled with
GCC instead of ICC and 503.bwaves_r and 510.parest_r also showed
significant slowdowns when compiled with GCC vs. ICC.
Steve Ellcey
sell...@cavium.com
"(x) : /* No clobbers */);
return result;
}
But a builtin would be cleaner.
Steve Ellcey
sell...@cavium.com
't
it? Or is zero special?
Steve Ellcey
sell...@cavium.com
#include <stdio.h>
#include <float.h>
#include <math.h>
int main()
{
  double x;
  x = 0.0;
  printf("%e %e %e\n", x, DBL_MIN, DBL_MAX);
  printf("normal is %s\n", __builtin_isnormal(x) ? "TRUE" : "FALSE");
  return 0;
}
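The result for x = 0.0 in a test like the one above follows from how C99 classifies values: a value is "normal" only if it is neither zero, subnormal, infinite, nor NaN, so zero falls in its own class. A minimal sketch, assuming a C99 <math.h>; the function names classify and is_normal are invented for this example.

```c
#include <math.h>

/* Returns a short label for the floating-point class of X, using the
   standard C99 fpclassify macro.  */
const char *
classify (double x)
{
  switch (fpclassify (x))
    {
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "subnormal";
    case FP_NORMAL:    return "normal";
    case FP_INFINITE:  return "infinite";
    default:           return "nan";
    }
}

/* Wrapper so the GCC builtin can be called like an ordinary function.  */
int
is_normal (double x)
{
  return __builtin_isnormal (x) != 0;
}
```

With this, is_normal (0.0) is 0 while is_normal (1.0) is 1, and classify (0.0) reports "zero" rather than "normal".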
I was curious if there was any reason that REG_ALLOC_ORDER is not
defined for Aarch64. Has anyone tried this to see if it could help
performance? It is defined for many other platforms.
Steve Ellcey
sell...@cavium.com
d only save the
lower half).
Does this sound like something that could be used in place of your
CLOBBER_HIGH patch?
Steve Ellcey
sell...@cavium.com
On Wed, 2018-05-16 at 17:30 +0100, Richard Earnshaw (lists) wrote:
> On 16/05/18 17:21, Steve Ellcey wrote:
> >
> > It doesn't look like GCC has any existing mechanism for having different
> > sets of caller saved/callee saved registers depending on the function
> >
>
> Kind regards,
>
> Francesco
>
> [1] https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc/vector-function-abi
>
> >
> > Steve Ellcey
> > sell...@cavium.com
Thanks for publishing this Francesco, it looks like the main issue
[SIZE], y[SIZE];
void doit(void) { for (int i = 0; i < SIZE; i++) x[i] = sin(y[i]) + cos(y[i]); }
Which generated a sincos call, but also did not vectorize it.
Is there any way to get GCC to vectorize a loop with sincos in it?
Steve Ellcey
sell...@cavium.com
On Thu, 2018-03-22 at 11:42 -0700, H.J. Lu wrote:
> On Thu, Mar 22, 2018 at 11:08 AM, Steve Ellcey
> wrote:
> >
> > I have a question about the math vector library routines in
> > libmvec.
> > If I compile a program on x86 with -Ofast, something like:
> >
lls (without turning off -Ofast)?
Steve Ellcey
sell...@cavium.com
me that is in libmvec? Or should I put
'_ZGVbN2v_sin' in libmvec and have libgomp be dependent on libmvec? Do
I need a -mveclibabi flag for GCC if there is only one vector ABI for
Aarch64? I might still want to control whether vector functions are
called while vectorizing a loop.
Steve Ellcey
sell...@cavium.com
_mode = TYPE_MODE (TREE_TYPE (type_in));
  if (el_mode != in_mode || el_mode != DFmode)
    return NULL_TREE;
  if (!TYPE_VECTOR_SUBPARTS (type_out).is_constant (&n)
      || !TYPE_VECTOR_SUBPARTS (type_in).is_constant (&in_n))
    return NULL_TREE;
  if (n != in_n || n != 2)
    return NULL_TREE;
Steve Ellcey
sell...@cavium.com
code are you copying over?
>
> Thanks,
> Richard
OK, I found the is_constant member function and used that. I was
looking at the i386 code that generates calls to libmvec. Someone
here wrote vector sin/cos functions for V2DF and I want to test them
out to see if they would work with GCC/lib
arch64-builtins.c
and it still does not compile. It works on the i386 side. It looks
like poly-int.h and poly-int-types.h are included by coretypes.h
and I include that header file so I don't understand why this isn't
compiling and what I am missing. Any help?
Steve Ellcey
sell...@cavium.com
for -frounding-math.
Steve Ellcey
sell...@cavium.com
tes the
pre-compiled header that I need for testing.
Is it expected that GCC changes from creating a pch to creating an executable
when it sees -Wl flags? Is there a flag that we can use to explicitly tell GCC
that we want to create a precompiled header in this instance?
Steve Ellcey
sell...@cavium.com
know that
char_var is stored in a register whose upper bits have already been
zeroed out somehow. In my test case the only way to know that is to
know that the load byte instruction zeroed them out.
Steve Ellcey
sell...@cavium.com
tps://gcc.gnu.org/ml/gcc-patches/2017-09/msg00929.html
Steve Ellcey
sell...@cavium.com
ode_for_size (INTVAL (op1), MODE_INT, 0).require (); /* This did not work */
Steve Ellcey
sell...@cavium.com
1.c
So I guess there are a number of questions: Are these tests worth running?
Do they make sense with -O3 and/or -O2 -flto? If they make sense and
should be run, do we need to fix GCC to clean up the failures? Or should
we continue to just ignore them?
Steve Ellcey
sell...@cavium.com
ed in the IFUNC
resolvers instead of checking the libat_have_strexbhd variable.
Steve Ellcey
sell...@cavium.com
On Tue, 2017-06-06 at 07:50 +0200, Florian Weimer wrote:
> * Steve Ellcey:
>
> >
> > I have a question about the use of IFUNCs in libatomic. I was
> > looking at the
> > arm implementation and in gcc/libatomic/config/linux/arm/host-
> > conf
alled
when libatomic is first loaded since it is a constructor but it doesn't
seem to do anything and it isn't going to set libat_have_strexbhd as far
as I can see.
Steve Ellcey
sell...@cavium.com
ear you
are looking into that. You are obviously more knowledgeable about the
GCC loop infrastructure than I am so I look forward to what you come up
with.
Steve Ellcey
sell...@cavium.com
On Sat, 2017-05-13 at 08:18 +0200, Richard Biener wrote:
> On May 12, 2017 10:42:34 PM GMT+02:00, Steve Ellcey wrote:
> >
> > (Short version of this email, is there a way to recalculate/rebuild
> > virtual
> > phi nodes after modifying the CFG.)
> >
>
'update the virtual phi
nodes' function. The non-virtual PHI nodes seem to be OK, it is just
the virtual ones that seem wrong after I duplicate the loop into two
consecutive loops.
Steve Ellcey
sell...@cavium.com
st version would have
some performance advantage since dump_enabled_p is an inlined function,
but is that enough of a reason to do it? The second version seems like
it would look cleaner in the code where we are making these calls.
Steve Ellcey
sell...@cavium.com
on wrong, the implementation wrong, or my understanding
of what the documentation is saying wrong?
Steve Ellcey
sell...@cavium.com
or this
loop? Do I need to look at the loop header and latch and see what the
header sets and what the latch checks to identify the variable?
Steve Ellcey
sell...@cavium.com
code is buggy even if it works in one case.
>
> Richard.
Should this work if I use -fno-strict-aliasing? Even with that option I
get different code with a zero-sized array vs. a flexible array.
I have a patch to get_ref_base_and_extent that changes the behaviour
for zero-length arrays and I will submit it after I have tested it.
Steve Ellcey
sell...@cavium.com
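The two kinds of declaration being compared can be sketched as follows. This is not the test case from the thread; the struct and function names are hypothetical, and the FLEX macro is used here only as an assumed switch between a C99 flexible array member and a GNU C zero-length array.

```c
#include <stdlib.h>

#ifdef FLEX
/* C99 flexible array member.  */
struct vec { int len; int data[]; };
#else
/* GNU C zero-length array extension.  */
struct vec { int len; int data[0]; };
#endif

/* Allocates a vec with room for LEN ints; returns NULL on failure.  */
struct vec *
vec_new (int len)
{
  struct vec *v = malloc (sizeof *v + (size_t) len * sizeof (int));
  if (v != NULL)
    {
      v->len = len;
      for (int i = 0; i < len; i++)
        v->data[i] = i;
    }
  return v;
}

/* Sums the trailing array.  */
int
vec_sum (const struct vec *v)
{
  int total = 0;
  for (int i = 0; i < v->len; i++)
    total += v->data[i];
  return total;
}
```

Building this once with -DFLEX and once with -UFLEX and comparing the assembly is one way to look for the kind of code difference described above.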
not sure if it should be different and, if the difference is OK,
should that affect how get_ref_base_and_extent behaves, as it apparently
does.
Steve Ellcey
sell...@cavium.com
Test case, compiling with '-O2 -DFLEX' generates different code than
'-O2 -UFLEX' on aarch64 using ToT
get anywhere
Steve Ellcey
sell...@cavium.com
On Tue, 2017-03-07 at 14:45 +0100, Michael Matz wrote:
> Hi Steve,
>
> On Mon, 6 Mar 2017, Steve Ellcey wrote:
>
> >
> > I was looking at the spec 456.hmmer benchmark and this email string
> > from Jeff Law and Michael Matz:
> >
> > https://gcc.gn
performance
win to be had here if it can be done but the alias checking needed
seems rather extensive.
Steve Ellcey
sell...@cavium.com
void operator= ( bool bit);
operator bool() const;
};
GCC 5.4 breaks up the operator declarations with line markers and GCC 6.2
does not.
Steve Ellcey
sell...@caviumnetworks.com
GCC to build GLIBC. Once I rebuilt GCC with threads
I could build GLIBC and not get this error.
Steve Ellcey
but it doesn't look like the second compiler is
ever run to compile anything. I am using the multi-sim dejagnu board.
Steve Ellcey
sell...@imgtec.com
ch what happens
with:
int a[256];
int main()
{
  int *p = (int *)((char *)a + 1);
  int *q = (int *)((char *)a + 5);
  *p = *q;
  return 0;
}
When I optimize it, GCC does unaligned accesses and when unoptimized
GCC does aligned accesses which will not work on MIPS.
Steve Ellcey
sell...@imgtec.com
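Dereferencing a misaligned int pointer as in the program above is undefined behaviour in ISO C, so the compiler is free to generate different access sequences at different optimization levels. A portable way to express the same copy, sketched here with a hypothetical helper copy_int_unaligned, goes through memcpy so that the compiler picks whatever access sequence the target allows.

```c
#include <string.h>

/* Copies one int from SRC to DST without assuming either pointer is
   suitably aligned for int.  */
void
copy_int_unaligned (void *dst, const void *src)
{
  int tmp;
  memcpy (&tmp, src, sizeof tmp);
  memcpy (dst, &tmp, sizeof tmp);
}
```

In the program above, the assignment *p = *q would become copy_int_unaligned ((char *)a + 1, (char *)a + 5).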
wonder if this is a related problem.
I could not find any uses of isfinite in other C++ files (except cmath)
and the tests that use it are the same ones that are xfailed for uclibc.
Steve Ellcey
sell...@imgtec.com
causes a stall. If we used [reg] and incremented it after the
load then we would have at least one instruction in between the load and
the use and either no stall or a shorter stall.
I don't know if ivopts has any way to do this type of analysis when
picking the IV.
Steve Ellcey
sell...@imgtec.com
in 24k.md but that didn't seem to have any
effect on the ivopts code.
Steve Ellcey
sell...@imgtec.com
*) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;
  do
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);
  return c1 - c2;
}
Steve Ellcey
sell...@imgtec.com
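For anyone who wants to compile the loop above on its own, here is a self-contained sketch. The function name my_strcmp and the parameter types are assumptions, since the start of the original function is cut off; the body follows the fragment.

```c
/* Hypothetical stand-alone version of a byte-wise string comparison;
   the name my_strcmp is not from the original message.  */
int
my_strcmp (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;

  do
    {
      c1 = *s1++;
      c2 = *s2++;
      if (c1 == '\0')
        return c1 - c2;
    }
  while (c1 == c2);

  return c1 - c2;
}
```

The loads through s1 and s2 are unsigned char loads, which is the part of the loop relevant to the choice between lbu and lb on MIPS.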
e change does degrade the optimizer
so could we go back to the old behaviour for C89/C99? The code in ivopts
has changed enough since the patch was applied I couldn't immediately see
how to do that in the ToT sources.
Steve Ellcey
sell...@imgtec.com
I have tried a few code changes in fixed-bit.c (to no avail) but this
code is so heavily macro-ized it is tough to figure out what it should
be doing.
Steve Ellcey
sell...@imgtec.com
t why we are doing an
lbu instead of an lb.
Steve Ellcey
sell...@imgtec.com
On Wed, 2015-10-28 at 13:42 +0100, Richard Biener wrote:
> On Wed, Oct 28, 2015 at 12:23 AM, Steve Ellcey wrote:
> >
> > I have a question about the _Fract types and their conversion routines.
> > If I compile this program:
> >
> > extern void abort (void);
> >
rsion. But 'uhq' would be a 2 byte
unsigned fract, and the unsigned fract type on MIPS should be 4 bytes
(unsigned int is 4 bytes). So shouldn't GCC have generated a call to
__satfractqiusq instead? Or am I confused?
Steve Ellcey
sell...@imgtec.com
e we doing both just to have belts and suspenders
and want to keep it that way?
Steve Ellcey
sell...@imgtec.com
don't see an
obvious patch that could have caused this new failure, has anyone else run
into this? I couldn't find anything in the bug database or in the mailing
lists.
Steve Ellcey
sell...@imgtec.com
been an issue for other targets.
Steve Ellcey
sell...@imgtec.com
ure but machine specific passes may be the
exception to that rule. We already have one pass in mips.c
(pass_mips_machine_reorg2), that might be something else that could be
broken out, though I haven't looked in detail to see what types or
structures it would need access to.
Steve Ellcey
sell...@imgtec.com
c GCC code and that has GTY types
in it so I am not sure what I need to do to get gengtype to scan
mips-private.h or if this is even possible (or wise).
Steve Ellcey
sell...@imgtec.com
nder but that seems to be the main
problem I am having with stack realignment. Getting the cfi stuff right
so that the unwinder works properly is proving very hard.
Steve Ellcey
sell...@imgtec.com
block and epilogue in order for
exception handling to work correctly. One way I thought of doing this
is to create an edge from the entry block to the exit block but I am
unsure of all the implications of creating a fake/eh/abnormal edge to
do this and which I would want to use.
Steve Ellcey
sell
On Tue, 2015-08-18 at 09:23 +0930, Alan Modra wrote:
> On Mon, Aug 17, 2015 at 10:38:22AM -0700, Steve Ellcey wrote:
> OK, then you need to emit a .cfi directive to say the frame top is
> given by the temp hard reg sometime after that assignment and before
> sp is aligned in the p
SIMPLE_IPA_PASS
and the comdats pass is just IPA_PASS. I changed mine to IPA_PASS and
it now registers the pass.
Steve Ellcey
sell...@imgtec.com
"comdats" is what is used for
the name of pass_ipa_comdats in ipa-comdats.c.
Steve Ellcey
sell...@imgtec.com
or it. I guess I need to
make the temporary register where I save $sp volatile or do something
else so that the assignment (and its associated .cfi) is not deleted by
the optimizer.
Steve Ellcey
sell...@imgtec.com
stack but I don't really understand what they are trying
to do.
Any help?
Steve Ellcey
sell...@imgtec.com
P.S. For completeness' sake I have attached my current dynamic
alignment changes in case anyone wants to see them.
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 4f9a31d..386c2c
On Fri, 2015-07-10 at 14:27 -0500, Segher Boessenkool wrote:
> On Fri, Jul 10, 2015 at 10:43:43AM -0700, Steve Ellcey wrote:
> >
> > I have a basic GCC testing question. I built a native GCC and ran:
> >
> > make RUNTESTFLAGS='dg.exp' check
> >
o specify a target-board is so I can then modify it with
something like '--target-board=unix/-m32' but I think I need to specify a
board before I add any options, don't I?
Steve Ellcey
sell...@imgtec.com
saved copy is used by the unwind info?
I don't see any indication that the unwind library knows if a stack has
been dynamically realigned and I don't see where unwind makes use of
this value.
Steve Ellcey
sell...@imgtec.com
> call to use_return_register.
>
>
> r~
I ran into an interesting issue while doing this. Right now the expand
pass calls construct_exit_block (which calls expand_function_end) before
it calls expand_stack_alignment. That means that crtl->drap_reg, etc
are not yet set up when
On Fri, 2015-06-19 at 09:09 -0400, Richard Henderson wrote:
> On 06/16/2015 07:05 PM, Steve Ellcey wrote:
> >
> > I have a question about the DRAP register (used for dynamic stack alignment)
> > and about reserving/using hard registers in general. I am trying to
> >
why C++ tests with exception handling are
not working for me because this register is not getting set and restored
(since it is thought to be fixed) during code that uses throw and catch.
Steve Ellcey
sell...@imgtec.com
ey do not seem to be causing any problems during the build,
they just got me curious.
Steve Ellcey
sell...@imgtec.com
RUNTESTFLAGS='--target_board=multi-sim/--param=foo=1'
> ?
>
> Jakub
Nope, but it seems to work. That syntax is not documented in
invoke.texi. I will see about submitting a patch (or at least a
documentation bug report).
Steve Ellcey
aram value without a space in the
option? If there is I could not find it.
I tried:
export RUNTESTFLAGS='--target_board=multi-sim/--param\ foo=1'
export RUNTESTFLAGS='--target_board=multi-sim/--param/foo=1'
But neither of those worked either.
Steve Ellcey
sell...@imgtec.com
Following up to my own email, I think I found the missing magic. I
needed to set global_regs[16] to 1. Once global_regs was set for the
register, the assignment stopped getting optimized out.
Steve Ellcey
sell...@imgtec.com
On Wed, 2015-04-22 at 12:27 -0700, Steve Ellcey wrote:
> On Wed, 2
BLIC (ptr_var) = 1;
DECL_EXTERNAL (ptr_var) = 1;
DECL_REGISTER (ptr_var) = 1;
DECL_HARD_REGISTER (ptr_var) = 1;
SET_DECL_ASSEMBLER_NAME (ptr_var, id);
varpool_node::finalize_decl (ptr_var);
Then the assignment to this variable is optimized away by the cse1
optimization phase.
Steve Ellcey
sell...@imgtec.com
ot go away in my small example program but I can't figure
out what it is setting that I am not.
Steve Ellcey
sell...@imgtec.com