Re: bug in lra causes incorrect register usage / compiler crash

2014-04-29 Thread Mikael Pettersson
Paul Shortis writes:
 > I've now confirmed this same issue occurs on a stock i386 build 
 > when -fomit-frame-pointer is specified with -O2 and a test case 
 > with reasonable register pressure.

Then please open a bug report in gcc's bugzilla with the test case
and instructions on how to reproduce the error.


Could some ARM maintainer please look at PR66917?

2015-07-29 Thread Mikael Pettersson
It's a wrong-code regression in GCC 4.8 to 6.0 where it generates
NEON code with unaligned memory operands, causing alignment faults
at runtime.


how well does gcc support type-specific pointer formats?

2015-09-30 Thread Mikael Pettersson
Does gcc allow backends to have a say in how pointers are represented
(bits beyond the address), what happens in conversions between pointer
types, and what happens in conversions between pointers and uintptr_t?

The target in question has:
- one pointer format and set of load/store instructions for pointers to int/long
- another format and set of load/store instructions for pointers to char
- pointers to short use a third format in general, but can use the int/long
  format IF you know which half of the word you're going to access

What mechanisms, if any, are present in gcc to deal with this?
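
To make the conversion problem concrete, here is a minimal sketch in C. The formats are purely hypothetical and are NOT the real target's: a word pointer is taken to be a plain word address, and a char pointer is taken to pack a 2-bit byte index below that address. The names word_ptr, char_ptr and the helper functions are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical formats, NOT the real target's: a word pointer is a plain
   word address; a char pointer packs a 2-bit byte index below it.  */
#define BYTES_PER_WORD 4u

typedef uint32_t word_ptr;   /* addresses an int/long */
typedef uint32_t char_ptr;   /* addresses a single byte */

/* (char *)p for an int pointer: select byte 0 of the word.  */
static char_ptr word_to_char(word_ptr p)
{
    return p * BYTES_PER_WORD;
}

/* (int *)p for a char pointer: only exact if the byte index is 0.  */
static word_ptr char_to_word(char_ptr p)
{
    return p / BYTES_PER_WORD;
}

/* Byte index carried in the bits beyond the word address.  */
static unsigned char_index(char_ptr p)
{
    return p % BYTES_PER_WORD;
}
```

Under these assumptions neither conversion is bit-identical, so a compiler that treats every pointer cast as a no-op would produce wrong addresses.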


Re: how well does gcc support type-specific pointer formats?

2015-09-30 Thread Mikael Pettersson
Richard Biener writes:
 > On Wed, Sep 30, 2015 at 11:23 AM, Mikael Pettersson
 >  wrote:
 > > Does gcc allow backends to have a say in how pointers are represented
 > > (bits beyond the address), what happens in conversions between pointer
 > > types, and what happens in conversions between pointers and uintptr_t?
 > >
 > > The target in question has:
 > > - one pointer format and set of load/store instructions for pointers to int/long
 > > - another format and set of load/store instructions for pointers to char
 > > - pointers to short use a third format in general, but can use the int/long
 > >   format IF you know which half of the word you're going to access
 > >
 > > What mechanisms, if any, are present in gcc to deal with this?
 > 
 > Basically none.  The only thing I could imagine you could use is have
 > the pointer formats be different address-spaces.  The target controls
 > how to convert pointers from/to different address-spaces.  I am not
 > aware of specialities for pointer-to-int or int-to-pointer conversion
 > though - IIRC they simply use bit-identical conversions (thus subregs
 > if the modes differ).
 > 
 > But who would design this kind of weird architecture and think he could get
 > away with that easily?
 > 
 > Richard.

It's an old mainframe architecture, not a new design.

A company produced clones up until a few years ago.  They also
maintained a private gcc port based initially on gcc-3.2 and
eventually on gcc-4.3, but were unable to rebase on gcc-4.4.
I have access to that port, and am trying to figure out if it
can be reimplemented in some sane way.

/Mikael


oddities in the moxie gcc backend

2017-01-15 Thread Mikael Pettersson
I have a toy backend based on the moxie backend as a template.  During its
development I found some oddities in the moxie backend that may be bugs.

1. The REGNO_OK_FOR_INDEX_P(NUM) macro in moxie.h is:

#define REGNO_OK_FOR_INDEX_P(NUM) MOXIE_FP

Since MOXIE_FP is 0, this returns false for every register.  Should the body
be a literal 0, or some comparison between NUM and MOXIE_FP?

2. I see no actual use of MOXIE_PC or the SPECIAL_REGS register class.  Could
they be deleted (with adjustments for decrementing MOXIE_CC)?

3. moxie_compute_frame () doesn't take !fixed_regs[regno] into account, which
the related loops in moxie_expand_prologue () and moxie_expand_epilogue () do.
Bug?

There are also some minor nits:

4. The comment above `size_for_adjusting_sp' states it's used in
expand_epilogue(), which it isn't.

5. The "Compute this since .." comment in moxie_initial_elimination_offset ()
should probably refer to callee_saved_reg_size not local_vars_size, to match
the code.

6. There are two identical definitions of TRULY_NOOP_TRUNCATION(op,ip) in
moxie.h.  The first one looks misplaced and should probably be deleted.


/Mikael
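
Item 1 above can be checked with a small sketch. The macro body and the value of MOXIE_FP are taken from the message; REGNO_IS_FP_P is a hypothetical alternative reading (comparing NUM against MOXIE_FP) and is not something found in moxie.h. The counting helpers are likewise invented for illustration.

```c
/* MOXIE_FP is 0 according to the message; the macro body is copied from it.  */
#define MOXIE_FP 0
#define REGNO_OK_FOR_INDEX_P(NUM) MOXIE_FP

/* Hypothetical alternative reading (an assumption, not moxie.h):
   compare the argument against MOXIE_FP.  */
#define REGNO_IS_FP_P(NUM) ((NUM) == MOXIE_FP)

/* Count how many of the first NREGS register numbers the macro accepts.  */
static int count_index_regs(int nregs)
{
    int count = 0;
    for (int regno = 0; regno < nregs; regno++)
        if (REGNO_OK_FOR_INDEX_P(regno))
            count++;
    return count;
}

/* Same count for the hypothetical comparison form.  */
static int count_fp_matches(int nregs)
{
    int count = 0;
    for (int regno = 0; regno < nregs; regno++)
        if (REGNO_IS_FP_P(regno))
            count++;
    return count;
}
```

As written in the message, the macro ignores its argument and accepts no register at all, while the comparison form would accept exactly register 0.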


gcc-6-branch snapshots disabled?

2017-07-14 Thread Mikael Pettersson
Seems like the weekly snapshots from gcc-6-branch were disabled before
the 6.4.0 release but not re-enabled afterwards.  The snapshots from
trunk and gcc-{7,5}-branch are being generated as usual.


Re: gcc-6-branch snapshots disabled?

2017-07-15 Thread Mikael Pettersson
Gerald Pfeifer writes:
 > On Fri, 14 Jul 2017, Mikael Pettersson wrote:
 > > Seems like the weekly snapshots from gcc-6-branch were disabled before
 > > the 6.4.0 release but not re-enabled afterwards.  The snapshots from
 > > trunk and gcc-{7,5}-branch are being generated as usual.
 > 
 > You are right, I observed the same and just enabled snapshots
 > for the gcc-6-branch again.
 > 
 > The next snapshot should appear coming Wednesday per 
 > maintainer-scripts/crontab in the GCC repository.

Thanks


Re: ADA runtime System.Address type

2013-02-01 Thread Mikael Pettersson
BELBACHIR Selim writes:
 > Hi,
 > 
 > I'm working on a gcc/gnat port  for a private target (gcc 4.5.2, gnat 6.4.2).
 > On this target, scalar values shall be stored in $R registers whereas 
 > pointer values shall be stored in $C registers. My current ABI for 
 > procedure/function calls uses $R and $C registers depending on arguments and 
 > return values type (scalar or pointer). I need an ABI of this kind for 
 > performance reasons (the instruction set does not allow $R and $C everywhere 
 > and copying $R in $C for each procedure calls is too expensive).
 > 
 > I tested this ABI through GCC C and C++ torture suite and everything is ok 
 > (after solving special cases for implicit calls)
 > 
 > During my tests I tried to mix ADA and C sources code using the 'pragma 
 > import/export'. For example I tried to implement the "__gnat_malloc" 
 > expected by the ZFP runtime by an ADA function and 'pragma export' :
 > 
 >  function Gnat_Malloc (Size : in Integer) return System.Address is
 >  begin
 >  -- implementation
 >  end Gnat_Malloc;
 >  pragma Export (C, Gnat_Malloc, "__gnat_malloc");
 > 
 > Here is my problem :
 > * the caller of __gnat_malloc expects that return value of type pointer to 
 > be in a $C register (as defined in the ABI)
 > * the called function Gnat_Malloc return a value of type system.address in 
 > $R register because system.address is considered as a scalar value 
 > (system.ads:   type Address is mod Memory_Size;) 
 > ==> caller and callee return values doesn't match, the ABI is broken
 > 
 > I tried to modified system.ads so that system.address arguments become 
 > pointers (using access keyword) but I can't figure out how to do this 
 > because the System package is 'pragma Pure' ...
 > 
 > Is there a way to modify something somewhere (runtime, backend, frontend 
 > ...) so that arguments of type system.address are seen as pointers and not 
 > scalar values ?

This is a known issue, m68k-linux has the exact same problem with its
A(ddress) and D(ata) registers.  See PR48835 for discussion and patches.
With those patches people are successfully building and using GCC's Ada
on m68k-linux.


Re: Tracking the source of an ARM miscompilation with gcc 4.6

2013-03-07 Thread Mikael Pettersson
Mike Hommey writes:
 > On Thu, Mar 07, 2013 at 01:14:03AM -0800, Andrew Pinski wrote:
 > > On Thu, Mar 7, 2013 at 12:33 AM, Mike Hommey  wrote:
 > > > On Thu, Mar 07, 2013 at 12:28:22AM -0800, Andrew Pinski wrote:
 > > >> On Thu, Mar 7, 2013 at 12:24 AM, Mike Hommey wrote:
 > > >> > Hi,
 > > >> >
 > > >> > At Mozilla, we've encountered a GCC 4.6 miscompilation in the ARMv6
 > > >> > build of Firefox for Android. We'd like to evaluate whether this bug
 > > >> > is hitting us in more places than the one we spotted. To that end,
 > > >> > we'd need to know what particular bug in GCC leads to this
 > > >> > miscompilation.
 > > >> >
 > > >> > The attached file is the preprocessed source, slightly simplified. I
 > > >> > was able to reproduce the miscompilation with both the GCC 4.6 from
 > > >> > the Android NDK r8d and 4.6.3 from Debian unstable. It apparently
 > > >> > happens for any -march, with -marm, but not -mthumb. It happens at
 > > >> > -Os but not -O2.
 > > >>
 > > >> No attached file.
 > > >
 > > > Sent it afterwards.
 > > 
 > > Still not on the list.  I bet it was too big so it was rejected by the
 > > list.  Try filing a bug report instead.
 > 
 > Let's try it gzipped. If it still doesn't work, I'll file a bug.

This test case is not self-contained.  Please file a proper bug report
with a self-contained test case.


Re: Tracking the source of an ARM miscompilation with gcc 4.6

2013-03-07 Thread Mikael Pettersson
Mike Hommey writes:
 > On Thu, Mar 07, 2013 at 11:12:40AM +0100, Mikael Pettersson wrote:
 > > This test case is not self-contained.  Please file a proper bug report
 > > with a self-contained test case.
 > 
 > It is, as long as you don't want to make a library or program out of it:
 > 
 > $ arm-linux-androideabi-gcc -o pkcs11.o -c -marm -Os pkcs11.i

It's not practical to verify or bisect miscompilations by manual inspection
of generated assembly or object code.  We strongly prefer runtime tests,
which requires that test cases can be compiled to executables.
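
A minimal sketch of what such a runtime test can look like follows. The function suspect() and its expected value are placeholders, not taken from the Firefox source, and the noinline attribute is a GCC extension assumed to be available.

```c
#include <stdlib.h>

/* Placeholder for the code suspected of being miscompiled; noinline keeps
   it from being folded into the caller.  */
__attribute__((noinline)) static int suspect(int x)
{
    return x * 2 + 1;
}

/* Returns 0 on success and aborts on a wrong result, so a harness or a
   bisection script only has to look at the exit status.  */
static int run_test(void)
{
    if (suspect(20) != 41)
        abort();
    return 0;
}
```

A main() that simply returns run_test(), built with the failing options (for example the -marm -Os mentioned in this thread) and run on the target, then reports the miscompilation through its exit status.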


Re: more 4.7 backports?

2013-03-15 Thread Mikael Pettersson
Kenny Simpson writes:
 > Last month I sent a list of bugreports that were 4.7 regressions, but had 
 > patches which fixed them for 4.8.  It looks like ~7 of these had been 
 > backported and 10 more bugreports now exist with potential for backporting 
 > as well.
...
 > 53844 - missed-optimization  (fixed back in July and waiting to see if any 
 > fallout?) - looks like it was backported, but then reverted?

The backport caused wrong-code PR56301 and was therefore reverted.

/Mikael


Re: GCC reuses stack slot in setjmp-calling function -- is it a bug or feature?

2013-04-15 Thread Mikael Pettersson
Konstantin Vladimirov writes:
 > Hi,
 > 
 > Minimal reproduction for x86:
 > 
 > #include <setjmp.h>
 > #include <stdio.h>
 > #include <stdlib.h>
 > 
 > jmp_buf something_happened;
 > int key_gdata = 2;
 > 
 > int store_data;
 > 
 > void __attribute__((noinline))
 > initiate(void)
 > {
 >   longjmp(something_happened, 1);
 > }
 > 
 > int __attribute__ ((noinline))
 > foo(int x)
 > {
 >   return x + 1;
 > }
 > 
 > int __attribute__((noinline))
 > amibuggy(int x, int y1, int y2, int y3)
 > {
 >   int nextdata, idx, key_data;
 > 
 >   key_data = key_gdata;
 > 
 >   if (setjmp(something_happened))
 >     {
 >       if (key_data != key_gdata)
 >         {
 >           fprintf(stderr, "Test failed: data = %d, g_data = %d\n",
 >                   key_data, key_gdata);
 >           abort();
 >         }
 >       else
 >         {
 >           fprintf(stdout, "Test passed\n");
 >           fprintf(stdout, "data = %d, g_data = %d\n", key_data, key_gdata);
 >           return 0;
 >         }
 >     }
 > 
 >   store_data = key_data;
 >   nextdata = key_gdata;
 > 
 >   idx = foo(nextdata);
 > 
 >   for (;idx != 0; --idx)
 >     {
 >       int tmp = store_data;
 >       nextdata += foo(tmp);
 >       store_data = foo(nextdata);
 >     }
 > 
 >   initiate();
 >   return 0;
 > }
 > 
 > int
 > main(void)
 > {
 >   amibuggy(9, 5, 16, 36);
 >   amibuggy(0, 0, 0, 0);
 >   return 0;
 > }
 > 
 > 
 > Compile with gcc-4.7.2 with line:
 > 
 > gcc -O2 repro.c -o repro.x86
 > 
 > Now run and you will see test "magically" failed.
 > 
 > What happens? Compiler on register allocation pass decided to reuse
 > stack slot 12(%rsp) where key_data persisted.
 > 
 > movl key_gdata(%rip), %eax
 > movl %eax, 12(%rsp)
 > call _setjmp
 > 
 > ... and later ...
 > 
 > .L10:
 > movl 12(%rsp), %edi <--- oops!
 > call foo
 > 
 > setjmp stored stack pointer, next frame slot is overwritten by
 > spill/fill (because compiler thinks that key_data is already dead at
 > this point). Then everything explodes.
 > 
 > I am not sure that it is a bug, but situation is rather complicated
 > for programmer -- I think in this case programmer really may rely on
 > context restoring after longjmp, and surprise is really great.
 > 
 > Question to gcc-help: Please comment on should I file this in
 > bugzilla, or should I live with it.
 > 
 > Question to gcc: Really I found this problem in private backend and
 > want to fix it for my users, no matter first answer is yes or no.
 > Maybe somehow restrict stack slot reusage if function calls setjmp.
 > But I am not so good in reload internals. Any advices about how to
 > locally fix it in private backend sources (I haven't still done any
 > changes to normal gcc-4.7.2, just added new machine description) are
 > very appreciated.

You need to read the C standard and understand the restrictions it
places on the values of non-volatile automatic variables when setjmp
returns due to a longjmp.  Look for "The longjmp function" in
WG14/n1124.pdf or n1494.pdf.

Then you'll need to use "volatile" or restructure your code slightly.


Re: BImode and STORE_VALUE_FLAG

2013-05-04 Thread Mikael Pettersson
On Fri, 3 May 2013 12:49:14 +, Paulo Matos  wrote:
> Hello,
> 
> It seems to me there's a bug in
> simplify_const_relational_operation:simplify-rtx.c.
> If you set STORE_FLAG_VALUE to -1, and you get to
> simplify_const_relational_operation with code: NE, mode: BImode,
> op0: reg, op1: const_int 0, then you end up in line 4717 calling
> get_mode_bounds.
> 
> get_mode_bounds will unfortunately return min:0, max:-1 for BImode and
> GCC proceeds to compare val which is 0 using:
> /* x != y is always true for y out of range.  */
> if (val < mmin || val > mmax)
>   return const_true_rtx;
> 
> This simplifies the comparison to const_true_rtx in the case
> STORE_FLAG_VALUE is -1.  This seems flawed.
> 
> Unless there's some background reason for this to happen this seems
> like a bug.  BImode is a two value mode: 0 or STORE_FLAG_VALUE
> (according to trunc_int_for_mode), therefore there are really no bounds
> and these comparisons in simplify_const_relational_operation should
> take special care if dealing with BImode.  Also, having max < min is
> strange at best and I can imagine it can result in pretty strange
> behaviour if a developer assumes max >= min, as usual.
> 
> I am interested in comments to this piece of code. I am happy to patch
> simplify_const_relational_operation if you agree with what I said.
> 
> Cheers,
> 
> Paulo Matos

I can't comment on the code in question, but the backend for m68k may be
affected since it defines STORE_FLAG_VALUE as -1.  Do you have a testcase
that would cause wrong code, or a patch to cure the issue?  I'd be happy
to do some testing on m68k-linux.

/Mikael
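
The range check quoted above can be replayed in isolation. This sketch copies only the comparison from the quoted snippet; the function name ne_always_true and the use of plain long values instead of GCC's own types are assumptions made for illustration.

```c
/* The range test quoted from simplify_const_relational_operation:
   "x != y is always true for y out of range".  Nonzero means the
   comparison would be folded to const_true_rtx.  */
static int ne_always_true(long val, long mmin, long mmax)
{
    return val < mmin || val > mmax;
}
```

With the bounds reported for BImode (min 0, max -1) and val 0, the test fires because 0 > -1, whereas with ordinary bounds such as min 0, max 1 it does not.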


RE: Infinite recursion due to builtin pattern detection

2013-06-27 Thread Mikael Pettersson
Paulo Matos writes:
> That explains why GCC removes the condition but the main issue of the
> memset recursion still stands.

Known problem.  See GCC PR56888.


Re: Bug x86 backend 4.8.1?

2013-10-02 Thread Mikael Pettersson
Hendrik Greving writes:
 > gcc --version
 > gcc (GCC) 4.8.1
 > 
 > (4.7.2 seems to work)
 > 
 > gcc -g -fPIC -O2 -Wall -o test test.c
 > 
 > ./test
 > type_a is 0x0
 > 
 > Correct would be 0x14000. I don't see how the C-code could be
 > ambiguous, I think this is a bug?
 > 
 > test.c:
 > 
 > #include <stdio.h>
 > 
 > typedef struct
 > {
 > unsigned int a:4;
 > unsigned int b:5;
 > unsigned int c:5;
 > unsigned int d:18;
 > } my_comb_t;
 > 
 > struct my_s
 > {
 > int l;
 > };
 > 
 > my_comb_t *getps_f (struct my_s *a);
 > 
 > #define GETPS(x) getps_f(&(x))
 > 
 > my_comb_t *
 > getps_f (struct my_s *a)
 > {
 >   my_comb_t *p = (my_comb_t *) &(a->l);
 >   return p;

This is where the test case is wrong.  You need to use a union so that
gcc can see the type punning, or compile with -fno-strict-aliasing.

Without -fno-strict-aliasing I see different results for -m32 vs -m64,
but not across gcc versions; with -fno-strict-aliasing all results match.

 > }
 > 
 > int g_modes = 5;
 > 
 > int main(void)
 > {
 > int modes = g_modes;
 > int type_a = 0;
 > 
 > if (modes)
 > {
 > struct my_s m;
 > m.l = type_a;
 > my_comb_t *p = GETPS(m);
 > p->d |= modes;
 > type_a = m.l;
 > }
 > printf("type_a is 0x%x\n", type_a);
 > return 0;
 > }
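
The union approach suggested in the reply above could look roughly like the following sketch. The bit-field struct is copied from the test case; the union my_u and the helper or_into_d are invented names. Bit-field layout is implementation-defined, so the resulting integer value depends on the target (the thread expects 0x14000 for modes = 5 on x86).

```c
typedef struct
{
    unsigned int a:4;
    unsigned int b:5;
    unsigned int c:5;
    unsigned int d:18;
} my_comb_t;

/* Both views live in one union, so the compiler can see that a store
   through 'bits' may change 'l'.  */
union my_u
{
    int l;
    my_comb_t bits;
};

/* OR 'modes' into field d of the value 'l' and return the updated int.  */
static int or_into_d(int l, unsigned int modes)
{
    union my_u u;
    u.l = l;
    u.bits.d |= modes;
    return u.l;
}
```

Because the read of u.l goes through the same union object that was just written via u.bits, the type punning is visible to the compiler and no -fno-strict-aliasing is needed.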


Re: Change x86 prefix order

2006-12-06 Thread Mikael Pettersson
On Wed, 6 Dec 2006 09:00:30 -0800, H. J. Lu wrote:
>On Wed, Dec 06, 2006 at 08:43:17AM -0800, Randy Dunlap wrote:
>> On Tue, 5 Dec 2006 23:00:14 -0800 H. J. Lu wrote:
>> 
>> > On x86, the order of prefix SEG_PREFIX, ADDR_PREFIX, DATA_PREFIX and
>> > LOCKREP_PREFIX isn't fixed. Currently, gas generates
>> > 
>> > LOCKREP_PREFIX ADDR_PREFIX DATA_PREFIX SEG_PREFIX
>> > 
>> > I will check in a patch:
>> > 
>> > http://sourceware.org/ml/binutils/2006-12/msg00054.html
>> > 
>> > tomorrow and change gas to generate
>> > 
>> > SEG_PREFIX ADDR_PREFIX DATA_PREFIX LOCKREP_PREFIX
>> 
>> Hi,
>> Could you provide a "why" for this in addition to the
>> "what", please?
>
>LOCKREP_PREFIX is also used as SIMD prefix. DATA_PREFIX can be used as
>either SIMD prefix or data size prefix for SIMD instructions. The new
>order
>
>SEG_PREFIX ADDR_PREFIX DATA_PREFIX LOCKREP_PREFIX
>
>will make SIMD prefixes close to SIMD opcode.

That's still just "what" and doesn't explain why
this change is desirable.

Software x86 decoders clearly must handle any valid
prefix order, so they shouldn't care. (I've written one
recently. It's tedious but not rocket science.)

If hardware x86 decoders (i.e., Intel or AMD processors)
get measurably faster with the new order, that would be
a good reason to change it.

/Mikael
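
To illustrate why a software decoder should not care about prefix order, here is a minimal sketch of an order-independent prefix scan. It is not the decoder mentioned above; the names (PFX_*, classify, scan_prefixes) are invented, and it covers only the legacy prefix bytes in 32-bit mode, ignoring REX and everything after the prefixes.

```c
#include <stddef.h>

enum {
    PFX_LOCKREP = 1 << 0,   /* F0, F2, F3 */
    PFX_SEG     = 1 << 1,   /* 26, 2E, 36, 3E, 64, 65 */
    PFX_DATA    = 1 << 2,   /* 66 */
    PFX_ADDR    = 1 << 3    /* 67 */
};

/* Map one byte to its prefix group, or 0 if it is not a legacy prefix.  */
static unsigned classify(unsigned char byte)
{
    switch (byte) {
    case 0xF0: case 0xF2: case 0xF3:
        return PFX_LOCKREP;
    case 0x26: case 0x2E: case 0x36: case 0x3E: case 0x64: case 0x65:
        return PFX_SEG;
    case 0x66:
        return PFX_DATA;
    case 0x67:
        return PFX_ADDR;
    default:
        return 0;
    }
}

/* Consume leading prefix bytes in whatever order they appear; store the set
   seen in *mask and return the offset of the first non-prefix byte.  */
static size_t scan_prefixes(const unsigned char *code, size_t len,
                            unsigned *mask)
{
    size_t i = 0;
    *mask = 0;
    while (i < len) {
        unsigned kind = classify(code[i]);
        if (kind == 0)
            break;
        *mask |= kind;
        i++;
    }
    return i;
}
```

Feeding it the same four prefixes in the old gas order and in the proposed order yields the same mask and the same opcode offset.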


Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-06 Thread Mikael Pettersson
Aurelien Jarno writes:
 > On Wed, Mar 05, 2008 at 11:58:34AM -0800, Joe Buck wrote:
 > > 
 > > Aurelien Jarno wrote:
 > > > >Since version 4.3, gcc changed its behaviour concerning the x86/x86-64 
 > > > >ABI and the direction flag, that is it now assumes that the direction 
 > > > >flag is cleared at the entry of a function and it doesn't clear once 
 > > > >more if needed.
 > > > >...
 > > > >I guess this has to be fixed on the kernel side, but also gcc-4.3 could
 > > > >revert back to the old behaviour, that is clearing the direction flag
 > > > >when entering a routine that touches it until most people are running a
 > > > >fixed kernel.
 > > 
 > > On Wed, Mar 05, 2008 at 08:00:42AM -0800, H. Peter Anvin wrote:
 > > > Linux should definitely follow the ABI.  This is a bug, and a pretty 
 > > > serious such.
 > > 
 > > Unfortunately, there are a lot of kernels out there already with this
 > > problem, and the symptoms are likely to be subtle.  So even if it is true
 > > that it is the kernel that is "in the wrong", I think we still are going
 > > to need to give users a workaround from the gcc side as well.
 > > 
 > > So I think gcc at least needs an *option* to revert to the old behavior,
 > > and there's a good argument to make it the default for now, at least for
 > > x86/x86-64 on Linux.
 > 
 > And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
 > they have the same behaviour as Linux, that is they don't clear DF
 > before calling the signal handler.

FWIW, Solaris 10 (both 32- and 64-bit) gets it right.
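
For user code that has to run on an affected kernel, one defensive option is to clear DF by hand at the top of each signal handler. The sketch below is only an illustration of that idea, not something proposed in this thread: it assumes GCC-style inline asm, the names handled, on_signal and install_and_raise are invented, and it cannot rule out the compiler emitting a string instruction ahead of the asm statement.

```c
#include <signal.h>

static volatile sig_atomic_t handled;

static void on_signal(int signo)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Clear DF by hand, in case the kernel entered the handler with it set.  */
    __asm__ volatile ("cld" ::: "cc");
#endif
    (void) signo;
    handled = 1;
}

/* Install the handler, deliver the signal to ourselves, and report
   whether the handler ran (1) or something failed (-1).  */
static int install_and_raise(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return handled;
}
```

On other architectures the preprocessor guard removes the asm statement and the handler only sets the flag.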


Re: libgcc: strange optimization

2011-08-02 Thread Mikael Pettersson
Hans-Peter Nilsson writes:
 > On Mon, 1 Aug 2011, Richard Henderson wrote:
 > 
 > > On 08/01/2011 01:30 PM, Michael Walle wrote:
 > > >  1) function inlining
 > > >  2) deferred argument evaluation
 > > >  3) because our target has no barrel shifter, (arg >> 10) is emitted as a
 > > > function call to libgcc's __ashrsi3 (_in place_!)
 > > >  4) BAM! dead code elimination optimizes r8 assignment away because calli
 > > > may clobber r1-r10 (callee saved registers on lm32).
 > >
 > > I'm afraid the only solution I can think of is to force F1 out-of-line.
 > 
 > Or another temporary - but the parameter should already have
 > that effect.

It should, but doesn't.  See PR48863 for similar breakage on ARM.

/Mikael


Re: libgcc: strange optimization

2011-08-02 Thread Mikael Pettersson
Michael Walle writes:
 > 
 > Hi,
 > 
 > > To confirm that try -fno-tree-ter.
 > 
 > "lm32-gcc -O1 -fno-tree-ter -S -c test.c" generates the following working
 > assembly code:
 > 
 > f2:
 >  addi     sp, sp, -4
 >  sw       (sp+4), ra
 >  addi     r2, r0, 10
 >  calli    __ashrsi3
 >  addi     r8, r0, 10
 >  scall
 >  lw       ra, (sp+4)
 >  addi     sp, sp, 4
 >  b        ra

-fno-tree-ter also unbreaks the ARM test case in PR48863 comment #4.


performance regression with trunk's gengtype on ARM?

2011-08-28 Thread Mikael Pettersson
I'm seeing what appears to be a recent massive performance regression
with trunk's gengtype, as compiled and run in stage 2, on ARM V5TE.

Right now 4.7-20110827's stage2 gengtype has been running for almost
10 hours on my ARM build machine, but the process is tiny and no swapping
occurs.  To put those 10 hours in perspective, on this machine (1.6 GHz
ARM V5TE uniprocessor running Linux) I regularly do full bootstraps and
regression test suite runs for c,c++,ada,fortran in about 18 hours for
gcc 4.4, about 20 hours for gcc 4.5, about 24 hours for gcc 4.6, and
about 27 hours for trunk until recently.  So 10 hours or more just in
stage 2 gengtype is suspicious.

I believe 4.7-20110820 also was unusually slow to build, but I didn't
monitor that build very carefully so can't say if gengtype was involved
then too.

/Mikael


Re: performance regression with trunk's gengtype on ARM?

2011-08-29 Thread Mikael Pettersson
Mikael Pettersson writes:
 > I'm seeing what appears to be a recent massive performance regression
 > with trunk's gengtype, as compiled and run in stage 2, on ARM V5TE.
 > 
 > Right now 4.7-20110827's stage2 gengtype has been running for almost
 > 10 hours on my ARM build machine, but the process is tiny and no swapping
 > occurs.  To put those 10 hours in perspective, on this machine (1.6 GHz
 > ARM V5TE uniprocessor running Linux) I regularly do full bootstraps and
 > regression test suite runs for c,c++,ada,fortran in about 18 hours for
 > gcc 4.4, about 20 hours for gcc 4.5, about 24 hours for gcc 4.6, and
 > about 27 hours for trunk until recently.  So 10 hours or more just in
 > stage 2 gengtype is suspicious.
 > 
 > I believe 4.7-20110820 also was unusually slow to build, but I didn't
 > monitor that build very carefully so can't say if gengtype was involved
 > then too.

It's now been running for almost 21 hours, with no output produced at all.
According to strace it's currently not doing any syscalls, and according
to gdb it's looping in yylex().  I suspect a miscompilation.

Anyway I'm killing that build now.  I have a pile of stable branch
backports to test, after that I might try a regression hunt on trunk.


Re: performance regression with trunk's gengtype on ARM?

2011-09-07 Thread Mikael Pettersson
Mikael Pettersson writes:
 > It's now been running for almost 21 hours, with no output produced at all.
 > According to strace it's currently not doing any syscalls, and according
 > to gdb it's looping in yylex().  I suspect a miscompilation.
 > 
 > Anyway I'm killing that build now.  I have a pile of stable branch
 > backports to test, after that I might try a regression hunt on trunk.

The regression

a) was repeatable, and
b) got fixed in 4.7-20110903.

I'm not going to investigate this any further, unless I'm forced
to do a regression hunt in this date range in the future.


Re: C++ va_list wromng code generation in class::method(...,va_list args) only

2012-03-31 Thread Mikael Pettersson
Bernd Roesch writes:
 > hello
 > 
 > there is a C++ game called dunelegacy which work on other GCC architecture ok
 > On G++ 68k it compile ok, but produce wrong code, because there seem 
 > something diffrent in
 > va_list. The value of
 > SDL_RWops is transfer wrong.
 > 
 > I do not understand what backend problem is possible to cause this. Please 
 > help.va_list use the
 > builtin GCC functions
 > 
 > this happen on all 68k compilers i test. (3.x and some 4.x)
 > 
 > See also the source attached of this class

Please read  and then enter a bug report in 
gcc's bugzilla.

The source attached here is too incomplete to be useful.


Re: GNU MPC 1.0 release candidate

2012-07-14 Thread Mikael Pettersson
Andreas Enge writes:
 > We are pleased to announce the immediate availability of the first release
 > candidate for GNU MPC 1.0 at
 >http://www.multiprecision.org/mpc/download/mpc-1.0.0rc1.tar.gz
 >sha1sum 9acc8a54ba4ecd0ccf172c0d07fcc218220e79a3
 > 
 > Reports on successful installations and potential problems are very welcome;
 > please include the configuration triple of your platform, and the gcc, gmp
 > and mpfr versions used (these are output at the end of 'make check').
 > 
 > The status of successful and failed builds can be seen at
 >http://www.multiprecision.org/index.php?prog=mpc&page=platforms
 > 
 > In particular, we need to compile on the primary and secondary gcc platforms
 > before the release.
 > 
 > Thank you very much for your help,
 > 
 > Andreas

Testing on sparc-sun-solaris2.10:

GMP: include 5.0.5, lib 5.0.5
MPFR: include 3.1.1, lib 3.1.1
MPC: include 1.0.0rc1, lib 1.0.0rc1
C compiler: gcc
GCC: yes
GCC version: 4.7.1
PASS: tget_version
===
All 64 tests passed
===


Re: GNU MPC 1.0 release candidate

2012-07-14 Thread Mikael Pettersson
Testing on m68k-unknown-linux-gnu:

GMP: include 5.0.4, lib 5.0.4
MPFR: include 3.1.0-p8, lib 3.1.0-p8
MPC: include 1.0.0rc1, lib 1.0.0rc1
C compiler: gcc
GCC: yes
GCC version: 4.6.3
PASS: tget_version
===
All 64 tests passed
===

Note:
This is an emulated M68040 system, you may want to put "ARAnyM 0.9.13"
in the "Comment" field.


Re: Anyone else run ACATS on ARM?

2009-08-17 Thread Mikael Pettersson
On Wed, 12 Aug 2009 23:08:00 +0200, Matthias Klose  wrote:
>On 12.08.2009 23:07, Martin Guy wrote:
>> On 8/12/09, Joel Sherrill  wrote:
>>>   So any ACATS results from any other ARM target would be
>>>   appreciated.
>>
>> I looked into gnat-arm for the new Debian port and the conclusion was
>> that it has never been bootstrapped onto ARM. The closest I have seen
>> is Adacore's GNATPro x86->xscale cross-compiler hosted on Windows and
>> targetting Nucleus OS (gak!)
>>
>> The community feeling was that it would "just go" given a prodigal
>> burst of cross-compiling, but I never got achieved sufficiently high
>> blood pressure to try it...
>
>is there any arm-linux-gnueabi gnat binary that could be used to bootstrap an
>initial gnat-4.4 package for debian?
 > 
 >Matthias

Yes, see <http://user.it.uu.se/~mikpe/linux/arm-eabi-ada/>.

There you'll find a native --enable-languages=c,ada gcc-4.4.1 installation
for armv5tel-linux-gnueabi, and a patch file making the relatively minor
changes to gcc-4.4.1 needed for this. I'll also upload a big-endian
armv5teb-linux-gnueabi version once it's finished its final rebuild.

Notes:
- Built from vanilla gcc-4.4.1 sources with only the arm ada patch applied.
- Built with --with-arch=armv5te --prefix=/tmp/gcc-4.4.1-install, glibc-2.7,
  gmp-4.2.4, mpfr-2.4.1, and binutils-2.19.1.
- Bootstrap went as follows:
  (on i686-linux)
  1. I wrote an arm ada support patch for gcc-4.3.4.
  2. Built i686->arm cross 4.3.4.
  3. Used i686->arm cross to build a crossed native arm->arm on i686.
  (on armv5tel-linux-gnueabi)
  4. Used the crossed native arm->arm 4.3.4 to build a native on arm.
     This worked but generated tons of alignment faults the kernel had
     to trap and emulate.
  5. My gcc-4.3.4 is heavily updated with backported fixes. I used it to
     build a vanilla 4.3.4 with ada but that one failed to build itself.
  6. Quickly ported the 4.3.4 patch to 4.4.1, then used the heavily updated
     4.3.4 to build a vanilla 4.4.1 with ada.
  7. Used the 4.4.1 compiler to rebuild itself with a different --prefix.
     This step appears to not have suffered from emulated alignment faults.
     Not sure if that's due to 4.4 vs 4.3 or because the crossed native
     compiler was tainted by having been built on i686.
  8. The final 4.4.1 is what I uploaded.
- The patch includes a change to eliminate pointless use of exceptions
  in xsinfo.adb. That was needed for 4.3 and does no harm in 4.4, but I
  have not checked if 4.4 actually needs it.
- Test suite has not been run.

I did a similar bootstrap of ada for gcc-4.1.2 and ARM OABI several years ago,
so I had a rough idea on how to proceed. Especially step 3 is complicated
because the ada makefiles are utterly broken for the crossed native build case,
so lots of manual intervention is required there.

(The OABI ada compiler didn't really work however, and required invasive
hacks to avoid complex constructs that it would miscompile. I suspect the
ada compiler was incompatible with the OABI structure alignment rules.)

/Mikael


web interface to repo just got decidedly worse

2009-08-19 Thread Mikael Pettersson
When browsing e.g. gcc-cvs via the web it used to be possible to click on a
newly added file and get a 'download raw' (I think it was called) option to
see the file without all that idiotic html formatting. That seems to be gone
now. For me, at least, this is extremely counterproductive.


Re: web interface to repo just got decidedly worse

2009-08-20 Thread Mikael Pettersson
Ian Lance Taylor writes:
 > Mikael Pettersson  writes:
 > 
 > > When browsing e.g. gcc-cvs via the web it used to be possible to click on a
 > > newly added file and get a 'download raw' (I think it was called) option to
 > > see the file without all that idiotic html formatting. That seems to be gone
 > > now. For me, at least, this is extremely counterproductive.
 > 
 > Fixed.

Yep, now it works fine.  Thanks!

/Mikael


Re: Anyone else run ACATS on ARM?

2009-08-23 Thread Mikael Pettersson
Ludovic Brenta writes:
 > Mikael Pettersson  writes:
 > > On Wed, 12 Aug 2009 23:08:00 +0200, Matthias Klose  wrote:
 > >>is there any arm-linux-gnueabi gnat binary that could be used to bootstrap an
 > >>initial gnat-4.4 package for debian?
 > >
 > > Yes, see <http://user.it.uu.se/~mikpe/linux/arm-eabi-ada/>.
 > 
 > Mikael, thanks for this outstanding work! Matthias has now added your
 > patch to Debian and I'm about to upload gnat-4.4 with it. However I
 > cannot help but notice that the new file you introduced,
 > system-linux-armeb.ads, has the GPLv2 or later with special exception
 > (aka "GNAT-Modified GPL") boilerplate in it.  I suggest that a future
 > version of this patch migrate to GPLv3 or later with the run-time
 > exception.

The system-linux-arme{l,b}.ads files have the GPLv2 boilerplate because
they were initially written for gcc-4.3, starting out as clones of the
gcc-4.3 system-linux-ppc.ads file. I just forgot to update the boilerplate
when forward-porting the patch to gcc-4.4. I've now uploaded an incremental
patch to update the boilerplate to match gcc-4.4, so it now mentions GPLv3
instead.

/Mikael


Re: [Ada] Anyone else run ACATS on ARM?

2009-08-29 Thread Mikael Pettersson
Laurent GUERBY writes:
 > On Sat, 2009-08-22 at 23:33 +0200, Laurent GUERBY wrote:
 > > On Mon, 2009-08-17 at 12:00 +0200, Mikael Pettersson wrote:
 > > > On Wed, 12 Aug 2009 23:08:00 +0200, Matthias Klose wrote:
 > > > >On 12.08.2009 23:07, Martin Guy wrote:
 > > > >> On 8/12/09, Joel Sherrill  wrote:
 > > > >>>   So any ACATS results from any other ARM target would be
 > > > >>>   appreciated.
 > > > >>
 > > > >> I looked into gnat-arm for the new Debian port and the conclusion was
 > > > >> that it has never been bootstrapped onto ARM. The closest I have seen
 > > > >> is Adacore's GNATPro x86->xscale cross-compiler hosted on Windows and
 > > > >> targetting Nucleus OS (gak!)
 > > > >>
 > > > >> The community feeling was that it would "just go" given a prodigal
 > > > >> burst of cross-compiling, but I never got achieved sufficiently high
 > > > >> blood pressure to try it...
 > > > >
 > > > >is there any arm-linux-gnueabi gnat binary that could be used to bootstrap an
 > > > >initial gnat-4.4 package for debian?
 > > >  > 
 > > >  >Matthias
 > > > 
 > > > Yes, see <http://user.it.uu.se/~mikpe/linux/arm-eabi-ada/>.
 > > 
 > > Nice work!
 > > 
 > > Looks like Ada exception propagation (setjmp/longjmp based) is broken at
 > > least in some cases (see below), that might explain the high number of
 > > ACATS failure.
 > > 
 > > My understanding is that
 > > 
 > >   EH_MECHANISM=-gcc
 > > 
 > > is not correct for sjlj exceptions so I removed this line from the patch
 > > and I'm currently testing with trunk.
 > 
 > With this change plus the gcc/ada/gcc-interface/targtyps.c one
 > I get very good native arm ACATS and gnat.dg results:

Thanks Laurent. I've extracted an incremental diff for the Makefile
and targtyps.c changes and applied it to my gcc-4.4.1 version. When
it has finished its rebuild I'll see if this lets me eliminate the
xsinfo.adb exception handling workaround.

 > It would be nice to have a ZCX port but so far the sjlj exceptions
 > based port works fine.

See e.g. <http://gcc.gnu.org/ml/gcc-patches/2009-02/msg00509.html>.
Apparently ZCX on ARM/EABI will require a new personality routine.

/Mikael


Re: [Ada] Anyone else run ACATS on ARM?

2009-08-31 Thread Mikael Pettersson
Mikael Pettersson writes:
 > Laurent GUERBY writes:
 >  > On Sat, 2009-08-22 at 23:33 +0200, Laurent GUERBY wrote:
 >  > > On Mon, 2009-08-17 at 12:00 +0200, Mikael Pettersson wrote:
 >  > > > On Wed, 12 Aug 2009 23:08:00 +0200, Matthias Klose wrote:
 >  > > > >On 12.08.2009 23:07, Martin Guy wrote:
 >  > > > >> On 8/12/09, Joel Sherrill  wrote:
 >  > > > >>>   So any ACATS results from any other ARM target would be
 >  > > > >>>   appreciated.
 >  > > > >>
 >  > > > >> I looked into gnat-arm for the new Debian port and the conclusion was
 >  > > > >> that it has never been bootstrapped onto ARM. The closest I have seen
 >  > > > >> is Adacore's GNATPro x86->xscale cross-compiler hosted on Windows and
 >  > > > >> targetting Nucleus OS (gak!)
 >  > > > >>
 >  > > > >> The community feeling was that it would "just go" given a prodigal
 >  > > > >> burst of cross-compiling, but I never got achieved sufficiently high
 >  > > > >> blood pressure to try it...
 >  > > > >
 >  > > > >is there any arm-linux-gnueabi gnat binary that could be used to bootstrap an
 >  > > > >initial gnat-4.4 package for debian?
 >  > > >  > 
 >  > > >  >Matthias
 >  > > > 
 >  > > > Yes, see <http://user.it.uu.se/~mikpe/linux/arm-eabi-ada/>.
 >  > > 
 >  > > Nice work!
 >  > > 
 >  > > Looks like Ada exception propagation (setjmp/longjmp based) is broken at
 >  > > least in some cases (see below), that might explain the high number of
 >  > > ACATS failure.
 >  > > 
 >  > > My understanding is that
 >  > > 
 >  > >   EH_MECHANISM=-gcc
 >  > > 
 >  > > is not correct for sjlj exceptions so I removed this line from the patch
 >  > > and I'm currently testing with trunk.
 >  > 
 >  > With this change plus the gcc/ada/gcc-interface/targtyps.c one
 >  > I get very good native arm ACATS and gnat.dg results:
 > 
 > Thanks Laurent. I've extracted an incremental diff for the Makefile
 > and targtyps.c changes and applied it to my gcc-4.4.1 version. When
 > it has finished its rebuild I'll see if this lets me eliminate the
 > xsinfo.adb exception handling workaround.

It did, so exceptions do seem to be working now. I've uploaded an
updated bootstrap compiler for armel and two new patches. The patch
kit now is:

gcc-4.4.1-arm-eabi-ada-1.patch: initial patch
gcc-4.4.1-arm-eabi-ada-2.patch: update copyright and licence text
gcc-4.4.1-arm-eabi-ada-3.patch: make SJLJ exceptions work, fix a warning
gcc-4.4.1-arm-eabi-ada-4.patch: revert xsinfo.adb dont-use-exceptions kludge

For those with a compiler based on the initial patch only, apply patches
2 and 3 first, rebuild, then apply patch 4.

/Mikael


Re: MPC version 0.8 released!

2009-11-05 Thread Mikael Pettersson
Kaveh R. Ghazi writes:
 > From: "Dennis Clarke" 
 > 
 > >>> > target GCC GMP MPFR
 > >>> >
 > >>> > sparc-sun-solaris2.11 4.1.1 4.2.1 2.3.2
 > >>> > i386-pc-solaris2.10 4.1.1 4.2.1 2.3.2
 > >>> > mips-sgi-irix6.5 3.4.5 4.3.0 2.3.2
 > >>> > alpha-dec-osf4.0f 3.4.4 4.2.1 2.3.2
 > >>> >
 > >>> > All tests passed everywhere.
 > >>>
 > >>> what about sparc-sun-solaris2.10 ? sparc-sun-solaris2.9 and 2.8 ?
 > 
 > I've done sparc-sun-solaris2.9, it passed.  See:
 > http://lists.gforge.inria.fr/pipermail/mpc-discuss/2009-November/000608.html
 > 
 > Since sparc-sun-solaris2.10 is our official "primary platform" solaris for 
 > gcc-4.5, it would be nice to have that checked as well.
 > http://gcc.gnu.org/gcc-4.5/criteria.html
 > 
 > Though I don't foresee any problems, if either of you have that box and can 
 > find time to run the test and report back, I would very much appreciate it.

sparc-sun-solaris2.10, gcc-4.4.2, gmp-4.3.1, mpfr-2.4.1, mpc-0.8
All 57 tests passed


Re: Changing the ABI

2010-01-19 Thread Mikael Pettersson
On Mon, 18 Jan 2010 13:55:16 -0500, Jean Christophe Beyler wrote:
>I have a current issue on my port. Basically the stack is defined as follows :
>
>1) CALL_USED save area
>2) Local variables
>3) Caller arguments passed on the stack
>4) 8 words to save arguments passed in registers, even if not passed
>
>Now, this was done because we have defined 8 output registers,
>therefore zone (3) is not used except if we call a function with more
>than 8 parameters.
>
>(4) is only used if we have va_arg functions that will then spill the
>8 input registers on the stack and call the va_arg function.
>
>This is done, to my understanding for our ABI, because, in the case of
>a va_arg function, we want the parameters consecutively store in
>memory.

That's indeed a desirable property.

>However, this brings me to 2 problems :
>
>1) If there are no va_arg function calls, there still is 8 words
>wasted on the stack per function call still active on the stack.
>
>2) If there is an alloca, then GCC moves the stack pointer but without
>trying to get back those 8 words or even the space for (3).
>
>
>I am currently working on not having that wasted space. I see two options:
>
>1) Change the ABI to go closer to what I have seen in the MIPS port :
>- The zone (4) is handled by the callee
>- However, I'll still have the wasted space (4) when an alloca is used no ?

Actually, zone (4) should just be deleted entirely from the ABI.
That is, all the ABI should specify is that non-register arguments
are in the caller's frame starting exactly at the stack pointer.

For non-varargs calls, there's no waste.

For varargs calls, the callee can allocate its frame and save the
incoming register arguments at the top of its own frame, establishing
the all-arguments-are-stored-consecutively property without burdening
the caller or the ABI.

An alloca() in the callee should just adjust the stack pointer and
ignore the register arguments save area at the top of the frame.
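
To make the layout concrete, here is a toy C model of it. This is only a
sketch under stated assumptions: the stack is modelled as one array of
words, and the names (stack_model, spill_register_args and so on) are
invented for illustration, not taken from any port. Words 0..7 stand for
the register save area at the top of a varargs callee's frame; the words
right above them stand for the caller's outgoing stack arguments, which
start exactly at the caller's stack pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes: 8 argument registers as in the thread, plus a few
   stack-passed arguments for the model. */
enum { NUM_ARG_REGS = 8, MAX_STACK_ARGS = 4,
       MODEL_WORDS = NUM_ARG_REGS + MAX_STACK_ARGS };

/* One contiguous piece of stack, lowest address first:
   words[0..7]  = register save area at the top of the callee's frame
   words[8..]   = caller's outgoing stack arguments (at the caller's SP) */
struct stack_model {
    long words[MODEL_WORDS];
};

/* Caller side: only the non-register arguments are stored, starting
   exactly at the caller's stack pointer. */
static void place_stack_args(struct stack_model *m, const long *args,
                             size_t count)
{
    assert(count <= MAX_STACK_ARGS);
    for (size_t i = 0; i < count; i++)
        m->words[NUM_ARG_REGS + i] = args[i];
}

/* Callee side, varargs functions only: the prologue spills the incoming
   register arguments just below the caller's stack arguments. */
static void spill_register_args(struct stack_model *m,
                                const long regs[NUM_ARG_REGS])
{
    for (size_t i = 0; i < NUM_ARG_REGS; i++)
        m->words[i] = regs[i];
}

/* After the spill, argument n is simply word n: register and stack
   arguments are consecutive in memory, which is what va_arg wants. */
static long nth_arg(const struct stack_model *m, size_t n)
{
    assert(n < MODEL_WORDS);
    return m->words[n];
}
```

A non-varargs callee never calls spill_register_args and so never needs
the eight words at all, which is where the saving comes from.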

/Mikael


Re: (un)aligned accesses on x86 platform.

2010-03-17 Thread Mikael Pettersson
On Tue, 16 Mar 2010 06:50:30 -0800,  H.J. Lu  wrote:
> 2010/3/8 Paweł Sikora :
> > hi,
> >
> > during development a cross platform application on x86 workstation
> > i've enabled an alignment checking [1] to catch possible erroneous
> > code before it appears on client's sparc/arm cpu with sigbus ;)
> >
> > it works pretty fine and catches alignment violations but Jakub Jelinek
> > had told me (on glibc bugzilla) that gcc on x86 can still dereference
> > an unaligned pointer (except for vector insns).
> > i suppose it means that gcc can emit e.g. movl for access a short int
> > (or maybe others scenarios) in some cases and violates cpu alignment rules.
> >
> > so, is it possible to instruct gcc-x86 to always use suitable loads/stores
> > like on sparc/arm?
> >
> > [1] "AC" bit - http://en.wikipedia.org/wiki/FLAGS_register_(computing)
> >
> 
> I am interested in an -mstrict-alignment option for x86.

Me too. One use I have in mind is in emulators for ISAs that require
alignment checks. Setting the AC bit would allow the emulator to replace
explicit alignment checks with the x86 host's alignment checks, which
could speed up emulation and reduce code volume (for JITs).
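
For illustration, this is roughly the kind of explicit check I mean,
written as a minimal C sketch. The guest structure, its size and the
function name are assumptions made up for the example, not code from any
particular emulator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { GUEST_MEM_SIZE = 64 };   /* hypothetical tiny guest memory */

struct guest {
    uint8_t mem[GUEST_MEM_SIZE];
    bool alignment_fault;       /* set when the guest would trap */
};

/* Emulate a guest 32-bit load on an ISA that requires natural alignment.
   The test on the low address bits is the explicit check that host-side
   AC=1 checking could make redundant. */
static uint32_t guest_load32(struct guest *g, uint32_t addr)
{
    assert(addr <= GUEST_MEM_SIZE - 4);   /* bounds check for the model */
    if (addr & 3u) {
        g->alignment_fault = true;
        return 0;
    }
    uint32_t value;
    memcpy(&value, &g->mem[addr], sizeof value);  /* host byte order */
    return value;
}
```

With AC set and the guest memory suitably aligned in the host, the branch
could go away and a misaligned host load would trap instead, but only if
the compiler never turns the memcpy or anything else into a misaligned
host access of its own.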

However, this relies on gcc and glibc to not generate too many misaligned
accesses. The glibc bit can be worked around (use a replacement libc or
offload libc accesses to a sibling thread that runs with AC=0), but gcc
does need to be instructed to not generate code with misaligned accesses.

Another use is in implementations of dynamically-typed languages like Lisp.
You can choose a tagging scheme so that CAR and CDR become simple loads with
appropriate offsets. If the tagged pointer happens to not be a CONS, you'll
get an alignment exception.
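
A small C sketch of such a tagging scheme follows. The tag values, the
8-byte word size and all names are assumptions picked for the example;
load_would_fault only models the decision the hardware check would make,
it does not set AC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lispobj;       /* one tagged 8-byte word */

/* Hypothetical tag assignment: low 3 bits are the tag, 1 means CONS. */
enum { CONS_TAG = 1, TAG_MASK = 7 };

struct cons {
    _Alignas(8) lispobj car;
    lispobj cdr;
};

static lispobj tag_cons(const struct cons *c)
{
    assert(((uintptr_t)c & TAG_MASK) == 0);   /* cells are 8-byte aligned */
    return (lispobj)(uintptr_t)c + CONS_TAG;
}

/* CAR and CDR are loads at fixed offsets that cancel the cons tag. */
static lispobj car_address(lispobj x) { return x - CONS_TAG; }
static lispobj cdr_address(lispobj x) { return x - CONS_TAG + sizeof(lispobj); }

/* Model of the alignment check for an 8-byte load. */
static bool load_would_fault(lispobj addr) { return (addr & TAG_MASK) != 0; }

static lispobj car(lispobj x)
{
    assert(!load_would_fault(car_address(x)));  /* stands in for the trap */
    return ((const struct cons *)(uintptr_t)car_address(x))->car;
}
```

With 8-byte loads and 3 tag bits, every tag other than CONS_TAG leaves
the low bits of the load address nonzero, so the type check comes for
free. With 4-byte loads a tag of 5 would slip through, so the tag
assignment has to be chosen with the word size in mind.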

/Mikael


Re: What is the best way to resolve ARM alignment issues for large modules?

2010-05-08 Thread Mikael Pettersson
Shaun Pinney writes:
 > Hello all,
 > 
 > Essentially, we have code which works fine on x86/PowerPC but fails on ARM due
 > to differences in how misaligned accesses are handled.  The failures occur in
 > multiple large modules developed outside of our team and we need to find a
 > solution.  The best question to sum this up is, how can we use the compiler to
 > arrive at a complete solution to quickly identify all code locations which
 > generate misaligned accesses and/or prevent the compiler from generating
 > misaligned accesses?  Thanks for any advice.  I'll go into more detail below.
 > 
 > ---
 > We're using an ARM9 core (ARMv5) and notice that GCC generates misaligned load
 > instructions for certain modules in our platform.  For these modules, which work
 > correctly on x86/PowerPC, the misaligned loads causes failures.  This is because
 > the ARM rounds down misaligned addresses to the correct alignment, performs the
 > memory load, and rotates the data before placing in a register.  As a result, a
 > misaligned multi-byte load instruction on ARM actually loads memory below the
 > requested address and does not load all upper bytes from "address" to "address +
 > size - 1" so it appears to these modules as incorrect data.  On x86/PowerPC,
 > loads do provide bytes from "address" to "address + size - 1" regardless of
 > alignment, so there are no problems.
 > 
 > Fixing the code manually for ARM alignment has difficulties.  Due to the large
 > code volume of these external modules, it is difficult to identify all locations
 > which may be affected by misaligned accesses so the code can be rewritten.
 > Currently, the only way to detect these issues is to use -Wcast-align and view
 > the output to get a list of potential alignment issues.  This appears to list a
 > large number of false positives so sorting through and doing code investigation
 > to locate true problems looks very time-consuming.  On the runtime side, we've
 > enabled alignment exceptions to catch some additional cases, but the problem is
 > that exceptions are only thrown for running code.  There is always the chance
 > there is some more unexecuted 'hidden' code waiting to fail when the right
 > circumstance occurs.  I'd like to provably remove the problem entirely and
 > quickly.
 > 
 > One idea, to guarantee no load/store alignment problems will affect our product,
 > was to force the compiler to generate single byte load/store instructions in
 > place of multi byte load/store instructions when the alignment cannot be
 > verified by the compiler.  Such as, for pointer typecasts where the alignment is
 > increased (e.g. char * to int *), accesses to misaligned fields of packed data
 > structures, accesses to structure fields not allocated on the stack, etc.  Is
 > this available?  Obviously, this will add performance overhead, but would
 > clearly resolve the issue for affected modules.
 > 
 > Does the ARM compiler provide any other techniques to help with these types of
 > problems?  It'd be very helpful to find a fast and complete way to do this work.
 > Thanks!
 > 
 > Thanks again for your advice.
 > 
 > Best regards,
 > Shaun
 > 
 > BTW - our ARM also allows us to change the behavior of multi-byte load/store
 > instructions so they read from 'address' to 'address + size - 1'.  However, our
 > OS, indicates that it intentionally uses misaligned loads/stores, so changing
 > the ARM's load/store behavior to fix the module alignment problems would break
 > the OS in unknown places.  Also, because of this we cannot permanently enable
 > alignment exceptions either.  I plan to discuss this more with our OS vendor.

You don't name the platform OS but the obvious solution (to me anyway) is to run
the code on ARM/Linux. On that platform you can instruct the kernel to take various
actions on alignment faults. In particular, by

> echo 5 > /proc/cpu/alignment

you tell the kernel to log misalignment traps and then kill the offending process.

So you:

1. Run the application. It gets killed.
2. Retrieve the fault PC from the kernel message log.
3. Map it back to the application source. Fix the problem or add debugging code.
4. Repeat from step 1 until all alignment faults have been eliminated.

You can also instruct the kernel to (correctly) handle and emulate misaligned
loads/stores without killing the process. That allows you to run the code correctly,
though the fault handling will induce some performance overhead.

If you can't run Linux on your target HW then you could do the debugging in an
ARM emulator such as QEMU.


4.3 weekly snapshots bot broken?

2009-06-17 Thread Mikael Pettersson
It seems the bot or whatever that generates the weekly snapshots
has stopped working for the 4.3 branch. I would have expected a
new snapshot 2-3 days ago but found nothing on the mirrors.
(And there have been commits since the last snapshot.)

/Mikael


[PATCH,libstdc++] unwind-cxx.h: correct prototypes for ARM EH routines (PR libstdc++/44902)

2010-07-20 Thread Mikael Pettersson
The prototypes for two ARM EH routines don't match their actual
definitions in eh_arm.cc, resulting in build-time warnings.  When
-Werror is active, the build fails.  See PR44902.

Fixed simply by updating the prototypes to match the definitions.

Tested with crosses to arm-eabi and arm-linux-gnueabi, and with
a native bootstrap and regtest on arm-linux-gnueabi.

Ok for 4.6? 4.5? (I don't have svn write access.)

libstdc++-v3/

2010-07-20  Mikael Pettersson  

PR libstdc++/44902
* libsupc++/unwind-cxx.h (__cxa_type_match): Correct prototype.
(__cxa_begin_cleanup): Likewise.

--- gcc-4.6-20100717/libstdc++-v3/libsupc++/unwind-cxx.h.~1~ 2009-05-03 18:51:50.0 +0200
+++ gcc-4.6-20100717/libstdc++-v3/libsupc++/unwind-cxx.h 2010-07-20 11:18:42.0 +0200
@@ -196,9 +196,9 @@ typedef enum {
   ctm_succeeded = 1,
   ctm_succeeded_with_ptr_to_base = 2
 } __cxa_type_match_result;
-extern "C" bool __cxa_type_match(_Unwind_Exception*, const std::type_info*,
-bool, void**);
-extern "C" void __cxa_begin_cleanup (_Unwind_Exception*);
+extern "C" __cxa_type_match_result __cxa_type_match(_Unwind_Exception*, const std::type_info*,
+   bool, void**);
+extern "C" bool __cxa_begin_cleanup (_Unwind_Exception*);
 extern "C" void __cxa_end_cleanup (void);
 #endif
 


Re: [PATCH,libstdc++] unwind-cxx.h: correct prototypes for ARM EH routines (PR libstdc++/44902)

2010-07-20 Thread Mikael Pettersson
Mikael Pettersson writes:
 > The prototypes for two ARM EH routines don't match their actual
 > definitions in eh_arm.cc, resulting in build-time warnings.  When
 > -Werror is active, the build fails.  See PR44902.
 > 
 > Fixed simply by updating the prototypes to match the definitions.
 > 
 > Tested with crosses to arm-eabi and arm-linux-gnueabi, and with
 > a native bootstrap and regtest on arm-linux-gnueabi.
 > 
 > Ok for 4.6? 4.5? (I don't have svn write access.)
 > 
 > libstdc++-v3/
 > 
 > 2010-07-20  Mikael Pettersson  
 > 
 >  PR libstdc++/44902
 >  * libsupc++/unwind-cxx.h (__cxa_type_match): Correct prototype.
 >  (__cxa_begin_cleanup): Likewise.

Please ignore, I mistakenly cc:d gcc not gcc-patches.


 > 
 > --- gcc-4.6-20100717/libstdc++-v3/libsupc++/unwind-cxx.h.~1~ 2009-05-03 18:51:50.0 +0200
 > +++ gcc-4.6-20100717/libstdc++-v3/libsupc++/unwind-cxx.h 2010-07-20 11:18:42.0 +0200
 > @@ -196,9 +196,9 @@ typedef enum {
 >ctm_succeeded = 1,
 >ctm_succeeded_with_ptr_to_base = 2
 >  } __cxa_type_match_result;
 > -extern "C" bool __cxa_type_match(_Unwind_Exception*, const std::type_info*,
 > - bool, void**);
 > -extern "C" void __cxa_begin_cleanup (_Unwind_Exception*);
 > +extern "C" __cxa_type_match_result __cxa_type_match(_Unwind_Exception*, const std::type_info*,
 > +bool, void**);
 > +extern "C" bool __cxa_begin_cleanup (_Unwind_Exception*);
 >  extern "C" void __cxa_end_cleanup (void);
 >  #endif
 >  


Re: Triplet for ARM Linux HardFP ABI, again

2011-02-22 Thread Mikael Pettersson
Guillem Jover writes:
 > On Mon, 2011-02-21 at 17:59:06 +, Joseph S. Myers wrote:
 > > On Mon, 21 Feb 2011, Guillem Jover wrote:
 > > > if you'd consider accepting something ressembling the attached patch
 > > 
 > > A pre-existing condition, but in general where the code you're changing 
 > > hardcodes "gnu" that's wrong - arm*-*-linux-uclibceabi is also meant to be 
 > > valid.  So if you allow a suffix here, the general form to accept 
 > > consistently would be arm*-*-linux-*eabi*.
 > 
 > Ok, so something like the attached then (again completely untested)?
 > 
 > I've changed the ada part to just match on arm% linux% in the same way
 > the other targets do, as there didn't seem anything GNU EABI specific
 > in commit 8f0372dd2b828c0a0ee05dee4496a021da9cee40 (r155808).

Incorrect, the ARM Ada support (which I contributed) is emphatically
only for linux-gnueabi.  Ada on OABI is known to have non-trivial
problems (or did last time I bootstrapped it before gcc-4.4), so that
combination is unsupported.  Besides, OABI is obsolete.


Re: gcc tricore porting

2023-06-19 Thread Mikael Pettersson via Gcc
(Note I'm reading the gcc mailing list via the Web archives, which
doesn't let me create "proper" replies. Oh well.)

On Sun Jun 18 09:58:56 GMT 2023,  wrote:
> Hi, this is my first time with open source development. I worked in
> automotive for 22 years and we (generally) were using tricore series for
> these products. GCC doesn't compile on that platform. I left my work some
> days ago and so I'll have some spare time in the next few months. I would
> like to know how difficult it is to port the tricore platform on gcc and if
> during this process somebody can support me as tutor and... also if the gcc
> team is interested in this item...

https://github.com/volumit has a port of gcc + binutils + newlib + gdb
to Tricore, and it's not _that_ ancient. I have no idea where it originates
from or how complete it is, but I do know the gcc-4.9.4 based one builds
with some tweaks.

I don't know anything more about it, I'm just a collector of cross-compilers for
obscure / lost / forgotten / abandoned targets.

/Mikael


How to target a processor with very primitive addressing modes?

2024-06-06 Thread Mikael Pettersson via Gcc
I'm working on a GCC backend for an older processor with very
primitive addressing support. The only way to load from or store to
memory is to compute the address into a register, and then use that
register in the load or store instruction. There are no immediate or
register offsets in memory accesses.

So what I need is for GCC to transform any

y = *(r + x);

into

t = r + x;
y = *t;

and similarly for stores. The problem is that however I try to model
that (rewrites in define_expands, patterns in the define_insns, or
extra predicates and constraint letters) GCC goes into reload loops. I
suspect that reloading assumes it can use SP+offset addressing modes
without needing new temps.

My current workaround is to reserve a register as a hidden spill temp,
and invoke code from the insn patterns to change the output templates
to the equivalent of the above when needed, but it's not a good
solution.
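
To pin down what I'm after, here is a toy model in plain C. It is not
GCC code: the address struct, the machine struct and the toy_* functions
are all made up for illustration, and only loosely mirror the idea of an
address-validity predicate plus a rewrite that computes the address into
a temporary first.

```c
#include <assert.h>
#include <stdbool.h>

/* Two address shapes: a bare register, or register plus offset. */
enum addr_kind { ADDR_REG, ADDR_REG_PLUS_OFFSET };

struct address {
    enum addr_kind kind;
    int base_reg;
    int offset;
};

/* Tiny hypothetical machine: 16 registers, 64 words of memory. */
struct machine {
    int regs[16];
    int mem[64];
};

/* On this machine only a bare register is a valid memory address. */
static bool toy_legitimate_address_p(const struct address *a)
{
    return a->kind == ADDR_REG;
}

/* Rewrite *(r + x) into "t = r + x; *t" using the given temp register. */
static struct address toy_legitimize(struct machine *m, struct address a,
                                     int temp_reg)
{
    if (toy_legitimate_address_p(&a))
        return a;
    m->regs[temp_reg] = m->regs[a.base_reg] + a.offset;   /* t = r + x */
    return (struct address){ ADDR_REG, temp_reg, 0 };
}

/* The load itself accepts only the legitimate form. */
static int toy_load(const struct machine *m, const struct address *a)
{
    assert(toy_legitimate_address_p(a));
    int addr = m->regs[a->base_reg];
    assert(addr >= 0 && addr < 64);
    return m->mem[addr];                                  /* y = *t */
}
```

The hard part in the real backend is exactly what the model glosses
over: the temp register has to come from somewhere, including late in
reload when a spill slot address needs the same treatment.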

Are there any targets in current GCC with similar constraints?

Do you have any other recommendations on how to deal with these
constraints in GCC?

Thanks in advance,

/Mikael

(p.s. please CC me on any replies as I'm not subscribed to this mailing list)


Re: How to target a processor with very primitive addressing modes?

2024-06-08 Thread Mikael Pettersson via Gcc
On Thu, Jun 6, 2024 at 8:59 PM Dimitar Dimitrov  wrote:
> Have you tried defining TARGET_LEGITIMIZE_ADDRESS for your target? From
> a quick search I see that the iq2000 and rx backends are rewriting some
> PLUS expression addresses with insn sequence to calculate the address.

I have partial success.

The key was to define both TARGET_LEGITIMATE_ADDRESS_P and an
addptr3 insn.

I had tried TARGET_LEGITIMATE_ADDRESS_P before, together with various
combinations of TARGET_LEGITIMIZE_ADDRESS and
LEGITIMIZE_RELOAD_ADDRESS, but they all threw gcc into reload loops.

My add3 insn clobbers the CC register. The docs say to define
addptr3 in this case, and that eliminated the reload loops.

The issue now is that the machine cannot perform an add without
clobbering the CC register, so I'll have to hide that somehow. When
emitting the asm code, can one check if the CC register is LIVE-OUT
from the insn? If it isn't I shouldn't have to generate code to
preserve it.

/Mikael


Re: How to target a processor with very primitive addressing modes?

2024-06-18 Thread Mikael Pettersson via Gcc
On Sat, Jun 8, 2024 at 5:11 PM Jeff Law  wrote:
>
>
>
> On 6/8/24 3:32 AM, Mikael Pettersson via Gcc wrote:
> > On Thu, Jun 6, 2024 at 8:59 PM Dimitar Dimitrov  wrote:
> >> Have you tried defining TARGET_LEGITIMIZE_ADDRESS for your target? From
> >> a quick search I see that the iq2000 and rx backends are rewriting some
> >> PLUS expression addresses with insn sequence to calculate the address.
> >
> > I have partial success.
> >
> > The key was to define both TARGET_LEGITIMATE_ADDRESS_P and an
> > addptr3 insn.
> If it doesn't work without TARGET_LEGITIMATE_ADDRESS, then it's wrong.
>
> At the highest level that hook is meant to provide a way for the target
> to adjust addresses to optimize them better.  If you're using it for
> correctness purposes, it's ultimately going to fail in one way or another.

My target is the RCA CDP1802. 8-bit bytes, 16-bit address space,
sixteen 16-bit address registers, one 8-bit accumulator through which
all data movement and arithmetic happens, one 1-bit carry/borrow flag,
and two 4-bit register selector registers, one of which selects which
address register is PC, the other which register is used with the
"stack push" instruction and a few others. It's an odd and very
primitive beast.

My first attempt was just following the gcc internals docs and
extrapolating from other targets, but that failed with reload loops as
soon as it started building libgcc. So I scrapped that and took the
stormy16 port, simplified and tweaked it to the point where it
semantically matched my target, and then changed the rest over to be
CDP1802-specific. This works well enough to build libgcc including
soft-fp.

The main omissions are frame layout which still is for the stormy16,
and target-specific support routines in libgcc for things that are
just too awkward to generate code for.

The code can be found on github, if anyone is interested:

> git clone -b cdp1802 https://github.com/mikpe/binutils-gdb.git
> git clone -b cdp1802 https://github.com/mikpe/gcc.git

The target is cdp1802-unknown-elf , and gcc also wants --with-newlib
--enable-languages=c --disable-libssp.

Pending work includes:
- fixing the frame layout
- support for targeting the CDP1804/05/06, which add some register
  data movement operations that don't need to go via the 8-bit
  accumulator
- seeing if I can get rid of addptrhi3
- seeing if I can expose the register selector register X so that gcc
can eliminate redundant assignments to it
- update the simulator to run ELF executables not just Intel HEX files
- run the test suite

/Mikael