Re: proposal to make SIZE_TYPE more flexible

2014-01-28 Thread DJ Delorie

Ping?  Or do I need to repost on the patches list?

http://gcc.gnu.org/ml/gcc/2014-01/msg00130.html


Re: proposal to make SIZE_TYPE more flexible

2014-01-28 Thread DJ Delorie

> Repost on the patches list (with self-contained write-up, rationale for 
> choices made, etc.) at the start of stage 1 for 4.10/5.0,

Ok.

> I suggest (this clearly isn't stage 3 material).

Yup.  Would be nice to back port it to 4.9 later, but... understood.


Re: MSP430 in gcc4.9 ... enable interrupts?

2014-02-14 Thread DJ Delorie

The constructs in the *.md files are for the compiler's internal use
(i.e. there are function attributes that trigger those).  You don't
need compiler support for these opcodes at the user level; the right
way is to implement those builtins as inline assembler in a common
header file:

static inline __attribute__((always_inline))
void __nop()
{
  asm volatile ("NOP");
}

static inline __attribute__((always_inline))
void __eint()
{
  asm volatile ("EINT");
}


Or more simply:

#define __eint() asm("EINT")
#define __nop() asm("NOP")


For opcodes with parameters, you use a more complex form of inline
assembler:

static inline __attribute__((always_inline))
void BIC_SR(const int x)
{
  asm volatile ("BIC.W %0,R2" :: "i" (x));
}


Re: MSP430 in gcc4.9 ... enable interrupts?

2014-02-17 Thread DJ Delorie

> I presume these will be part of the headers for the library
> distributed for msp430 gcc by TI/Redhat?

I can't speak for TI's or Red Hat's plans.  GNU's typical non-custom
embedded runtime is newlib/libgloss, which usually doesn't have that
much in the way of chip-specific headers or library functions.

> is that for the "critical" attribute that exists in the old msp430
> port (which disables interrupts for the duration of the function)?

Yes, for things like that.  They're documented under "Function
Attributes" in the "Extensions to the C Language Family" chapter of
the current GCC manual.


Re: [RL78] Questions about code-generation

2014-03-10 Thread DJ Delorie

> I've managed to build GCC myself so that I could experiment a bit
> but as this is my first foray into compiler internals, I'm
> struggling to work out how things fit together and what affects
> what.

The key thing to know about the RL78 backend is that it has two
"targets" it uses.  For the first part of the compilation, up until
after reload, the model uses 16 virtual registers (R8 through R23) and
a virtual machine to give gcc an orthogonal model that it can generate
code for.  After reload, there's a "devirtualization" pass in the RL78
backend that maps the virtual model to the real model (R0 through R7),
which means copying values in and out of the real registers according
to which addressing modes are needed.  Then GCC continues optimizing,
which gets rid of most of the unneeded instructions.

The problem you're probably running into is that deciding which real
registers to use for each virtual one is a very tricky task, and the
post-reload optimizers aren't expecting the code to look like what it
does.

> What causes that code to be generated when using a variable instead
> of a fixed memory address?

The use of "volatile" disables many of GCC's optimizations.  I
consider this a bug in GCC, but at the moment it needs to be "fixed"
in the backends on a case-by-case basis.


Re: [RL78] Questions about code-generation

2014-03-10 Thread DJ Delorie

> Ah, that certainly explains a lot.  How exactly would the fixing be 
> done?  Is there an example I could look at for one of the other processors?

No, RL78 is the first that uses this scheme.

> I calculated a week or two ago that we could make a code-saving of 
> around 8% by using near or relative branches and near calls instead of 
> always generating far calls.  I changed rl78-real.md to use near 
> addressing and got about 5%.  That's probably about right.  I tried to 
> generate relative branches too but I'm guessing that the 'length' 
> attribute needs to be set for all instructions to get that working properly.

Or the linker could be taught to optimize branches once it knows the
full displacement, but that can be even trickier to get right.


Re: [RL78] Questions about code-generation

2014-03-10 Thread DJ Delorie

> I'm curious.  Have you tried out other approaches before you decided
> to go with the virtual registers?

Yes.  Getting GCC to understand the "unusual" addressing modes the
RL78 uses was too much for the register allocator to handle.  Even
when the addressing modes are limited to "usual" ones, GCC doesn't
have a good way to do regalloc and reload when there are limits on
what registers you can use in an address expression, and it's worse
when there are dependencies between operands, or limited numbers of
address registers.


Re: Legitimize address after reload

2014-03-14 Thread DJ Delorie

David Guillen  writes:
> In any case I'm not using the restrict variable and I'm assuming
> strict is zero, this is, not checking the hard registers themselves.
> This is because any reg is OK for base reg. I'm pretty sure I'm
> behaving similarly to arm, cris or x86 backends.

"strict" doesn't mean which hard register it is, "strict" means whether
or not it's a hard register at all.

If "strict" is true, you must assume any REG which isn't a real hard
register (i.e. REGNO >= FIRST_PSEUDO_REGISTER) does NOT match.
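
A minimal sketch of the rule, not taken from any real backend: the
function name reg_ok_for_base_p and the FIRST_PSEUDO_REGISTER value of
32 are assumptions for illustration, and the imaginary target accepts
every hard register as a base.

```c
#include <stdbool.h>

/* Hypothetical value for illustration; each real target defines its own
   FIRST_PSEUDO_REGISTER in its target header.  */
#define FIRST_PSEUDO_REGISTER 32

/* Return true if register number REGNO is acceptable as a base register.
   On this imaginary target every hard register is a valid base.  */
static bool
reg_ok_for_base_p (unsigned int regno, bool strict)
{
  bool is_hard_reg = regno < FIRST_PSEUDO_REGISTER;

  if (strict)
    /* Strict: a pseudo is not a hard register at all, so it must not
       match.  */
    return is_hard_reg;

  /* Non-strict: a pseudo may still end up in a suitable hard register,
     so accept it for now.  */
  return true;
}
```

With these assumptions, register 40 (a pseudo) is accepted when strict
is false and rejected when strict is true, while register 3 is
accepted either way.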


Re: [RL78] Questions about code-generation

2014-03-16 Thread DJ Delorie

> Maybe we should add a target hook/macro to control this to avoid
> duplicated code of 'general_operand' in various places?

Even in the msp430, not all patterns can be safely used with volatile
MEMs (i.e. the macro patterns).  So, not all general_operand's were
replaced.


Re: [RL78] Questions about code-generation

2014-03-16 Thread DJ Delorie

This is similar to what I had to do for msp430 - I made a new
constraint that was what general_operand would have done if it allowed
volatile MEMs, and used that for instructions where a volatile's
volatileness wouldn't be broken.


Re: [RL78] Questions about code-generation

2014-03-21 Thread DJ Delorie

> Is it possible that the virtual pass causes inefficiencies in some
> cases by sticking with r8-r31 when one of the 'normal' registers
> would be better?

That's not a fair question to ask, since the virtual pass can *only*
use r8-r31.  The first bank has to be left alone else the
devirtualizer becomes a few orders of magnitude harder, if not
impossible, to make work correctly.

> In some cases, the normal optimization steps remove a lot, if not all, 
> of the unnecessary register passing, but not always.

I've found that "removing unneeded moves through registers" is
something gcc does poorly in the post-reload optimizers.  I've written
my own on some occasions (for rl78 too).  Perhaps this is a good
starting point to look at?

> much needless copying, which strengthens my suspicion that it's 
> something in the RL78 backend that needs 'tweaking'.

Of course it is, I've said that before I think.  The RL78 uses a
virtual model until reload, then converts each virtual instruction
into multiple real instructions, then optimizes the result.  This is
going to be worse than if the real model had been used throughout
(like arm or x86), but in this case, the real model *can't* be used
throughout, because gcc can't understand it well enough to get through
regalloc and reload.  The RL78 is just too "weird" to be modelled
as-is.

I keep hoping that gcc's own post-reload optimizers would do a better
job, though.  Combine should be able to combine, for example, the "mov
r8,ax; cmp r8,#4" types of insns together.


Re: Testing machine descriptions

2014-03-27 Thread DJ Delorie

I've thought about making a dejagnu testsuite specifically for helping
with new ports, which would mean lots of md-specific tests, but
really, the main testsuite probably covers everything you'd need to
test.  All patches are supposed to be regression tested anyway, which
means running the full dejagnu testsuite before and after your change,
to make sure you didn't break anything.


Re: Testing machine descriptions

2014-03-27 Thread DJ Delorie

The main testsuite doesn't have tests specifically to cover all the md
entries.  What I meant was, I suspect it covers enough plain C test
cases to happen to use all the usual md entries.

Since each target has different md entries (both "which are used" and
"how each is used"), it would be nearly impossible to write an md
testsuite.  If you have target-specific patterns you want to test,
you'd have to add a target-specific testsuite for them.


Re: Testing machine descriptions

2014-03-27 Thread DJ Delorie

Is there some way to get insight into which alternatives get used
during a coverage run?


Re: RL78 sim?

2014-03-31 Thread DJ Delorie

> So far I've been testing with hardware but I'm pretty sure I read 
> somewhere about an RL78 simulator, which would be a useful addition. 
> Does this simulator exist, and if so, how do I run the tests against it?

The simulator is part of the GDB build.

> I tried 'make -k check RUNTESTFLAGS="--target_board=rl78-sim"' but in 
> amongst the errors I see 'ERROR: couldn't load description file for 
> rl78-sim', either it has a different name or I'm missing something on my 
> system (and a quick search didn't seem to find anything but I don't 
> really know what I'm looking for).

You'll need something like this in your local ${DEJAGNU} file:

{ "rl78*-*" } {
set boards_dir "/home/dj/dejagnu/baseboards"
set target_list { rl78-sim }
}

Here's my rl78-sim.exp for dejagnu (it goes in whatever directory you
specified above):

# This is a list of toolchains that are supported on this board.
set_board_info target_install {rl78-elf}

# Load the generic configuration for this board. This will define a basic set
# of routines needed by the tool to communicate with the board.
load_generic_config "sim"

# basic-sim.exp is a basic description for the standard Cygnus simulator.
load_base_board_description "basic-sim"

# "rl78" is the name of the sim subdir.
setup_sim rl78

# No multilib options needed by default.
process_multilib_options ""

# We only support newlib on this target. We assume that all multilib
# options have been specified before we get here.

set_board_info compiler  "[find_gcc]"
set_board_info cflags"[libgloss_include_flags] [newlib_include_flags] -msim"
set_board_info ldflags   "[libgloss_link_flags] [newlib_link_flags]"

# Doesn't pass arguments or signals, can't return results, and doesn't
# do inferiorio.
set_board_info noargs 1
set_board_info gdb,nosignals 1
set_board_info gdb,noresults 1
set_board_info gdb,noinferiorio 1

# Limit the stack size to something real tiny.
set_board_info gcc,stack_size 4096

set_board_info gcc,timeout 300


stack-protection vs alloca vs dwarf2

2014-04-16 Thread DJ Delorie

While debugging some gdb-related FAILs, I discovered that gcc's
-fstack-check option effectively calls alloca() to adjust the stack
pointer.

However, it doesn't mark the stack adjustment as FRAME_RELATED even
when it's setting up the local variables for the function.

In the case of rx-elf, for this testcase, the CFA for the function is
defined in terms of the stack pointer - and thus is incorrect after
the alloca call.

My question is: whose fault is this?  Should alloca() tell the debug
stuff that the stack pointer has changed?  Should it tell it to not
use $sp at all?  Should the debug stuff "just know" that $sp isn't a
valid choice for the CFA?

The testcase from gdb is pretty simple:

  void medium_frame ()
  {
    char S [16384];
    small_frame ();
  }


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> Presumably the rx back-end and more precisely TARGET_FRAME_POINTER_REQUIRED, 
> which needs to return true if cfun->calls_alloca.

The rx back-end doesn't define TARGET_FRAME_POINTER_REQUIRED, as the
documentation says the compiler handles target-independent reasons why
there needs to be a frame pointer.  But, the default
TARGET_FRAME_POINTER_REQUIRED just returns false - shouldn't it, by
default, check for calls_alloca ?

Also, I added that hook and set it to return true always, and it
didn't fix the bug.  There is a frame pointer (there was before, too),
but there's also a stack adjustment after the pseudo-alloca which the
dwarf2 stuff doesn't know about.  The last stack adjustment it sees is
the rx backend's adjustment to allocate the frame:

_medium_frame:
        pushm   r6-r12
        add     #-4, r0, r6     ; marked frame-related (fp = sp - 4)
        mov.L   r6, r0          ; marked frame-related (sp = fp)
        . . .                   ; stack checking code goes here
        add     #0xc000, r0     ; not marked frame-related

 <_medium_frame>:
   0:   6e 6c   pushm   r6-r12
   2:   71 06 fc        add     #-4, r0, r6
   5:   ef 60   mov.l   r6, r0
   7:

  2e:   72 00 00 c0 add #0xc000, r0, r0

0014 0030  FDE cie= pc=..0043
  DW_CFA_advance_loc4: 2 to 0002
  DW_CFA_def_cfa_offset: 32
  DW_CFA_offset: r12 at cfa-8
  . . .
  DW_CFA_offset: r6 at cfa-32
  DW_CFA_advance_loc4: 3 to 0005
  DW_CFA_def_cfa: r6 ofs 36
  DW_CFA_advance_loc4: 2 to 0007
  DW_CFA_def_cfa_register: r0
  ( that's it for debug info )


Perhaps the stack-check code should set FRAME_RELATED on any stack
adjustment insn?
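
The arithmetic behind the bad CFA can be checked with a toy model of
the prologue listed above.  This is only a sketch: the struct and
function names are made up, the entry stack pointer is arbitrary, and
it assumes the #0xc000 immediate is sign-extended to -16384 (matching
the 16384-byte array in the testcase).

```c
#include <stdint.h>

/* Toy model of the two registers involved in the prologue.  */
struct frame_model
{
  uint32_t sp;   /* r0 */
  uint32_t fp;   /* r6 */
};

/* CFA as computed from the last CFI rule in the FDE: r0 + 36.  */
static uint32_t
cfa_from_sp (const struct frame_model *m)
{
  return m->sp + 36;
}

/* CFA computed from the frame pointer instead: r6 + 36.  */
static uint32_t
cfa_from_fp (const struct frame_model *m)
{
  return m->fp + 36;
}

/* Run the prologue.  ENTRY_SP is the stack pointer at function entry,
   where the CIE says CFA = r0 + 4.  */
static struct frame_model
run_prologue (uint32_t entry_sp)
{
  struct frame_model m;
  m.sp = entry_sp - 28;   /* pushm r6-r12: seven 4-byte registers */
  m.fp = m.sp - 4;        /* add #-4, r0, r6 */
  m.sp = m.fp;            /* mov.L r6, r0 */
  m.sp -= 16384;          /* add #0xc000, r0 (assumed sign-extended) */
  return m;
}
```

Starting from an entry stack pointer of 0x20000, the true CFA is
0x20004.  After the last add, r6 + 36 still gives 0x20004, while
r0 + 36 is off by exactly the 16384 bytes that the unmarked adjustment
subtracted.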


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> I gather that r0 is the stack pointer and r6 the frame pointer?

Yes.

> > 0014 0030  FDE cie= pc=..0043
> >   DW_CFA_advance_loc4: 2 to 0002
> >   DW_CFA_def_cfa_offset: 32
> >   DW_CFA_offset: r12 at cfa-8
> >   . . .
> >   DW_CFA_offset: r6 at cfa-32
> >   DW_CFA_advance_loc4: 3 to 0005
> >   DW_CFA_def_cfa: r6 ofs 36
> >   DW_CFA_advance_loc4: 2 to 0007
> >   DW_CFA_def_cfa_register: r0
> >   ( that's it for debug info )
> 
> If so, the above DW_CFA_def_cfa_register doesn't make sense, it should be r6 
> once the frame is established.  What does the CIE contain exactly?


 0010  CIE
  Version:   3
  Augmentation:  ""
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 17

  DW_CFA_def_cfa: r0 ofs 4
  DW_CFA_offset: r17 at cfa-4
  DW_CFA_nop
  DW_CFA_nop

0014 0030  FDE cie= pc=..0043
  DW_CFA_advance_loc4: 2 to 0002
  DW_CFA_def_cfa_offset: 32
  DW_CFA_offset: r12 at cfa-8
  DW_CFA_offset: r11 at cfa-12
  DW_CFA_offset: r10 at cfa-16
  DW_CFA_offset: r9 at cfa-20
  DW_CFA_offset: r8 at cfa-24
  DW_CFA_offset: r7 at cfa-28
  DW_CFA_offset: r6 at cfa-32
  DW_CFA_advance_loc4: 3 to 0005
  DW_CFA_def_cfa: r6 ofs 36
  DW_CFA_advance_loc4: 2 to 0007
  DW_CFA_def_cfa_register: r0


> > Perhaps the stack-check code should set FRAME_RELATED on any stack
> > adjustment insn?
> 
> No, the design is that stack checking or alloca force the use of the frame 
> pointer, which thus becomes the CFA register, which means that subsequent 
> stack adjustments are irrelevant for the CFI.

Does the backend have to *not* mark further changes to the stack
pointer in the prologue as frame related, if the function calls
alloca?  This is the RTL that expand_prologue() is emitting:

(insn/f 42 5 43 2 (parallel [
(set/f (reg/f:SI 0 r0)
(minus:SI (reg/f:SI 0 r0)
(const_int 28 [0x1c])))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 4 [0x4])) [0  S4 A8])
(reg:SI 12 r12))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 8 [0x8])) [0  S4 A8])
(reg:SI 11 r11))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 12 [0xc])) [0  S4 A8])
(reg:SI 10 r10))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 16 [0x10])) [0  S4 A8])
(reg:SI 9 r9))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 20 [0x14])) [0  S4 A8])
(reg:SI 8 r8))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 24 [0x18])) [0  S4 A8])
(reg/f:SI 7 r7))
(set/f (mem:SI (minus:SI (reg/f:SI 0 r0)
(const_int 28 [0x1c])) [0  S4 A8])
(reg/f:SI 6 r6))
]) dj.c:2 -1
 (nil))

(insn/f 43 42 44 2 (parallel [
(set (reg/f:SI 6 r6)
(plus:SI (reg/f:SI 0 r0)
(const_int -4 [0xfffc])))
(clobber (reg:CC 16 cc))
]) dj.c:2 -1
 (nil))

(insn/f 44 43 45 2 (set (reg/f:SI 0 r0)
(reg/f:SI 6 r6)) dj.c:2 -1
 (nil))


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> The "mov.L r6, r0" instruction must never be marked as frame-related, for any 
> function.

Is this documented somewhere?


Re: stack-protection vs alloca vs dwarf2

2014-04-17 Thread DJ Delorie

> The "mov.L r6, r0" instruction must never be marked as frame-related, for any 
> function.

Also, is that rule true if we *don't* have a frame pointer?  That is,
when we add a constant to the stack pointer to allocate the frame,
should that insn be marked as frame-related?  Or is it just the fp->sp
move (or potentially an add, if there's outgoing args) that shouldn't
be marked?


question about GTY macro

2014-05-07 Thread DJ Delorie

Given this in tree.h:

  struct int_n_trees_t {
    tree signed_type;
    tree unsigned_type;
  };

  extern struct int_n_trees_t int_n_trees[NUM_INT_N_ENTS];

And this in tree.c:

  struct int_n_trees_t int_n_trees [NUM_INT_N_ENTS];

What is the right way to mark these for garbage collection?

I can't seem to get int_n_trees[] to show up in any of the gc-related
generated files.

I need the int_n_trees[] trees to be locked into memory, but I see
signs that they're being reclaimed instead.


Re: question about GTY macro

2014-05-09 Thread DJ Delorie

> Likewise.  See how global_trees is marked for example.  But likely
> you forgot to mark struct int_n_trees_t to be considered for GC.

I did remember int_n_trees_t, but it seems to be working now, so who
knows what I was doing wrong :-P

Thanks!
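
For reference, a sketch of how such an array is usually marked,
following the global_trees pattern mentioned in the quoted reply.  The
GTY, tree, and NUM_INT_N_ENTS definitions at the top are stand-ins so
the fragment compiles outside the GCC tree; this is not necessarily
the exact code that ended up working.

```c
/* Stand-ins so the sketch compiles on its own.  In the compiler proper
   GTY also expands to nothing; gengtype reads the annotations from the
   source text.  */
#define GTY(x)
typedef struct tree_node *tree;
#define NUM_INT_N_ENTS 1   /* hypothetical count for illustration */

/* tree.h: mark the struct type so gengtype knows how to walk it.  */
struct GTY(()) int_n_trees_t {
  tree signed_type;
  tree unsigned_type;
};

/* tree.h: mark the extern declaration so the array becomes a GC root.  */
extern GTY(()) struct int_n_trees_t int_n_trees[NUM_INT_N_ENTS];

/* tree.c: the definition itself.  */
struct int_n_trees_t int_n_trees[NUM_INT_N_ENTS];
```

Both annotations matter in this sketch: the one on the struct
describes the fields to walk, and the one on the declaration registers
the array as a root so its trees are not reclaimed.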


Re: reverse bitfield patch

2014-07-01 Thread DJ Delorie

Revisiting an old thread, as I still want to get this feature in...

https://gcc.gnu.org/ml/gcc/2012-10/msg00099.html

> >> Why do you need to change varasm.c at all?  The hunks seem to be
> >> completely separate of the attribute.
> >
> > Because static constructors have fields in the original order, not the
> > reversed order.  Otherwise code like this is miscompiled:
> 
> Err - the struct also has fields in the original order - only the bit 
> positions
> of the fields are different because of the layouting option.

The order of the field decls in the type (stor-layout.c) is not
changed, only the bit position information.  The order here *can't* be
changed, because the C language assumes that parameters, initializers,
etc are presented in the same order as the original declaration,
regardless of the target-specific layout.

When the program includes an initializer:

> > struct foo a = { 1, 2, 3 };

The order of 1, 2, and 3 needs to correspond to the order of the
bitfields in 'a', so we can change neither the order of the bitfields
in 'a' nor the order of constructor fields.

However, when we stream the initializer out to the .S file, we need to
pack the bitfields in the right sequence to generate the right bit
patterns in the final output image.  The code in varasm.c exists to
make sure that the initializers for bitfields are written/packed in
the correct order, to correspond to the bitfield positions.  I.e.  the
1,2,3 initializer needs to be written to the .S file as either 0x0123
or 0x3210 depending on the bit positions.

In neither case do we change the order of the fields in the type
itself, i.e. the array/chain order.
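
A small sketch of that separation, assuming 4-bit fields packed into a
single 16-bit unit (both sizes, and the function name pack_nibbles,
are made up for illustration; this is not the varasm.c code).  The
values always arrive in declaration order; only the bit position
given to each one changes.

```c
#include <stdint.h>
#include <stddef.h>

/* Pack up to four 4-bit values into one 16-bit unit.  VALS is in
   declaration order.  If FIRST_FIELD_AT_LSB is nonzero the first field
   occupies the lowest bits, otherwise the highest bits.  */
static uint16_t
pack_nibbles (const uint8_t *vals, size_t n, int first_field_at_lsb)
{
  uint16_t unit = 0;
  for (size_t i = 0; i < n && i < 4; i++)
    {
      unsigned shift = first_field_at_lsb ? 4 * i : 12 - 4 * i;
      unit |= (uint16_t) ((vals[i] & 0xFu) << shift);
    }
  return unit;
}
```

With three such fields initialized as { 1, 2, 3 }, this gives 0x0321
when the first field is at the LSB and 0x1230 when it is at the MSB.
The exact patterns depend on the field widths and unit size of the
real structure.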

> And you expect no other code looks at fields of a structure and its
> initializer?  It's bad to keep this not in-sync.  Thus I don't think it's
> viable to re-order fields just because bit allocation is reversed.

The fields are in sync.  The varasm.c change sorts the elements as
they're being output into the byte stream in the .S, it doesn't sort
the field definitions themselves.

> > + /* If the bitfield-order attribute has been used on this
> > +structure, the fields might not be in bit-order.  In that
> > +case, we need a separate representative for each
> > +field.  */
> > The typical use-case for this feature is memory-mapped hardware, where
> > pessimum access is preferred anyway.
> 
> I doubt that, looking at constraints for strict volatile bitfields.

The code that handles representatives requires (via an assert, IIRC)
that the bit offsets within a representative be in ascending order.
I.e. gcc ICEs if I don't bypass this.  In the case of volatile
bitfields, which would be the typical use case for a reversed
bitfield, the access mode is going to match the type size regardless,
so performance is not changed by this patch.


Re: reverse bitfield patch

2014-07-07 Thread DJ Delorie

> Ok, but as we are dealing exclusively with bitfields there is
> already output_constructor_bitfield which uses an intermediate
> state to "pack" bits into units that are then emitted.  It shouldn't
> be hard to change that to make it pack into the appropriate bits
> instead.

That assumes that the output unit is only emitted once per string of
bitfields.  If the total amount of data to output is larger than the
unit size, then the units themselves need to be output in the other
order also.

> Note that code expects that representatives are byte-aligned so better
> would be to not assign representatives or make the code work with
> the swapped layout (I see no reason why that shouldn't work - maybe
> it works doing before swapping the layout)?

I'm OK with not assigning them, but I couldn't figure out from the
code what they were for.

> I'm still not happy about the idea in general (why is this a bitfield
> exclusive thing?  If a piece of HW is big/little-endian then even
> regular fields would have that property.

A bi-endian MCU with memory-mapped peripherals needs this to properly
and portably describe the fields within the peripheral's registers.
Without this patch, there's no way (short of two independent
definitions) of assigning a name to, for example, the LSB of such a
device's registers.

> Your patch comes with no testcase - testcases should cover all
> attribute variants, multiple bitfield (group) sizes and mixed
> initializations / reads / writes and be best execute testcases.

I wrote testcases, perhaps I just forgot to attach them.


Re: m32c-*-* Build Issue (Multilib?)

2014-07-17 Thread DJ Delorie

I just tried a 4.9.1 build and got this error:

configure:4222: checking whether to use setjmp/longjmp exceptions
configure:: /greed/dj/gnu/gcc/m32c-elf/gcc-4_9-branch/./gcc/xgcc 
-B/greed/dj/gnu/gcc/m32c-elf/gcc-4_9-branch/./gcc/ 
-B/greed/dj/m32c/install/m32c-elf/bin/ -B/greed/dj/m32c/install/m32c-elf/lib/ 
-isystem /greed/dj/m32c/install/m32c-elf/include -isystem 
/greed/dj/m32c/install/m32c-elf/sys-include  -mcpu=m32cm -c --save-temps 
-fexceptions  conftest.c >&5
conftest.c: In function 'foo':
conftest.c:19:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 52 38 23 (set (reg:SI 2 r1 [29])
(reg:SI 4 a0)) 99 {movsi_24}
 (nil))

conftest.c:19:1: internal compiler error: in final_scan_insn, at final.c:2891
0x7a56a8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
/greed/dj/gnu/gcc/svn/gcc-4_9-branch/gcc/rtl-error.c:109
0x7a56cf _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/greed/dj/gnu/gcc/svn/gcc-4_9-branch/gcc/rtl-error.c:120
0x6256c9 final_scan_insn(rtx_def*, _IO_FILE*, int, int, int*)
/greed/dj/gnu/gcc/svn/gcc-4_9-branch/gcc/final.c:2891
0x6258ef final(rtx_def*, _IO_FILE*, int)
/greed/dj/gnu/gcc/svn/gcc-4_9-branch/gcc/final.c:2023
0x626035 rest_of_handle_final
/greed/dj/gnu/gcc/svn/gcc-4_9-branch/gcc/final.c:4427
0x626035 execute
/greed/dj/gnu/gcc/svn/gcc-4_9-branch/gcc/final.c:4502


Re: m32c-*-* Build Issue (Multilib?)

2014-07-17 Thread DJ Delorie

> We see other failures in the log because newlib/targ-include
> isn't created.  The rtems build include path includes that and
> needs it but it isn't created before libgcc is built. That isn't a
> problem on other targets. I don't see anything odd in the top
> configurery magic for m32c which could cause this but I could
> easily be missing something.

If you're building in separate trees, you need to build gcc-host, then
newlib, then gcc-target.

If you're building in a combined tree, I don't know.


Re: m32c-*-* Build Issue (Multilib?)

2014-07-17 Thread DJ Delorie

> What's the next step?

Someone finds time and desire to debug it ;-)


push_rounding vs memcpy vs stack_pointer_delta

2014-08-28 Thread DJ Delorie

The m32c-elf with -mcpu=m32c has a word-aligned stack and uses pushes
for arguments (i.e. not accumulate_outgoing_args).  In this test case,
one of the arguments is memcpy'd into place, and an assert fails:

typedef struct {
  int a, b, c, d, e, f, g, h;
} foo;
int x;
void
dj (int a, int b, foo c)
{
  dj2 (x, a, b, c);
}

  if (pass == 0)
    {
      . . .
    }
  else
    {
      normal_call_insns = insns;

      /* Verify that we've deallocated all the stack we used.  */
      gcc_assert ((flags & ECF_NORETURN)
                  || (old_stack_allocated
                      == stack_pointer_delta - pending_stack_adjust));
    }

After much debugging, it turns out that the argument that's memcpy'd
to stack doesn't adjust stack_pointer_delta the same way that the
other arguments do (i.e. push_block and push_args don't adjust it
consistently, or something like that.)

I came up with this patch:

Index: expr.c
===
--- expr.c  (revision 214599)
+++ expr.c  (working copy)
@@ -4234,12 +4234,16 @@ emit_push_insn (rtx x, enum machine_mode
  /* Get the address of the stack space.
 In this case, we do not deal with EXTRA separately.
 A single stack adjust will do.  */
  if (! args_addr)
{
  temp = push_block (size, extra, where_pad == downward);
+#ifdef PUSH_ROUNDING
+ if (CONST_INT_P (size))
+   stack_pointer_delta += INTVAL (size) + extra;
+#endif
  extra = 0;
}
  else if (CONST_INT_P (args_so_far))
temp = memory_address (BLKmode,
   plus_constant (Pmode, args_addr,
  skip + INTVAL (args_so_far)));


But builds of libstdc++v3 demonstrate that sometimes
stack_pointer_delta *is* adjusted consistently, thus there must be
some more complex logic for determining when the extra adjustment
(i.e. my patch) should be made, or it's in the wrong place.

So... could someone more familiar with this code enlighten me on the
actual rules for what stack_pointer_delta means and when it gets
adjusted, and where (in theory) a memcpy'd argument on a push_args
target should update it (if at all)?

Thanks!
DJ


Re: libgcc - SJLJ probe failing on head on h8300 & m32c

2014-11-05 Thread DJ Delorie

Last time you mentioned this, I asked what the contents of that
config.log were...


pointer math vs named address spaces

2014-12-03 Thread DJ Delorie

If a target (rl78-elf in my case) has a named address space larger
than the generic address space (__far in my case), why is pointer math
in that named address space still truncated to sizetype?

N1275 recognizes that named address spaces might be a different size
than the generic address space, but I didn't see anything that
required such truncation.


volatile char __far * ptr1;
volatile char __far * ptr2;
uint32_t sival;

foo()
{
  ptr2 = ptr1 + sival;
}

foo ()
{
  volatile  char * ptr2.5;
  sizetype D.2252;
  long unsigned int sival.4;
  volatile  char * ptr1.3;
  sizetype _3;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  ptr1.3_1 = ptr1;
  sival.4_2 = sival;
  _3 = (sizetype) sival.4_2;    <--- why this truncation?
  ptr2.5_4 = ptr1.3_1 + _3;
  ptr2 = ptr2.5_4;
  return;
;;succ:   EXIT

}
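
The effect of that cast can be shown on a host machine with a small
model.  This sketch assumes a 16-bit sizetype and a 32-bit container
for __far pointers; the type and function names are invented for
illustration.

```c
#include <stdint.h>

/* Assumed widths: 32 bits for a __far pointer, 16 bits for sizetype.  */
typedef uint32_t far_ptr_t;
typedef uint16_t sizetype_t;

/* What the dump shows: the offset is narrowed to sizetype first.  */
static far_ptr_t
add_truncated (far_ptr_t p, uint32_t off)
{
  sizetype_t narrow = (sizetype_t) off;
  return p + narrow;
}

/* Full-width arithmetic, with no narrowing of the offset.  */
static far_ptr_t
add_full (far_ptr_t p, uint32_t off)
{
  return p + off;
}
```

For a pointer value of 0x12000 and an offset of 0x20010, the truncated
form yields 0x12010 and the full-width form yields 0x32010, so any
offset that doesn't fit in sizetype silently loses its upper bits.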


Re: pointer math vs named address spaces

2014-12-03 Thread DJ Delorie

> However, pointer subtraction still returns ptrdiff_t, and sizeof still 
> returns size_t,

Why?


Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie

Matt Godbolt  writes:
> GCC's code generation uses a "load; add; store" for volatiles, instead
> of a single "add 1, [metric]".

GCC doesn't know if a target's load/add/store patterns are
volatile-safe, so it must avoid them.  There are a few targets that have
been audited for volatile-safe-ness such that gcc *can* use the combined
load/add/store when the backend says it's OK.  x86 is not yet one of
those targets.

Also, note that the standard says the physical target must do the same
operations that the "model" target does, but it does not require that
these operations be in separate opcodes.  A single opcode that performs
the correct operations in the correct order complies with the standard;
but you have to tell gcc which opcodes comply.
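
As a concrete case, here is the kind of source under discussion; the
names are hypothetical.

```c
/* A counter that must be accessed exactly as written.  */
volatile unsigned int metric;

/* The abstract machine performs one read and one write of metric
   here.  Whether the compiler emits separate load/add/store opcodes
   or a single read-modify-write opcode, those are the accesses that
   must happen, in that order.  */
void
bump_metric (void)
{
  metric = metric + 1;
}
```

Either code sequence leaves the same value in metric; the question in
this thread is only which opcodes gcc is allowed to pick for it.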


Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie

> What is involved with the auditing?

Each pattern that (directly or indirectly) uses general_operand,
memory_operand, or nonimmediate_operand needs to be checked to see if
it's volatile-safe.  If so, you need to change the predicate to
something that explicitly accepts volatiles.

There's been talk about adding direct support for a "volatile-clean"
flag for targets where you know it's correct, which would bypass the
volatile check in those predicates and avoid this audit, but it hasn't
happened yet.


Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie

> One question: do you have an example of a non-volatile-safe machine so
> I can get a feel for the problems one might encounter?  At best I can
> imagine a machine that optimizes "add 0, [mem]" to avoid the
> read/write, but I'm not aware of such an ISA.

For example, the MSP430 backend uses a macro for movsi, addsipsi3,
subpsi3, and a few others, which aren't volatile-safe.  Look for
"general_operand" vs "msp_general_operand".



Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie

> I looked in the documentation and didn’t see this described.

AFAIK it's not documented.  Only recently was it agreed (and even
then, reluctantly) that the ISO spec could be met by such opcodes.


Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie

> To try to generalize from that: it looks like the operating
> principle is that an insn that expands into multiple references to a
> given operand isn’t volatile-safe, but one where there is only a
> single reference is safe?

No, if the expanded list of insns does "what the standard says, no
more, no less" as far as memory accesses go, it's OK.  Many of the MSP
macros do not access memory in a volatile-safe way.  Some do.

If you have a single opcode that isn't volatile-safe (for example, a
string operation that's interruptable and restartable), that wouldn't
be OK despite being a single insn.

So it's kinda mechanical, but not always.


Re: volatile access optimization (C++ / x86_64)

2015-01-05 Thread DJ Delorie

> Ok, but the converse — if the general_operand is accessed by more
> than one instruction, it is not safe — is correct, right?

In general, I'd agree, but the ISO spec talks about "sequence points"
and there are times when you *can* access a volatile multiple times as
long as the state is correct at the sequence point.  GCC won't, for
example, combine insns if it doesn't know if the combined insn follows
the sequence point rules correctly.  This is in 5.1.2.3 in the C99
spec but that caveat mostly applies to non-memory-mapped volatiles;
memory-mapped ones are typically more strictly confined (6.7.3.6)
(which is the origin of the -fstrict-volatile-bitfields patch).

So, for example, if you had a volatile on the stack and a special
stack-relative insn to modify it, you would "know" it would be safe to
do so even if it doesn't meet 6.7.3.6.  Or if you used atomics to
guard a multi-access macro to make it volatile-safe.


Re: Rename C files to .c in GCC source

2015-01-30 Thread DJ Delorie

pins...@gmail.com writes:
> No because they are c++ code so capital C is correct. 

However, we should avoid relying on case-sensitive file systems
(Windows, for one, is not case-sensitive) and use .cc or .cxx for C++
files ("+" is not a valid file name character on Windows, so we can't
use .c++).


Re: Rename C files to .c in GCC source

2015-01-31 Thread DJ Delorie

> Aren't current Windows file systems case-preserving?  Then they
> shouldn't have any problems with .C files.

They are case preserving, but not case sensitive.  A wildcard search
for *.c will match foo.C and bar.c, and foo.c can be opened as FOO.C.
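
A sketch of the lookup rule, simplified to ASCII case folding (real
file systems use their own folding tables, and the function name is
made up):

```c
#include <ctype.h>
#include <stdbool.h>

/* Return true if names A and B would refer to the same file on a
   case-insensitive file system.  Only ASCII case is folded here.  */
static bool
names_collide (const char *a, const char *b)
{
  while (*a && *b)
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return false;
      a++;
      b++;
    }
  /* Equal only if both names ended at the same point.  */
  return *a == *b;
}
```

Under this rule foo.C and foo.c collide, so a tree that holds both
cannot be checked out intact, whereas foo.c and foo.cc stay distinct.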


building against a temporary install dir?

2015-02-03 Thread DJ Delorie

So here's what I'm trying to do...  I want to build gcc, binutils, and
newlib, run tests, and IF the tests pass, THEN install them all.
However, gcc needs an installed newlib to build its libraries.

I tried installing newlib into $DESTDIR$PREFIX but gcc ignores
$DESTDIR during the compile.

Any ideas on how to do this, short of building and installing
everything (or at least gcc and newlib) twice?


Re: Newlib/Cygwin now under GIT

2015-03-10 Thread DJ Delorie

> This is a common problem.  I guess newlib/cygwin got the oldest set
> and, afaik, the GCC toplevel stuff is kind of the master.  It would
> be nice if we had some automatism in place to keep all former src
> repos in sync.

There was never any agreement on who the "master" was for toplevel
sources - no repo was willing to give up control to the other, so no
automatic mirroring was ever done, unlike the libiberty/include
mirror, where src agreed to let gcc be the master.

Also, for the record, I do not wish to, nor do I intend to, provide
any automated merging services for git repos.  I don't like git and
I'd rather not use it if I don't have to.


s390: SImode pointers vs LR

2015-05-29 Thread DJ Delorie

In config/s390/s390.c we accept addresses that are SImode:

  if (!REG_P (base)
      || (GET_MODE (base) != SImode
          && GET_MODE (base) != Pmode))
    return false;

However, there doesn't seem to be anything in the s390's opcodes that
masks the top half of address registers in 64-bit mode; the SImode
convention seems to just be a convention for addresses in the first
4Gb.

So... what happens if gcc uses a subreg to load the lower half of a
register (via LR), leaving the upper half with random bits in it, then
uses that register as an address?  I could see no code that checked
for this, and I have a fairly large and ungainly test case that says
it breaks :-(

My local solution was to just disallow "SImode" as an address in
s390_decompose_address, which forces gcc to do an explicit SI->DI
conversion to clear the upper bits, and it seems to work, but I wonder
if it's the ideal solution...
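
The failure mode can be modelled in portable C.  This is only a sketch
of the register behaviour described above, not s390 code; the helper
names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit load into a 64-bit register: only the low half is
   replaced, the upper half keeps whatever was there before.  */
static uint64_t load_low_half(uint64_t old_reg, uint32_t value)
{
  return (old_reg & 0xffffffff00000000ull) | value;
}

/* Model of an explicit SI->DI zero extension.  */
static uint64_t zero_extend_si(uint64_t reg)
{
  return reg & 0xffffffffull;
}

/* Usage: stale upper bits corrupt the address until they are cleared.  */
static void demo_stale_upper_half(void)
{
  uint64_t r = load_low_half(0xdeadbeef00000000ull, 0x1000);
  assert(r != 0x1000);                  /* not the address we wanted */
  assert(zero_extend_si(r) == 0x1000);  /* explicit extension fixes it */
}
```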


dwarf DW_AT_decl_name: system headers vs source files?

2015-06-19 Thread DJ Delorie

Consider:

# 1 "dj.c"
# 1 "dj.h" 1 3
int dj(int x);
# 2 "dj.c" 2

int dj(int x)
{
}

If you compile with -g and look at the dwarf output, you see:

 <1><2d>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2e>   DW_AT_external    : 1
    <2e>   DW_AT_name        : dj
    <31>   DW_AT_decl_file   : 2
    <32>   DW_AT_decl_line   : 1

 The File Name Table:
  Entry   Dir     Time    Size    Name
  1       0       0       0       dj.c
  2       0       0       0       dj.h


Note that the DW_AT_decl_file refers to "dj.h" and not "dj.c".  If you
remove the "3" from the '# 1 "dj.h" 1 3' line, the DW_AT_decl_file
instead refers to "dj.c".  It's been this way for many releases.

Is this intentional?

If so, what is the rationalization for it?
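
For reference, the trailing numbers on those "# line file" lines are
the preprocessor's linemarker flags: 1 starts a new file, 2 returns to
a file, and 3 marks the text that follows as coming from a system
header.  A minimal sketch of a parser for them, with an invented
struct and function name and no handling of escapes inside the file
name:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct linemarker {
  int line;
  char file[256];
  int new_file;        /* flag 1 */
  int returning;       /* flag 2 */
  int system_header;   /* flag 3 */
};

/* Parses a line such as: # 1 "dj.h" 1 3
   Returns 1 on success, 0 if the line is not a linemarker.  */
static int parse_linemarker(const char *s, struct linemarker *out)
{
  int consumed = 0;
  const char *p;

  assert(s != NULL && out != NULL);
  memset(out, 0, sizeof *out);
  if (sscanf(s, "# %d \"%255[^\"]\"%n", &out->line, out->file, &consumed) != 2
      || consumed == 0)
    return 0;

  p = s + consumed;
  for (;;) {
    char *end;
    long flag = strtol(p, &end, 10);
    if (end == p)
      break;                           /* no more flags */
    if (flag == 1)
      out->new_file = 1;
    else if (flag == 2)
      out->returning = 1;
    else if (flag == 3)
      out->system_header = 1;
    p = end;
  }
  return 1;
}
```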


rl78 vs cse vs memory_address_addr_space

2015-07-01 Thread DJ Delorie

In this bit of code in explow.c:

  /* By passing constant addresses through registers
     we get a chance to cse them.  */
  if (! cse_not_expected && CONSTANT_P (x) && CONSTANT_ADDRESS_P (x))
    x = force_reg (address_mode, x);

On the rl78 it results in code that's a bit too complex for later
passes to be optimized fully.  Is there any way to indicate that the
above force_reg() is bad for a particular target?


Re: rl78 vs cse vs memory_address_addr_space

2015-07-06 Thread DJ Delorie

Given a test case like this:

typedef struct {
   unsigned char no0 :1;
   unsigned char no1 :1;
   unsigned char no2 :1;
   unsigned char no3 :1;
   unsigned char no4 :1;
   unsigned char no5 :1;
   unsigned char no6 :1;
   unsigned char no7 :1;
} __BITS8;
#define SFR0_bit (*(volatile __BITS8 *)0x0)
#define SFREN SFR0_bit.no4

foo() {
   SFREN = 1U;
   SFREN = 0U;
}


(i.e. any code that sets/clears one bit in a volatile memory-mapped
area, which the rl78 has instructions for)
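
Written out by hand, setting and clearing bit 4 of a byte is the
read/or/write and read/and/write sequence that shows up in the RTL
(masks 16 and -17, the latter being 0xef as a byte).  A minimal
sketch, with invented helper names, operating on an ordinary volatile
byte rather than a real SFR address:

```c
#include <assert.h>
#include <stdint.h>

#define BIT4 0x10u

/* Read-modify-write that sets bit 4 (the IOR with 16).  */
static void set_bit4(volatile uint8_t *p)
{
  *p = (uint8_t)(*p | BIT4);
}

/* Read-modify-write that clears bit 4 (the AND with -17, i.e. 0xef).  */
static void clear_bit4(volatile uint8_t *p)
{
  *p = (uint8_t)(*p & ~BIT4);
}

/* Usage on a plain variable standing in for the register.  */
static void demo_bit4(void)
{
  volatile uint8_t fake_sfr = 0x01;
  set_bit4(&fake_sfr);
  assert(fake_sfr == 0x11);
  clear_bit4(&fake_sfr);
  assert(fake_sfr == 0x01);
  assert((uint8_t)-17 == 0xef);   /* the mask combine prints as -17 */
}
```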

Before:

(insn 5 2 7 2 (set (reg/f:HI 43)
        (const_int 240 [0xf0])) test.c:24 7 {*movhi_virt}
     (nil))
(insn 7 5 8 2 (set (reg:QI 45 [ MEM[(volatile union un_per0 *)240B].BIT.no4 ])
        (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])) test.c:24 5 {movqi_virt}
     (nil))
(insn 8 7 9 2 (set (reg:QI 46)
        (ior:QI (reg:QI 45 [ MEM[(volatile union un_per0 *)240B].BIT.no4 ])
            (const_int 16 [0x10]))) test.c:24 19 {*iorqi3_virt}
     (expr_list:REG_DEAD (reg:QI 45 [ MEM[(volatile union un_per0 *)240B].BIT.no4 ])
        (nil)))
(insn 9 8 12 2 (set (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])
        (reg:QI 46)) test.c:24 5 {movqi_virt}
     (expr_list:REG_DEAD (reg:QI 46)
        (nil)))
(insn 12 9 13 2 (set (reg:QI 49 [ MEM[(volatile union un_per0 *)240B].BIT.no4 ])
        (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])) test.c:26 5 {movqi_virt}
     (nil))
(insn 13 12 14 2 (set (reg:QI 50)
        (and:QI (reg:QI 49 [ MEM[(volatile union un_per0 *)240B].BIT.no4 ])
            (const_int -17 [0xffef]))) test.c:26 18 {*andqi3_virt}
     (expr_list:REG_DEAD (reg:QI 49 [ MEM[(volatile union un_per0 *)240B].BIT.no4 ])
        (nil)))
(insn 14 13 0 2 (set (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])
        (reg:QI 50)) test.c:26 5 {movqi_virt}
     (expr_list:REG_DEAD (reg:QI 50)
        (expr_list:REG_DEAD (reg/f:HI 43)
            (nil))))

Combine gets as far as this:

Trying 5 -> 9:
Failed to match this instruction:
(parallel [
        (set (mem/v/j:QI (const_int 240 [0xf0]) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])
            (ior:QI (mem/v/j:QI (const_int 240 [0xf0]) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])
                (const_int 16 [0x10])))
        (set (reg/f:HI 43)
            (const_int 240 [0xf0]))
    ])

(the set is left behind because it's used for the second assignment)

Both of those insns in the parallel are valid rl78 insns.  I tried
adding that parallel as a define-and-split but combine doesn't split
it at the point where it inserts it, so it doesn't work right.  If it
reduced those four instructions to the two in the parallel, but
without the parallel, it would probably work too.

We end up with code like this:

        movw    r8, #240        ; 5     *movhi_real/4   [length = 4]
        movw    ax, r8          ; 19    *movhi_real/5   [length = 4]
        movw    hl, ax          ; 21    *movhi_real/6   [length = 4]
        set1    [hl].4          ; 9     *iorqi3_real/1  [length = 4]
        clr1    [hl].4          ; 14    *andqi3_real/1  [length = 4]

but what we want is this:

        set1    !240.4          ; 9     *iorqi3_real/1  [length = 4]
        clr1    !240.4          ; 14    *andqi3_real/1  [length = 4]

( !240 means (mem (const_int 240)) )

(if there's only one such operation in a function, it combines
properly, likely because the address is not needed after the insn it
can combine, unlike the parallel above)

The common addresses are separated at least before lowering to RTL; as
the initial expansion has:

;; MEM[(volatile union un_per0 *)240B].BIT.no4 ={v} 1;

(insn 5 4 7 (set (reg/f:HI 43)
        (const_int 240 [0xf0])) test.c:24 -1
     (nil))

(insn 7 5 8 (set (reg:QI 45)
        (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])) test.c:24 -1
     (nil))

(insn 8 7 9 (set (reg:QI 46)
        (ior:QI (reg:QI 45)
            (const_int 16 [0x10]))) test.c:24 -1
     (nil))

(insn 9 8 0 (set (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_per0 *)240B].BIT.no4+0 S1 A16])
        (reg:QI 46)) test.c:24 -1
     (nil))


Yes, I know gcc doesn't like combining volatile accesses into one
insn, but the rl78 backend (my copy at least) has predicates that
allow it, because it's safe on rl78.

Also, if I take out the "volatile" yet put some sort of barrier (like
a volatile asm) between the two assignments, it still fails, in the
same manner.


Re: rl78 vs cse vs memory_address_addr_space

2015-07-06 Thread DJ Delorie

> Did you try just a define_split instead?  Ugly, but it should work I think.

It doesn't seem to be able to match a define_split :-(


s390: larl for Simode on 64-bit

2015-07-08 Thread DJ Delorie

Is there any reason that LARL can't be used to load a 32-bit symbolic
value, in 64-bit mode?  On TPF (64-bit) the app has the option of
being loaded in the first 4Gb so that all symbols are also valid
32-bit addresses, for backward compatibility.  (and if not, the linker
would complain)

Index: s390.md
===
--- s390.md (revision 225579)
+++ s390.md (working copy)
@@ -1845,13 +1845,13 @@
     emit_symbolic_move (operands);
 })
 
 (define_insn "*movsi_larl"
   [(set (match_operand:SI 0 "register_operand" "=d")
         (match_operand:SI 1 "larl_operand" "X"))]
-  "!TARGET_64BIT && TARGET_CPU_ZARCH
+  "TARGET_CPU_ZARCH
    && !FP_REG_P (operands[0])"
   "larl\t%0,%1"
    [(set_attr "op_type" "RIL")
     (set_attr "type"    "larl")
     (set_attr "z10prop" "z10_fwd_A1")])
 


Re: s390: larl for Simode on 64-bit

2015-07-08 Thread DJ Delorie

In the TPF case, the software has to explicitly mark such pointers as
SImode (such things happen only when structures that contain addresses
can't change size, for backwards compatibility reasons[1]):

int * __attribute__((mode(SImode))) ptr;

  ptr = &some_var;

so I wouldn't consider this the "default" case for those apps, just
*a* case that needs to be handled "well enough", and the user is
already telling the compiler that they assume those addresses are
32-bit (that either the whole app, or at least the part with that
object, will be linked below 4Gb).

The majority of the addresses are handled as 64-bit.


[1] /me refrains from commenting on the worth of such practices, just
that they exist and need to be (and have been) supported.


Re: s390: larl for Simode on 64-bit

2015-07-08 Thread DJ Delorie

> So in effect, we have two pointer sizes, 64 being the default, but
> we can also get a 32 bit pointer via the syntax above?  Wow, I'm
> surprised that works.

Yup, been that way for many years.

> And the only time we'd be able to use larl is a dereference of a
> pointer declared with the syntax above.  Right

larl would be used to load the address of an object to *initialize*
such a pointer, but yes.  Regular pointers still use larl but as a
DImode operation.  I.e. larl will always load a 64-bit value into a
register, even if gcc will only use the 32 LSBs.

> OK for the trunk with a simple testcase.  I think you can just scan
> the assembler output for the larl instruction.

Will do, but it's part of a bigger patch.  I just wanted to make sure
there wasn't some side-effect of larl that precluded this use.
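
The assumption the user is making with such a pointer can be stated as
a one-line check: keeping only the 32 LSBs of a 64-bit address loses
nothing exactly when the object lives below 4Gb.  A sketch, with an
invented helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if ADDR survives being stored in a 32-bit pointer and widened
   back to 64 bits with zero extension.  */
static bool fits_in_32_bits(uint64_t addr)
{
  return addr == (uint64_t)(uint32_t)addr;
}

/* Usage: the last byte below 4Gb fits, the first byte above does not.  */
static void demo_fits(void)
{
  assert(fits_in_32_bits(0xffffffffull));
  assert(!fits_in_32_bits(0x100000000ull));
}
```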


Re: Question about "instruction merge" pass when optimizing for size

2015-08-19 Thread DJ Delorie

I've seen this on other targets too, sometimes so bad I write a quick
target-specific "stupid move optimizer" pass to clean it up.

A generic pass would be much harder, but very useful.
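
To give an idea of what such a pass does, here is a toy model in C.
It is not GCC code: a real pass works on RTL and has to look at
liveness, clobbers, and modes.  The struct and function names are
invented, and the only patterns handled are a move to itself and a
move that copies a value straight back where it came from.

```c
#include <assert.h>
#include <stddef.h>

struct move { int dst, src; };   /* models "mov dst, src" */

/* Drops "mov a,a" and the second half of "mov a,b; mov b,a", since
   after the first move both registers already hold the same value.
   Compacts the array in place and returns the new length.  */
static size_t drop_stupid_moves(struct move *m, size_t n)
{
  size_t out = 0;

  assert(m != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (m[i].dst == m[i].src)
      continue;                          /* self-move does nothing */
    if (out > 0
        && m[out - 1].dst == m[i].src
        && m[out - 1].src == m[i].dst)
      continue;                          /* copy-back of previous move */
    m[out++] = m[i];
  }
  return out;
}
```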


Re: Offer of help with move to git

2015-08-23 Thread DJ Delorie

> In the mean time, I'm enclosing a contributor map that will need to be
> filled in whoever does the conversion.  The right sides should become
> full names and preferred email addresses.

This information should be gleanable from the Changelog commits...
do you have a script to scan those?


Re: Repository for the conversion machinery

2015-08-27 Thread DJ Delorie

Hmmm... I use two email addresses for commits, depending on which target
they're for, i.e.:

$ grep DJ MAINTAINERS 
m32c port       DJ Delorie  
DJGPP       DJ Delorie  

Most of the DJGPP stuff was long ago but I wonder how the conversion
would handle this?


Re: Repository for the conversion machinery

2015-08-27 Thread DJ Delorie

> If you want your commits to be attributed to two different addresses
> in the git conversion, you need to tell me how to specify two
> different selection sets so I can write assign statements and two
> trivial "authors read" commands affecting them only.
>
> assuming that the names m32c and djgpp have been properly bound.

Since I have no idea what you mean by "properly bound", or even where
these names come from, I don't know how to define rules based on them.

The only reliable way I can think of is to look at which address I
used in the ChangeLog entry that is part of each commit.  After all,
that's what ChangeLog entries are for.
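
As a sketch of what that lookup amounts to, assuming the usual GNU
ChangeLog header line of the form "date, name, <address>", something
like the following would pull the address out.  The function name is
invented, the sample address in the usage is a placeholder, and
nothing here is part of the actual conversion machinery.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copies the address between '<' and '>' from a ChangeLog header line
   (one that starts with a date) into OUT.  Returns false for any other
   kind of line or if OUT is too small.  */
static bool changelog_email(const char *header, char *out, size_t outlen)
{
  const char *lt, *gt;
  size_t len;

  assert(header != NULL && out != NULL);
  if (!isdigit((unsigned char)header[0]))
    return false;                 /* entry bodies start with a tab */
  lt = strchr(header, '<');
  gt = lt ? strchr(lt, '>') : NULL;
  if (!lt || !gt)
    return false;
  len = (size_t)(gt - lt - 1);
  if (len == 0 || len >= outlen)
    return false;
  memcpy(out, lt + 1, len);
  out[len] = '\0';
  return true;
}
```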


reload question about unmet constraints

2015-09-01 Thread DJ Delorie

Given this test case for rl78-elf:

extern __far int a, b;
void ffr (int x)
{
  a = b + x;
}

I'm trying to use this patch:

Index: gcc/config/rl78/rl78-virt.md
===
--- gcc/config/rl78/rl78-virt.md  (revision 227360)
+++ gcc/config/rl78/rl78-virt.md(working copy)
@@ -92,15 +92,15 @@
]
   "rl78_virt_insns_ok ()"
   "v.inc\t%0, %1, %2"
 )
 
 (define_insn "*add<mode>3_virt"
-  [(set (match_operand:QHI   0 "rl78_nonfar_nonimm_operand" "=vY,S")
-   (plus:QHI (match_operand:QHI 1 "rl78_nonfar_operand" "viY,0")
- (match_operand:QHI 2 "rl78_general_operand" "vim,i")))
+  [(set (match_operand:QHI   0 "rl78_nonimmediate_operand" "=vY,S,Wfr")
+   (plus:QHI (match_operand:QHI 1 "rl78_general_operand" "viY,0,0")
+ (match_operand:QHI 2 "rl78_general_operand" "vim,i,vi")))
]
   "rl78_virt_insns_ok ()"
   "v.add\t%0, %1, %2"
 )
 
 (define_insn "*sub<mode>3_virt"


To allow the rl78 port to generate the "Wfr/0/r" case (alternative 3).
(Wfr = far MEM, v = virtual regs).

I expected gcc to see that the operation doesn't meet the constraints,
and move operands into registers to make it work (alternative 1,
"v/v/v").

Instead, it just complains and dies.

dj.c:42:1: error: insn does not satisfy its constraints:
 }
 ^
(insn 10 15 13 2 (set (mem/c:HI (reg:SI 8 r8) [1 a+0 S2 A16 AS2])
        (plus:HI (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                    (const_int 4 [0x4])) [1 x+0 S2 A16])
            (mem/c:HI (symbol_ref:SI ("b") ) [1 b+0 S2 A16 AS2]))) dj.c:41 13 {*addhi3_virt}
     (nil))
dj.c:42:1: internal compiler error: in extract_constrain_insn, at recog.c:2200


Reloads for insn # 10
Reload 0: reload_in (SI) = (symbol_ref:SI ("a")  )
        V_REGS, RELOAD_FOR_INPUT (opnum = 0), inc by 2
        reload_in_reg: (symbol_ref:SI ("a")  )
        reload_reg_rtx: (reg:SI 8 r8)
Reload 1: reload_in (HI) = (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                (const_int 4 [0x4])) [2 x+0 S2 A16])
        reload_out (HI) = (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                (const_int 4 [0x4])) [2 x+0 S2 A16])
        V_REGS, RELOAD_OTHER (opnum = 1), optional
        reload_in_reg: (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                (const_int 4 [0x4])) [2 x+0 S2 A16])
        reload_out_reg: (mem/c:HI (plus:HI (reg/f:HI 32 sp)
                (const_int 4 [0x4])) [2 x+0 S2 A16])
Reload 2: reload_in (HI) = (mem/c:HI (symbol_ref:SI ("b")  ) [2 b+0 S2 A16 AS2])
        V_REGS, RELOAD_FOR_INPUT (opnum = 2), optional
        reload_in_reg: (mem/c:HI (symbol_ref:SI ("b")  ) [2 b+0 S2 A16 AS2])


So this is where I've been banging my head against the
sources... where is the magic that tells gcc to try to copy everything
into registers to meet the constraints?

Note: expand is ok, the initial add insn is:

(insn 9 3 10 2 (set (reg:HI 48 [ D.1375 ])
        (plus:HI (reg/v:HI 45 [ x ])
            (mem/c:HI (symbol_ref:SI ("b")  ) [2 b+0 S2 A16 AS2]))) dj.c:43 -1
     (nil))

and just before reload:

(insn 10 9 0 2 (set (mem/c:HI (symbol_ref:SI ("a")  ) [2 a+0 S2 A16 AS2])
        (plus:HI (mem/c:HI (reg/f:HI 33 ap) [2 x+0 S2 A16])
            (mem/c:HI (symbol_ref:SI ("b")  ) [2 b+0 S2 A16 AS2]))) dj.c:43 13 {*addhi3_virt}
     (nil))


Re: reload question about unmet constraints

2015-09-01 Thread DJ Delorie

> It did match the first alternative (alternative 0), but it matched the
> constraints Y/Y/m.

It shouldn't match Y as those are for near addresses (unless it's only
matching MEM==MEM), and the ones in the insn are far, but ...

> Reload doesn't have any concept of two different kinds of memory
> operands which can't be converted via reloads.  If the constraint
> accepts mem, and we have a mem operand, then it will always assume
> that the problem is with the address and reload it.

... this sounds like it could be a problem for me :-P


Re: reload question about unmet constraints

2015-09-14 Thread DJ Delorie

> You would need some way to indicate that while Y does accept a mem,
> this particular mem can't be reloaded to match.  We don't have a way
> to do that.

As a test, I added this API.  It seems to work.  I suppose there could
be a better API where we determine if a constraint matches various
memory spaces, then compare with the memory space of the operand, but
I can't prove that's sufficiently flexible for all targets that
support memory spaces.  Heck, I'm not even sure what to call the
macro, and
"TARGET_IS_THIS_MEMORY_ADDRESS_RELOADABLE_TO_MATCH_THIS_CONSTRAINT_P()"
is a little long ;-)

What do we think of this direction?


Index: reload.c
===
RCS file: /cvs/cvsfiles/gnupro/gcc/reload.c,v
retrieving revision 1.33
diff -p -U 5 -r1.33 reload.c
--- reload.c20 Feb 2014 16:40:26 -  1.33
+++ reload.c15 Sep 2015 05:38:24 -
@@ -3517,20 +3517,26 @@ find_reloads (rtx insn, int replace, int
 && ((reg_equiv_mem (REGNO (operand)) != 0
  && EXTRA_CONSTRAINT_STR 
(reg_equiv_mem (REGNO (operand)), c, p))
 || (reg_equiv_address (REGNO 
(operand)) != 0)))
  win = 1;
 
+#ifndef COMPATIBLE_CONSTRAINT_P
+#define COMPATIBLE_CONSTRAINT_P(c,p,op) 1
+#endif
+   if (!MEM_P (operand) || COMPATIBLE_CONSTRAINT_P (c, 
p, operand))
+ {
/* If we didn't already win, we can reload
   constants via force_const_mem, and other
   MEMs by reloading the address like for 'o'.  */
if (CONST_POOL_OK_P (operand_mode[i], operand)
|| MEM_P (operand))
  badop = 0;
constmemok = 1;
offmemok = 1;
break;
  }
+ }
if (EXTRA_ADDRESS_CONSTRAINT (c, p))
  {
if (EXTRA_CONSTRAINT_STR (operand, c, p))
  win = 1;
 

Index: config/rl78/rl78.c
===
RCS file: /cvs/cvsfiles/gnupro/gcc/config/rl78/rl78.c,v
retrieving revision 1.12.6.16
diff -p -U 5 -r1.12.6.16 rl78.c
--- config/rl78/rl78.c  5 Aug 2015 13:43:59 -   1.12.6.16
+++ config/rl78/rl78.c  15 Sep 2015 05:39:04 -
@@ -1041,10 +1041,18 @@ rl78_far_p (rtx x)
 return 0;
 
   return GET_MODE_BITSIZE (rl78_addr_space_address_mode (MEM_ADDR_SPACE (x))) 
== 32;
 }
 
+int
+rl78_compatible_constraint_p (char c, const char *p, rtx r)
+{
+  if (c == 'Y' && rl78_far_p (r))
+return 0;
+  return 1;
+}
+
 /* Return the appropriate mode for a named address pointer.  */
 #undef  TARGET_ADDR_SPACE_POINTER_MODE
 #define TARGET_ADDR_SPACE_POINTER_MODE rl78_addr_space_pointer_mode
 
 static enum machine_mode

Index: config/rl78/rl78.h
===
RCS file: /cvs/cvsfiles/gnupro/gcc/config/rl78/rl78.h,v
retrieving revision 1.7.8.3
diff -p -U 5 -r1.7.8.3 rl78.h
--- config/rl78/rl78.h  17 Mar 2015 14:54:35 -  1.7.8.3
+++ config/rl78/rl78.h  15 Sep 2015 05:39:28 -
@@ -500,5 +500,7 @@ typedef unsigned int CUMULATIVE_ARGS;
 
 /* NOTE: defined but zero means dwarf2 debugging, but sjlj EH.  */
 #define DWARF2_UNWIND_INFO 0
 
 #define REGISTER_TARGET_PRAGMAS() rl78_register_pragmas()
+
+#define COMPATIBLE_CONSTRAINT_P(C,P,OP) rl78_compatible_constraint_p (C,P,OP)


Re: reload question about unmet constraints

2015-09-15 Thread DJ Delorie

> I see.  Is it correct then to say that reload will never be able to
> change a near mem into a far mem or vice versa?  If that is true, there
> doesn't appear to be any real benefit to having both near and far mem
> operations as *alternatives* to the same insn pattern.

The RL78 has a segment register, much like the x86.  The segment
register allows you to have a 20-bit address instead of a 16-bit
address.  However, due to details of the port, you can only have *one*
segment register override per operation, even if it applies to more
than one (identical) operand.  So, you can add two near pointers, but
you can't add two different far pointers, but you can add something to
a far pointer (i.e. x += 5).
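
The rule can be written down as a small predicate.  This is a toy
model in C, not the port's code: the operand struct and the function
name are invented for illustration, and "identical" is modelled as
simply having the same address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct operand {
  bool is_far;          /* needs the segment register override */
  unsigned long addr;   /* stands in for "which object" */
};

/* An operation is acceptable if all of its far operands are the same
   operand, so a single override covers them.  */
static bool one_segment_override_ok(const struct operand *ops, size_t n)
{
  const struct operand *first_far = NULL;

  assert(ops != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (!ops[i].is_far)
      continue;
    if (first_far == NULL)
      first_far = &ops[i];
    else if (ops[i].addr != first_far->addr)
      return false;     /* a second, different far operand */
  }
  return true;
}
```

Two near operands pass, two different far operands fail, and
"x += 5" with a far x passes because the destination and the first
source are the same far operand.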

> In that case, you might be able to fix the bug by splitting the
> offending insns into two patterns, one only handling near mems
> and one handling one far mems, where the near/far-ness of the mem
> is verified by the *predicate* and not the constraints.

But this means that when reload needs to, it moves far mems into
registers, which changes which insn is matched...  It also adds a
*lot* of new patterns, since any of the three operands can be far, and
'0' constraints on far are allowed also - and most insns allow far
this way, so could be up to seven times as many patterns.

You can see why I'd rather not do that :-)


Re: reload question about unmet constraints

2015-09-16 Thread DJ Delorie

> And in fact, you should be able to decide at *expand* time which
> of the two you need for the given set of operands.

I already check for multiple fars at expand, and force all but one of
them to registers.  Somewhere before reload they get put back in.

>"rl78_virt_insns_ok () && rl78_far_insn_p (operands)"

Since when does this work reliably?  I've seen cases where insns get
mashed together without regard for validity before...

I tested just this change - adding that function to addhi3 plus the
Wfr constraint sets - and it seems to work.  The big question to me
now is - is this *supposed* to work this way?  Or is it a coincidence
that the relevant passes happen to check that function?

> The Wfr constraint must not be marked as memory constraint (so as to
> avoid reload attempting to use it to access a stack slot).

This also prevents reload from reloading the address when it *is*
needed.  However, it seems to work ok even as a memory constraint.  Is
this change *just* because of the stack slots?  Could you give an
example of how it could be misused, so I can understand the need?



Re: Repository for the conversion machinery

2015-09-16 Thread DJ Delorie

"Frank Ch. Eigler"  writes:
> That makes sense, but how many people are in cagney's shoes

I am one of those people - I have two email addresses listed in
MAINTAINERS, with two sets of copyright papers filed with the FSF (a
personal assignment and a work one).  I use the appropriate email
address for each commit depending on which maintainership role I'm
reflecting.

Neither address is "obsolete" and neither address is @gcc.gnu.org.

Using d...@gcc.gnu.org would imply that is my email address, but email
sent there would vanish.

But I did discuss my case with esr and understand it's not as easy to
solve as we'd like it to be.


Re: Repository for the conversion machinery

2015-09-17 Thread DJ Delorie

Richard Biener  writes:
>> Using d...@gcc.gnu.org would imply that is my email address, but email
>> sent there would vanish.
>
> Would it?  You're supposed to have a valid forwarding address on that.

Frank tested it and it does seem to forward to me, so I guess so.


Re: reload question about unmet constraints

2015-10-06 Thread DJ Delorie

> So in general, it's really not safe to mark a constraint that accepts
> only far memory as "memory constraint" with current reload.
> 
> Note that *not* marking the constraint as memory constraint actually
> does not prevent reload from fixing up illegitimate addresses, so you
> shouldn't really see much drawbacks from not marking it ...

While working through the regressions on this one I discovered one
seemingly important side-effect...

For such constraints that are memory operands but not
define_memory_constraint, you need to use '*' to keep reload from
trying to guess a register class from them (it guesses wrong for
rl78).

I.e. use "*Wfr" instead of "Wfr".


Re: Is anyone working on a Z80 port?

2015-10-14 Thread DJ Delorie

> I spec'd one out a long time ago for Cygnus/Red Hat, but we never 
> pursued the port.  The register model on the z80 will be problematical, 
> though some of the lessons from the rl78 port would probably be useful.

The RL78 is very much a modern descendant of the Z80 architecture so might
serve as a good starting point.

But yeah, it's a messy port because gcc doesn't like the weird
addressing model.  I ended up using a virtual ISA that gcc could deal
with, then converted that to real instructions after reload.


Proposal to deprecate: mep (Toshiba Media Processor)

2015-12-03 Thread DJ Delorie

Given a combination of "I have new responsibilities" and "nothing has
happened with mep for a long time" I would like to step down as mep
maintainer.

If someone would like to pick up maintainership of this target, please
contact me and/or the steering committee.  Otherwise, I propose this
target be deprecated in GCC 6 and removed in 7.

DJ


who owns stack args?

2016-02-24 Thread DJ Delorie

Consider this example (derived from gcc.c-torture/execute/920726-1.c):

  extern int a(int a, int b, int c, int d, int e, int f,
               const char *s1, const char *s2) __attribute__((pure));

  int
  foo()
  {
    if (a(0,0,0,0,0,0,"abc","def") || a(0,0,0,0,0,0,"abc","ghi"))
      return 0;
    return 1;
  }

On rl78-elf I'm seeing a bug that only happens if a() is declared
"pure".  When the bug triggers, the address of "abc" in the second
call is *not* written to the stack.  Instead, the move is deleted by
DCE in postreload.  It's not deleted if you remove the "pure".  The
bug was exposed when strcmp() became able to increment incoming stack
arguments in-place, instead of copying them to registers.

The example was intended to reproduce the bug on intel or arm, but it
doesn't.  If there's an obvious fix for this, I'm all ears, but...

The real question is: are stack arguments call-clobbered or
call-preserved?  Does the answer depend on the "pure" attribute?
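
For context, this is the kind of callee in question, sketched in plain
C with an invented name.  At the language level the function only
modifies its own copies of the parameters; whether the compiler does
that in the incoming stack slots or in registers is exactly the
ABI-level detail the question is about, and this sketch does not
answer it.

```c
#include <assert.h>
#include <stddef.h>

/* A strcmp-like function that advances its parameters directly
   instead of copying them into separate locals first.  */
static int toy_strcmp(const char *s1, const char *s2)
{
  assert(s1 != NULL && s2 != NULL);
  while (*s1 && *s1 == *s2) {
    s1++;     /* modifies the incoming argument itself */
    s2++;
  }
  return (unsigned char)*s1 - (unsigned char)*s2;
}
```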


Re: Proposal to deprecate: mep (Toshiba Media Processor)

2016-03-01 Thread DJ Delorie

> Given a combination of "I have new responsibilities" and "nothing has
> happened with mep for a long time" I would like to step down as mep
> maintainer.
> 
> If someone would like to pick up maintainership of this target, please
> contact me and/or the steering committee.  Otherwise, I propose this
> target be deprecated in GCC 6 and removed in 7.

MeP is now deprecated.


Re: Deprecating basic asm in a function - What now?

2016-06-20 Thread DJ Delorie

Given how many embedded ports have #defines in external packages for
basic asms for instructions such as nop, enable/disable interrupts,
other system-level opcodes, etc... I think this is a bad idea.  Even
glibc would break.

#define enable() asm("eint")

__asm__ __volatile__ ("fwait");


Re: gcc/libcpp: non-UTF-8 source or execution encodings?

2016-07-19 Thread DJ Delorie

David Edelsohn  writes:
> GCC on the system is not self-hosting -- I believe that GCC only is
> used as a cross-compiler.

I can confirm this - GCC for TPF is always a cross compiler, it never
runs *on* a TPF system.


Re: [GCC Steering Committee attention] [PING] [PING] [PING] libgomp: In OpenACC testing, cycle though $offload_targets, and by default only build for the offload target that we're actually going to te

2016-08-04 Thread DJ Delorie

Manuel López-Ibáñez  writes:
> none? for libiberty, no regular maintainer for build machinery,

Perhaps this is a sign that I should step down as maintainer for those?


Re: [GCC Steering Committee attention] [PING] [PING] [PING] libgomp: In OpenACC testing, cycle though $offload_targets, and by default only build for the offload target that we're actually going to te

2016-08-04 Thread DJ Delorie

Manuel López-Ibáñez  writes:
> I don't see how that helps. Neither my message nor Thomas's is a
> criticism of people. The question is how to get more people to help
> and how to improve the situation. For sure, everybody is doing the
> best that they can with the time that they have.

You complained that there were no libiberty maintainers (there are two)
or build maintainers (there are many).  As I am listed as one of each of
those, this makes me wonder if there's no longer a need for such people
(we're involved so infrequently that nobody notices) or that I'm just
not able to put enough effort into it to be noticed (which may be true
anyway).

Either way, this is a part of your "problem" that I can address
directly, so I'm doing so.

> This is a problem throughout GCC. We have a single C++ maintainer, a
> single part-time C maintainer, none? for libiberty, no regular
> maintainer for build machinery, and so on and so forth.


Re: [GCC Steering Committee attention] [PING] [PING] [PING] libgomp: In OpenACC testing, cycle though $offload_targets, and by default only build for the offload target that we're actually going to te

2016-08-04 Thread DJ Delorie

Manuel López-Ibáñez  writes:
> Another question is how to help existing maintainers such that they
> are more motivated to review patches. Is it a lack of time? lack of
> Interest in the project? do patches simply fall through the cracks? is
> it a dead-lock of people waiting for each other to comment?

In my case, I became a build/libiberty maintainer a long time ago (20
years or so), when DJGPP was a much more active project, and it made
sense for me to be involved in parts of GCC that were sensitive to the
needs of DOS and NTFS filesystems and OSs.  This is not the case any
more, and my justification for maintaining those parts of gcc has
evaporated.  Fortunately, a very minimal effort was involved to
continue.  However, since then, I've not only taken on maintainerships
in other areas (mostly backends, which I need to wean off of eventually)
but I've also switched groups at work, and am no longer focused on gcc
(I'm focused on glibc now).

Also, I've always been opposed to libiberty being a "catch-all" for
cross-useful functionality, so I'm anti-motivated to work on those
portions of libiberty that aren't strictly portability-layer-related
(specifically, the demangler, which I leave to Ian).

So... considering your big-picture-problem, where do I fit in?  How can
I make the big picture better, given what you now know about my
situation?  What changes do you think would make sense?


Re: Why are GCC Internals not Specification Driven ?

2016-12-19 Thread DJ Delorie

Seima Rao  writes:
>  Has gcc become proprietory/commercial ?

By definition: no, yes.  It's been this way since the beginning, and
hasn't changed in decades.

>  Or has it become illegal to publish specification models
>  of gcc internals ? Does this make the product sell less ?

This sounds like you're trying to start an argument, instead of asking a
simple question.  It is certainly not illegal to publish our
specifications, and we certainly *do* publish many of our specifications
(have you read the internals manual?  You don't say whether or not you
did, but that would be a key bit of information to have disclosed).
Whether the product "sells" or not is rarely a driving factor for our
project.  Most of us work on it because we need it to work better for
our own purposes.

If you have specific questions about our documentation or development
process, please ask them.  Please do not ask vague, leading, and
emotionally loaded questions.  RTL and Gimple are documented.  Are they
documented well?  That depends on your needs.  Are they documented as
well as they could be?  Probably not, but good enough for us so far.

And as always, if you want to improve the situation, by all means feel
free to volunteer to do so ;-)


Re: targetm.calls.promote_prototypes parameter

2017-12-06 Thread DJ Delorie

In my original proposal, I said this:

> It includes a bunch of macro->hook conversions, mostly because the
> hooks need an additional parameter (the function) to detect which ones
> are Renesas ABI and which are GCC ABI.

The original documentation at least hinted that the parameter was a
function type:

> @deftypefn {Target Hook} bool TARGET_PROMOTE_PROTOTYPES (tree @var{fntype})

Kazu's calls are in the C++ stuff, I don't know if g++ and Renesas C++
are compatible anyway (I doubt it), but that's what would be affected.
The original work was for C compatibility.



Re: targetm.calls.promote_prototypes parameter

2017-12-06 Thread DJ Delorie

Jason Merrill  writes:
> I'm inclined to change the C++ FE to pass NULL_TREE instead until such
> time as someone cares.

The sh backend will at least not choke on that ;-)
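
The shape of a hook that tolerates a missing function type is roughly
the following.  This is a toy model in C, not the sh backend's code:
the struct is a made-up stand-in for a function type, and returning
true when no type is given is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function type carrying an ABI marker.  */
struct fn_type {
  bool renesas_abi;
};

/* Decide whether prototype arguments get promoted.  A null type means
   the caller had nothing to pass, so fall back to a default.  */
static bool toy_promote_prototypes(const struct fn_type *fntype)
{
  if (fntype == NULL)
    return true;                 /* assumed default for this sketch */
  return !fntype->renesas_abi;   /* per-function answer */
}

/* Usage: a null type is handled, not dereferenced.  */
static void demo_promote(void)
{
  struct fn_type renesas = { true };
  assert(toy_promote_prototypes(NULL));
  assert(!toy_promote_prototypes(&renesas));
}
```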


Re: Status of m32c target?

2018-01-12 Thread DJ Delorie

Jeff Law  writes:
> I was going to suggest deprecation for gcc-8 given how badly it was
> broken in gcc-7 and the lack of maintenance on the target.

As much as I use the m32c target, I have to agree.  I've tried many
times to fix its reload problems to no avail, and just don't have time
to work on gcc ports much any more.


Re: Status of m32c target?

2018-01-15 Thread DJ Delorie

Jeff Law  writes:
> A change in reload back in 2016 (IIRC) has effectively made m32c
> unusable.  The limits of the register file create horrible problems for
> reload.
>
> I was going to suggest deprecation for gcc-8 given how badly it was
> broken in gcc-7 and the lack of maintenance on the target.

I gave this another shot Friday, I was thinking maybe we could retire
the m32cm cpu and keep the r8c cpu, since the M32C family is essentially
dead part-wise but there are still new R8C chips being made.

The reload problems for r8c are still there, but I also discovered a bug
in the m32cm cpu that might be generic...

Are there any other targets that push large structures on the call stack
via memcpy?  I'm seeing failures due to mis-calculating stack
adjustments in that case.


$ m32c-elf-gcc -c -mcpu=m32cm -O3 dj.c

typedef struct {
  void *a, *b, *c, *d;
  void *e, *f, *g;
} cookie_io_functions_t;

void *_impure_ptr;

void *
_fopencookie_r (void *ptr, void *cookie, const char *mode, 
cookie_io_functions_t functions);


void *
fopencookie ( void *cookie , const char *mode , cookie_io_functions_t functions)

{
  return _fopencookie_r ( _impure_ptr , cookie, mode, functions);
}

dj.c: In function ‘fopencookie’:
dj.c:16:10: internal compiler error: in expand_call, at calls.c:4426
   return _fopencookie_r ( _impure_ptr , cookie, mode, functions);
  ^~~

  printf("%x %x, %d %d %d\n", flags, ECF_NORETURN, old_stack_allocated, 
stack_pointer_delta, pending_stack_adjust);
  /* Verify that we've deallocated all the stack we used.  */
  gcc_assert ((flags & ECF_NORETURN)
  || (old_stack_allocated
  == stack_pointer_delta - pending_stack_adjust));

IIRC when this happens, "stack_pointer_delta" doesn't account for the
size of the large-structure-argument - it has all the push'd args, but
not the memcpy'd one.  I.e. that printf I added prints this:

0 8, 0 12 40
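
For reference, plugging the printed numbers into the assertion shows
the size of the mismatch.  This is only a sketch of the arithmetic;
the function names are made up for illustration and are not part of
calls.c.

```c
#include <stdbool.h>

/* Values copied from the debug output "0 8, 0 12 40". */
enum {
  PRINTED_FLAGS = 0,
  PRINTED_ECF_NORETURN = 8,
  PRINTED_OLD_STACK_ALLOCATED = 0,
  PRINTED_STACK_POINTER_DELTA = 12,
  PRINTED_PENDING_STACK_ADJUST = 40
};

/* Same condition as the gcc_assert quoted above. */
bool stack_balanced(int flags, int noreturn_bit,
                    int old_alloc, int delta, int pending)
{
  return (flags & noreturn_bit) || (old_alloc == delta - pending);
}

/* Bytes by which the bookkeeping is off; 0 means balanced. */
int stack_shortfall(int old_alloc, int delta, int pending)
{
  return old_alloc - (delta - pending);
}
```

With the printed values the condition is false and the shortfall is
28 bytes.  That would match seven 4-byte slots for the seven-pointer
struct, but the 4-byte slot size is an assumption here, not something
the output confirms.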


Re: array bounds violation in caller-save.c : duplicate hard regs check added

2012-08-09 Thread DJ Delorie

> Date: Tue, 5 Jun 2012 21:59:15 -0400 (EDT)
> From: Hans-Peter Nilsson 
> On Fri, 25 May 2012, DJ Delorie wrote:
> > If I apply this patch, which checks for duplicate hard registers within
> > -fira-share-save-slots, the following *-elf targets fail due to the assert:
> >
> > bfin cris m32c rl78 rx sh sh64 v850
> 
> Oop.  An no clue as to what's wrong.
> 
> Can you pretty please make the test-case n'all sent
> down-thread into a PR?

Sorry, I dropped the ball on this one.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54217


reverse bitfield patch

2012-10-02 Thread DJ Delorie

Here's my current patch for the bitfield reversal feature I've been
working on for a while, with an RX-specific pragma to apply it
"globally".  Could someone please review this?  It would be nice
to get it in before stage1 closes again...


Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 192009)
+++ gcc/doc/extend.texi (working copy)
@@ -5427,12 +5427,74 @@ Note that the type visibility is applied
 associated with the class (vtable, typeinfo node, etc.).  In
 particular, if a class is thrown as an exception in one shared object
 and caught in another, the class must have default visibility.
 Otherwise the two shared objects will be unable to use the same
 typeinfo node and exception handling will break.
 
+@item bit_order
+Normally, GCC allocates bitfields from either the least significant or
+most significant bit in the underlying type, such that bitfields
+happen to be allocated from lowest address to highest address.
+Specifically, big-endian targets allocate the MSB first, where
+little-endian targets allocate the LSB first.  The @code{bit_order}
+attribute overrides this default, allowing you to force allocation to
+be MSB-first, LSB-first, or the opposite of whatever gcc defaults to.  The
+@code{bit_order} attribute takes an optional argument:
+
+@table @code
+
+@item native
+This is the default, and also the mode when no argument is given.  GCC
+allocates LSB-first on little endian targets, and MSB-first on big
+endian targets.
+
+@item swapped
+Bitfield allocation is the opposite of @code{native}.
+
+@item lsb
+Bits are allocated LSB-first.
+
+@item msb
+Bits are allocated MSB-first.
+
+@end table
+
+A short example demonstrates bitfield allocation:
+
+@example
+struct __attribute__((bit_order(msb))) @{
+  char a:3;
+  char b:3;
+@} foo = @{ 3, 5 @};
+@end example
+
+With LSB-first allocation, @code{foo.a} will be in the 3 least
+significant bits (mask 0x07) and @code{foo.b} will be in the next 3
+bits (mask 0x38).  With MSB-first allocation, @code{foo.a} will be in
+the 3 most significant bits (mask 0xE0) and @code{foo.b} will be in
+the next 3 bits (mask 0x1C).
+
+Note that it is entirely up to the programmer to define bitfields that
+make sense when swapped.  Consider:
+
+@example
+struct __attribute__((bit_order(msb))) @{
+  short a:7;
+  char b:6;
+@} foo = @{ 3, 5 @};
+@end example
+
+On some targets, or if the struct is @code{packed}, GCC may only use
+one byte of storage for A despite it being a @code{short} type.
+Swapping the bit order of A would cause it to overlap B.  Worse, the
+bitfield for B may span bytes, so ``swapping'' would no longer be
+defined as there is no ``char'' to swap within.  To avoid such
+problems, the programmer should either fully-define each underlying
+type, or ensure that their target's ABI allocates enough space for
+each underlying type regardless of how much of it is used.
+
 @end table
 
 To specify multiple attributes, separate them by commas within the
 double parentheses: for example, @samp{__attribute__ ((aligned (16),
 packed))}.
 
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 192009)
+++ gcc/c-family/c-common.c (working copy)
@@ -310,12 +310,13 @@ struct visibility_flags visibility_optio
 
 static tree c_fully_fold_internal (tree expr, bool, bool *, bool *);
 static tree check_case_value (tree);
 static bool check_case_bounds (tree, tree, tree *, tree *);
 
 static tree handle_packed_attribute (tree *, tree, tree, int, bool *);
+static tree handle_bitorder_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nocommon_attribute (tree *, tree, tree, int, bool *);
 static tree handle_common_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_hot_attribute (tree *, tree, tree, int, bool *);
 static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
@@ -601,12 +602,14 @@ const unsigned int num_c_common_reswords
 const struct attribute_spec c_common_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
affects_type_identity } */
   { "packed", 0, 0, false, false, false,
  handle_packed_attribute , false},
+  { "bit_order",  0, 1, false, true, false,
+ handle_bitorder_attribute , false},
   { "nocommon",   0, 0, true,  false, false,
  handle_nocommon_attribute, false},
   { "common", 0, 0, true,  false, false,
  handle_common_attribute, false },
   /* FIXME: logically, noreturn attributes should be listed as
  "false, true, true" and apply to function types.  But implementing this
@

Re: reverse bitfield patch

2012-10-02 Thread DJ Delorie

[sorry, should have gone to gcc-patches]


Re: reverse bitfield patch

2012-10-04 Thread DJ Delorie

> ChangeLog missing, new functions need a toplevel comment documenting
> function, argument and return value as per coding conventions.

Any review of the patch itself?  I know the overhead is not there...


Re: reverse bitfield patch

2012-10-05 Thread DJ Delorie


> Why do you need to change varasm.c at all?  The hunks seem to be
> completely separate of the attribute.

Because static constructors have fields in the original order, not the
reversed order.  Otherwise code like this is miscompiled:

struct foo a = { 1, 2, 3 };

because the 1, 2, 3 are in the C layout order, but the underlying data
needs to be stored in the reversed order.
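
To make the two layouts concrete, here is a hand-packed model of the
"char a:3; char b:3;" example from the patch documentation,
initialized with { 3, 5 }.  It does not use the proposed bit_order
attribute at all; the helper names are invented for this sketch.

```c
#include <stdint.h>

/* First field at the LSB end: a uses mask 0x07, b uses mask 0x38. */
uint8_t pack_lsb_first(unsigned a, unsigned b)
{
  return (uint8_t)(((a & 7u) << 0) | ((b & 7u) << 3));
}

/* First field at the MSB end: a uses mask 0xE0, b uses mask 0x1C. */
uint8_t pack_msb_first(unsigned a, unsigned b)
{
  return (uint8_t)(((a & 7u) << 5) | ((b & 7u) << 2));
}
```

The initializer values stay in source order (a, then b) in both
calls; only where each value lands in the byte changes, which is the
adjustment a static constructor has to account for.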

> which will severely pessimize bitfield accesses to structs with the
> bitfield-order attribute.

The typical use-case for this feature is memory-mapped hardware, where
pessimum access is preferred anyway.

> so you are supporting this as #pragma.  Which ends up tacking
> bit_order to each type.  Rather than this, why not operate similar
> to the packed pragma, thus, adjust a global variable in
> stor-layout.c.

Because when I first proposed this feature, I was told to do it this
way.

> I don't see a value in attaching 'native' or 'msb'/'lsb' if it is
> equal to 'native'.  You un-necessarily pessimize code generation (is
> different code even desired for a "no-op" bit_order attribute?).

If the attribute corresponds to the native mode, it should be a no-op.
The pessimizing only happens when the fields are actually in reverse
order.

> So no, I don't like this post-process layouting thing.  It's a
> layouting mode so it should have effects at bitfield layout time.

The actual reversal happens in stor-layout.c.  Everything else is
there to compensate for a possible non-linear layout.


Re: Time for GCC 5.0? (TIC)

2012-11-05 Thread DJ Delorie

Ian Lance Taylor  writes:
> Also the fact that GCC is now written in C++ seems to me to be
> deserving of a bump to 5.0.

I see no reason why an internal design change that has no user visible
effects should have any impact on the version number.

Typically a major version bump is reserved for either massive new
functionality or a break with backwards compatibility.


Re: Time for GCC 5.0? (TIC)

2012-11-26 Thread DJ Delorie

> Marketing loves high numbers after all!

If you truly think this way, we're going to have to revoke your hacker's 
license ;-)


Re: Deprecate i386 for GCC 4.8?

2012-12-18 Thread DJ Delorie

The official DJGPP triplet is for i586, not i386.  I don't mind
djgpp-wise if we deprecate i386, as long as we keep i586.  Anyone
still using djgpp for i386 can dig out old versions from the archives :-)


Re: ADDR_SPACE_CONVERT_EXPR always expanded to 0?

2013-03-12 Thread DJ Delorie

> A quick grep shows not many targets would be affected, AVR, m32c, rl78 and 
> spu.
> You should work with the maintainers of those targets to see which approach
> would be the best.

For both m32c and rl78, one address space is a strict subset of the
other (16-bit "near" vs 20/24/32-bit "far" pointers, that's all) so
nothing magic there.


Re: GCC 4.8.0 does not compile for DJGPP

2013-03-23 Thread DJ Delorie

The DJGPP build of gcc 4.8.0 was just uploaded, it might have some
patches that haven't been committed upstream yet.


Re: If you had a month to improve gcc build parallelization, where would you begin?

2013-04-03 Thread DJ Delorie

One thing I did in libiberty was to rearrange the targets so that the
ones that took the longest started first.  That way, you don't end up
building 99% of the objects then waiting for the one last one to finish.
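
The effect can be modelled with a small greedy scheduler.  This is a
sketch of the idea only, not how make or libiberty's Makefile works;
job durations and function names are hypothetical.

```c
#include <stdlib.h>

/* qsort comparator: descending order of job duration. */
static int by_duration_desc(const void *a, const void *b)
{
  int x = *(const int *)a, y = *(const int *)b;
  return (x < y) - (x > y);
}

/* Greedy list scheduling: each job, in the given order, goes to the
   worker that frees up first.  Returns the total wall-clock time,
   or -1 on allocation failure.  */
int makespan(const int *jobs, int njobs, int nworkers)
{
  int *load = calloc((size_t)nworkers, sizeof *load);
  int worst = 0;
  if (!load)
    return -1;
  for (int i = 0; i < njobs; i++) {
    int best = 0;
    for (int w = 1; w < nworkers; w++)
      if (load[w] < load[best])
        best = w;
    load[best] += jobs[i];
  }
  for (int w = 0; w < nworkers; w++)
    if (load[w] > worst)
      worst = load[w];
  free(load);
  return worst;
}

/* Same scheduler, but the longest jobs are started first. */
int makespan_longest_first(const int *jobs, int njobs, int nworkers)
{
  int *sorted = malloc((size_t)njobs * sizeof *sorted);
  int result;
  if (!sorted)
    return -1;
  for (int i = 0; i < njobs; i++)
    sorted[i] = jobs[i];
  qsort(sorted, (size_t)njobs, sizeof *sorted, by_duration_desc);
  result = makespan(sorted, njobs, nworkers);
  free(sorted);
  return result;
}
```

With six 1-unit jobs followed by one 6-unit job on two workers, the
original order leaves one worker idle while the long job finishes
last; starting the long job first lets the short ones fill in beside
it.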


gettext prereq vs po/zh_TW

2013-07-15 Thread DJ Delorie

The gcc prereq page says gettext 0.14.5 is the minimum version, but
po/zh_TW.po has lines like this:

#, fuzzy
#~| msgid "Unexpected EOF"
#~ msgid "Unexpected type..."
#~ msgstr "未預期的型態…"

The | syntax appears to have been added in gettext 0.16, and gettext
0.14 can't process it.

Seems to have been a result of this request:

http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01436.html


Re: Question about local register variable

2013-08-28 Thread DJ Delorie

The purpose of local register variables is to tell gcc which register to
use in an inline asm, when multiple registers could be used.  Other uses
are not supported and usually don't work the way you expect, especially
when optimizing.

If all you want is a function which returns the value in a specific
register, you could try using an asm like this with a local register
variable:

  __asm__ __volatile__ ("# no actual opcode" : "=r" (r));

That tells gcc that something has put a value in 'r' (although nothing
did, so the old value is used).  However, usually such registers have
global-specific values, so a global register variable is a better
choice.  If the register may be used by gcc for other purposes, you have
no way of predicting what might be in it when that asm happens.
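
For contrast, the supported use looks roughly like this: the local
register variable only pins an inline-asm operand to a particular
register.  The sketch is x86-64 specific and falls back to plain C
elsewhere; the function name and the choice of r12 are arbitrary.

```c
/* Copy VALUE through an asm whose input operand is forced into r12. */
long copy_via_r12(long value)
{
#if defined(__GNUC__) && defined(__x86_64__)
  register long in __asm__("r12") = value;  /* operand must be in r12 */
  long out;
  __asm__ ("movq %1, %0" : "=r" (out) : "r" (in));
  return out;
#else
  return value;  /* no register pinning on other targets */
#endif
}
```

Outside the asm statement the compiler is free to keep the value
anywhere, which is why relying on the register's contents at other
points does not work.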


Re: DJ Delorie and Nick Clifton appointed as MSP430 port maintainers

2013-09-12 Thread DJ Delorie

On behalf of myself and Nick, many thanks to everyone involved in
reviewing this port!  I've checked in the port as per the last
(approved) patch set I sent out.


Re: Invalid tree node causes segfault in diagnostic

2013-10-11 Thread DJ Delorie

> While I am at it, can I patch backends as well? For example
> mep/mep.c has an occurrence of tree_code_name[TREE_CODE (...

The mep change is pre-approved :-)


question about register pairs

2013-10-23 Thread DJ Delorie

The docs say to use HARD_REGNO_MODE_OK to enforce register pairs.

But reload (find_valid_class_1) rejects classes that include such
registers:

  for (regno = 0; regno < FIRST_PSEUDO_REGISTER && !bad; regno++)
{
  if (in_hard_reg_set_p (reg_class_contents[rclass], mode, regno)
  && !HARD_REGNO_MODE_OK (regno, mode))
{
  bad = 1;

In the past, if I use a register class that excludes the second half
of register pairs, it can't do anything because it requires both parts
of the register pair to be in the class (example: in_hard_reg_set_p
checks this).

Which way is the "right" way?
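
The conflict can be reproduced in a toy model.  Everything below is
a simplification with invented names: an 8-register file, a "pair"
mode that occupies two registers and may only start on an even
register, and bitmask register classes.  It mirrors the shape of the
two checks, not the real reload code.

```c
#include <stdbool.h>

#define TOY_NREGS 8       /* toy register file, not a real target */
#define TOY_PAIR_NREGS 2  /* the toy pair mode occupies two registers */

/* Toy HARD_REGNO_MODE_OK: a pair may only start on an even register. */
bool toy_mode_ok(int regno)
{
  return regno % 2 == 0;
}

/* Toy in_hard_reg_set_p: every register the value occupies must be
   a member of SET.  */
bool toy_in_set(unsigned set, int regno)
{
  for (int i = 0; i < TOY_PAIR_NREGS; i++)
    if (regno + i >= TOY_NREGS || !(set & (1u << (regno + i))))
      return false;
  return true;
}

/* Toy version of the quoted loop: the class is rejected if any
   register that fits in it is not OK for the mode.  */
bool toy_class_rejected(unsigned set)
{
  for (int regno = 0; regno < TOY_NREGS; regno++)
    if (toy_in_set(set, regno) && !toy_mode_ok(regno))
      return true;
  return false;
}

/* Number of starting registers in SET that can hold a pair. */
int toy_usable_pairs(unsigned set)
{
  int n = 0;
  for (int regno = 0; regno < TOY_NREGS; regno++)
    if (toy_in_set(set, regno) && toy_mode_ok(regno))
      n++;
  return n;
}
```

In this model the full class 0xFF has four usable pairs but is
rejected because odd registers also fit in it, while the even-only
class 0x55 is not rejected but has no usable pairs because the second
half of each pair is missing.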


proposal to make SIZE_TYPE more flexible

2013-10-29 Thread DJ Delorie

There are a couple of places in gcc where weird-sized pointers are an
issue.  While you can use a partial-integer mode for pointers, the
pointer *math* is still done in standard C types, which usually don't
match the modes of pointers and often result in suboptimal code.

My proposal is to allow the target to define its own type for
pointers, size_t, and ptrdiff_t to use, so that gcc can adapt to
weird pointer sizes instead of the target having to use power-of-two
pointer math.

This means the target would somehow have to register its new types
(int20_t and uint20_t in the MSP430 case, for example), as well as
specify that type to the rest of gcc.  There are some problems with
the naive approach, though, and the changes are somewhat pervasive.

So my question is, would this approach be acceptable?  Most of the
cases where new code would be added, have gcc_unreachable() at the
moment anyway.

Specific issues follow...

SIZE_TYPE is used...

tree.c: build_common_tree_nodes()
compares against fixed strings; if no match it
gcc_unreachable()'s - instead, look up type in
language core.

c-family/c-common.c
just makes a string macro, it's up to the target to
provide a legitimate value for it.

lto/lto-lang.c
compares against fixed strings to define THREE related
types, including intmax_type_node and
uintmax_type_node.  IMHO it should not be using
pointer sizes to determine integer sizes.

PTRDIFF_TYPE is used...

c-family/c-common.c
fortran/iso-c-binding.def
fortran/trans-types.c

These all use lookups; however fortran's
get_typenode_from_name only supports "standard" type
names.

POINTER_SIZE is used...

I have found in the past that gcc has issues if POINTER_SIZE
is not a power of two.  IIRC BLKmode was used to copy pointer
values.  Other examples:

  assemble_align (POINTER_SIZE);

can't align to non-power-of-two bits

   assemble_integer (XEXP (DECL_RTL (src), 0),
 POINTER_SIZE / BITS_PER_UNIT, POINTER_SIZE, 1);

need to round up, not truncate
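
A minimal sketch of both problems for a 20-bit pointer, assuming
8-bit units.  The helper names are made up for illustration.

```c
/* Bytes for BITS bits using plain integer division (truncates). */
int bytes_truncated(int bits, int bits_per_unit)
{
  return bits / bits_per_unit;
}

/* Bytes for BITS bits, rounded up to a whole unit. */
int bytes_rounded_up(int bits, int bits_per_unit)
{
  return (bits + bits_per_unit - 1) / bits_per_unit;
}

/* Nonzero if X is a power of two, i.e. a usable alignment in bits. */
int is_power_of_two(unsigned x)
{
  return x != 0 && (x & (x - 1)) == 0;
}
```

For 20 bits the truncating form yields 2 bytes, which loses the top
four bits; rounding up yields 3.  And 20 itself fails the
power-of-two test, so it cannot be passed as an alignment.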



Re: proposal to make SIZE_TYPE more flexible

2013-10-30 Thread DJ Delorie

> It is a deficiency that SIZE_TYPE is defined to be a string at all (and 
> likewise for all the other target macros for standard typedefs including 
> all those for <stdint.h>).  Separately, it's a deficiency that these 
> things are target macros rather than target hooks.

My thought was that there'd be a set of target hooks that returned a
TREE for various types.  But as an interim solution, the checks that
use strcmp() should fail into a type lookup-by-name.  I.e. replace the
gcc_unreachables with expensive table lookups.
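
Roughly, the interim shape would be a fixed-string fast path that
falls through to a slower search instead of aborting.  The sketch
below is a stand-alone model: struct toy_type, both tables, the
precisions, and the "__int20 unsigned" spelling are all hypothetical,
not GCC data structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tree type node. */
struct toy_type { const char *name; int precision; };

/* Types known by fixed string, as in the existing strcmp() chains.
   The precisions are illustrative only.  */
static const struct toy_type builtin_types[] = {
  { "unsigned int", 16 },
  { "long unsigned int", 32 },
};

/* Types a target registered; the name here is made up. */
static const struct toy_type target_types[] = {
  { "__int20 unsigned", 20 },
};

static const struct toy_type *search(const struct toy_type *tab, size_t n,
                                     const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp(tab[i].name, name) == 0)
      return &tab[i];
  return NULL;
}

/* Try the fixed strings first, then fall back to the registered
   types.  Returns NULL where the old code would have aborted.  */
const struct toy_type *toy_lookup_size_type(const char *name)
{
  const struct toy_type *t =
    search(builtin_types, sizeof builtin_types / sizeof builtin_types[0],
           name);
  if (t)
    return t;
  return search(target_types,
                sizeof target_types / sizeof target_types[0], name);
}
```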

I think there's an advantage in newlib to having a macro that expands
to the type-as-a-string needed for various types.  Of course, if gcc
had a typedef for those types, that would be better, but slightly
harder to autodetect.

> Instead of having an __int128 keyword (target-independent) targets
> would be able to define a set of N for which there are __intN
> keywords and for which everything handling __int128 will equally
> handle __intN.

That sounds great, and would elide the problem of a target needing to
register their own types for those.  I hadn't considered simplifying
the intN_t problem, since you *can* register custom types, I was
mostly just thinking about how to *use* those types for size_t et al.


Re: proposal to make SIZE_TYPE more flexible

2013-10-30 Thread DJ Delorie

So, given all that, is there any way to add the "target-specific
size_t" portion without waiting for-who-knows-how-long for the intN_t
and enum-size-type projects to finish?  Some form of interim API that
we can put in, so that we can start working on finding all the
assumptions about size_t, while waiting for the rest to finish?



weird logic about redeclaring builtins...

2013-10-31 Thread DJ Delorie

Given the logic in c/c-decl.c's diagnose_mismatched_decls, if a
built-in function is *also* declared in a system header (which is
common with newlib), gcc fails to mention either the builtin or the
declaration if you redeclare the function as something else.

I.e. this code:

int foo();
int foo;

gives the expected "previous declaration was at ..." error, and this
code:

int index;

gives the expected "built-in function 'index' declared ..." error.
However, this code:

char *index(const char *,int);
int index;

gives neither the built-in error nor the previous-decl error.  It
*only* gives the "'index' was redeclared" error.

Is this intentional?  Is there an easy fix for this that works for all
cases?


Re: proposal to make SIZE_TYPE more flexible

2013-11-13 Thread DJ Delorie

> > So, given all that, is there any way to add the "target-specific
> > size_t" portion without waiting for-who-knows-how-long for the intN_t
> > and enum-size-type projects to finish?  Some form of interim API that
> > we can put in, so that we can start working on finding all the
> > assumptions about size_t, while waiting for the rest to finish?
> 
> I have no idea how ugly something supporting target-specific strings would 
> be, since supporting such strings for these standard typedefs never seemed 
> to be a direction we wanted to go in.

I tried to hack in support for intN_t in a backend, and it was a maze
of initialization sequence nightmares.  So I guess we need to do the
intN_t part first.  Is someone working on this?  If not, is there a
spec I could use to get started on it?


Re: proposal to make SIZE_TYPE more flexible

2013-11-14 Thread DJ Delorie

> Instead of a target-independent __int128 keyword, there would be a set 
> (possibly empty) of __intN keywords, determined by a target hook.  

Or *-modes.def ?


Re: proposal to make SIZE_TYPE more flexible

2013-11-14 Thread DJ Delorie

> That would be one possibility - if the idea is to define __intN for all 
> integer modes not matching a standard type (and passing 
> targetm.scalar_mode_supported_p), I advise posting details of what effect 
> this would have for all targets so we can see how many such types would 
> get added.

I was thinking of using the existing PARTIAL/FRACTIONAL_INT_MODE macros.

avr/avr-modes.def:FRACTIONAL_INT_MODE (PSI, 24, 3);
bfin/bfin-modes.def:PARTIAL_INT_MODE (DI, 40, PDI);
m32c/m32c-modes.def:PARTIAL_INT_MODE (SI, 24, PSI);
msp430/msp430-modes.def:PARTIAL_INT_MODE (SI, 20, PSI);
rs6000/rs6000-modes.def:PARTIAL_INT_MODE (TI, 128, PTI);
sh/sh-modes.def:PARTIAL_INT_MODE (SI, 22, PSI);
sh/sh-modes.def:PARTIAL_INT_MODE (DI, 64, PDI);

I suspect we'd have to filter out the power-of-two PSI ones though, leaving:

avr/avr-modes.def:FRACTIONAL_INT_MODE (PSI, 24, 3);
bfin/bfin-modes.def:PARTIAL_INT_MODE (DI, 40, PDI);
m32c/m32c-modes.def:PARTIAL_INT_MODE (SI, 24, PSI);
msp430/msp430-modes.def:PARTIAL_INT_MODE (SI, 20, PSI);
sh/sh-modes.def:PARTIAL_INT_MODE (SI, 22, PSI);
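
The filter itself is just a power-of-two test on the precision.  A
small sketch over the seven entries quoted above (the struct and
function names are invented for illustration):

```c
#include <stddef.h>

struct partial_mode { const char *target; int bits; };

/* Target and precision in bits for each quoted *-modes.def entry. */
static const struct partial_mode modes[] = {
  { "avr", 24 }, { "bfin", 40 }, { "m32c", 24 }, { "msp430", 20 },
  { "rs6000", 128 }, { "sh", 22 }, { "sh", 64 },
};

static int pow2_p(int x)
{
  return x > 0 && (x & (x - 1)) == 0;
}

/* Store in OUT the precisions that are not a power of two, up to MAX
   entries, and return how many were stored.  */
size_t candidate_intn_bits(int *out, size_t max)
{
  size_t n = 0;
  for (size_t i = 0; i < sizeof modes / sizeof modes[0]; i++)
    if (!pow2_p(modes[i].bits) && n < max)
      out[n++] = modes[i].bits;
  return n;
}
```

Only the 128-bit and 64-bit entries drop out, leaving the five
entries listed.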

I'm assuming we need a mode to go with any type we create?  Otherwise,
we could add a FRACTIONAL_INT_TYPE(wrapper-mode, bits) macro to add
yet more.


Re: proposal to make SIZE_TYPE more flexible

2013-11-14 Thread DJ Delorie

> If you do want types without corresponding modes, that goes back to
> having a hook to list the relevant type sizes.

Perhaps a FRACTIONAL_INT_TYPE() macro then, for when there's no
machine mode to go with it?  Although I'm struggling to imagine a case
where a target would need to define a bit-sized type that doesn't
correspond to any machine mode.


Re: proposal to make SIZE_TYPE more flexible

2013-11-15 Thread DJ Delorie

> Everything handling __int128 would be updated to work with a 
> target-determined set of types instead.
> 
> Preferably, the number of such keywords would be arbitrary (so I suppose 
> there would be a single RID_INTN for them) - that seems cleaner than the 
> system for address space keywords with a fixed block from RID_ADDR_SPACE_0 
> to RID_ADDR_SPACE_15.

I did a scan through the gcc source tree trying to track down all the
implications of this, and there were a lot of them, and not just the
RID_* stuff.  There's also the integer_types[] array (indexed by
itk_*, which is its own mess) and c_common_reswords[] array, for
example.

I think it might not be possible to have one RID_* map to multiple
actual keywords, as there are few cases that need to know *which* intN
is used *and* have access to the original string of the token, and
many cases where code assumes a 1:1 relation between RID_*, a type,
and a keyword string.

IMHO the key design choices come down to:

* Do we change a few global const arrays to be dynamic arrays?

* We need to consider that "position in array" is no longer a suitable
  sort key for these arrays.  itk_* comes to mind here, but RID_* are
  abused sometimes too.  (note: I've seen this before, where PSImode
  isn't included in "find smallest mode" logic, for example, because
  it's not in the array in the same place as SImode)

* Need to dynamically map keywords/bitsizes/tokens to types in all the
  cases where we explicitly check for int128.  Some of these places
  have explicit "check types in the right order" logic hard-coded that
  may need to be changed to a data-search logic.

* The C++ mangler needs to know what to do with these new types.
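
As an illustration of the "data-search" point, here is a sketch that
picks the narrowest type by comparing precisions rather than by
trusting array order.  The struct, the table, and its precisions
(chosen to resemble a 16-bit target) are hypothetical and not GCC's
integer_types[] layout.

```c
#include <stddef.h>

/* Hypothetical description of one integer type. */
struct int_kind { const char *keyword; int precision; };

/* Deliberately not sorted by size: the 20-bit entry is appended
   last, the way a target-registered type might be.  */
static const struct int_kind kinds[] = {
  { "char", 8 }, { "short", 16 }, { "int", 16 }, { "long", 32 },
  { "long long", 64 }, { "__int20", 20 },
};

/* Find the narrowest type with at least BITS of precision by
   comparing precisions, not array positions.  Returns NULL if
   nothing is wide enough.  */
const struct int_kind *narrowest_kind(int bits)
{
  const struct int_kind *best = NULL;
  for (size_t i = 0; i < sizeof kinds / sizeof kinds[0]; i++)
    if (kinds[i].precision >= bits
        && (best == NULL || kinds[i].precision < best->precision))
      best = &kinds[i];
  return best;
}
```

A positional scan that stops at the first fit would return the
32-bit entry for an 18-bit value and never reach the 20-bit one.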

I'll attach my notes from the scan for reference...


Search for int128 ...
Search for c_common_reswords ...
Search for itk_ ...

--- . ---

tree-core.h

enum integer_type_kind is used to map all integer types "in
order" so we need an alternate way to map them.  Currently hard-codes
the itk_int128 types.

tree.h

defines int128_unsigned_type_node and int128_integer_type_node

uses itk_int128 and itk_unsigned_int128 - int128_*_type_node
is an [itk_*] array reference.

builtin-types.def

defines BT_INT128 but nothing uses it yet.

gimple.c

gimple_signed_or_unsigned_type maps types to their signed or
unsigned variant.  Two cases: one checks for int128
explicitly, the other checks for compatibility with int128.

tree.c

make_or_reuse_type maps size/signed to a
int128_integer_type_node etc.

build_common_tree_nodes makes int128_*_type_node if the target
supports TImode.

tree-streamer.c

preload_common_nodes() records one node per itk_*

--- LTO ---

lto.c

read_cgraph_and_symbols() reads one node per integer_types[itk_*]

--- C-FAMILY ---

c-lex.c

interpret_integer scans itk_* to find the best (smallest) type
for integers.

narrowest_unsigned_type assumes integer_types[itk_*] in
bit-size order, and assumes [N*2] is signed/unsigned pairs.

narrowest_signed_type: same.

c-cppbuiltin.c

__SIZEOF_INTn__ for each intN

c-pretty-print.c

prints I128 suffix for int128-sized integer literals.

c-common.c

int128_* has an entry in c_global_trees[]

c_common_reswords[] has an entry for __int128 -> RID_INT128

c_common_type_for_size maps int:128 to  int128_*_type_node

c_common_type_for_mode: same.

c_common_signed_or_unsigned_type - checks for int128 types.
same as gimple_signed_or_unsigned_type()?

c_build_bitfield_integer_type assigns int128_*_type_node for
:128 fields.

c_common_nodes_and_builtins maps int128_*_type_node to
RID_INT128 and "__int128".  Also maps to decl __int128_t

keyword_begins_type_specifier() checks for RID_INT128

--- C ---

c-tree.h

adds cts_int128 to c_typespec_keyword[]

c-parser.c

c_parse_init() reads c_common_reswords[] which has __int128,
maps one id to each RID_* code.

c_token_starts_typename() checks for RID_INT128

c_token_starts_declspecs() checks for RID_INT128

c_parser_declspecs() checks for RID_INT128

c_parser_attribute_any_word() checks for RID_INT128

c_parser_objc_selector() checks for RID_INT128

c-decl.c

error for "long __int128" etc throughout

declspecs_add_type() checks for RID_INT128

finish_declspecs() checks for cts_int128

--- FORTRAN ---

iso-c-binding.def

maps int128_t to c_int128_t via get_int_kind_from_width()

--- C++ ---

class.c

layout_class_types uses itk_* to find the best (smallest)
integer type for overlarge bitfields.

lex.c

init_reswords() reads c_common_reswords[], which includes __int128

rtti.c

emit_support_tinfos has a dummy list of types fund
