Avoiding subregs of PSImode when word_mode is HImode

2012-06-20 Thread Peter Bigot
I'm running into issues with the MSP430 and use of PSImode in packed
structures.  Here the native word size is HImode, but all general purpose
registers are capable of holding and operating on PSImode (=20 bit) values.

Due to instruction set limitations, on this architecture:

  (subreg:HI (reg:PSI 0))

is fine, but

  (subreg:HI (reg:PSI 2))

is inexpressible.  The upper 4 bits can only be accessed with multiple shift
instructions.

CANNOT_CHANGE_MODE_CLASS doesn't provide the control to express this
restriction on subregs.  The only way I've found to tell gcc this is to hack
validate_subreg to rule out non-zero offsets on PSImode.

Is there a non-hack way to tell GCC about this limitation?

More generally, is there a way to tell GCC not to try to decompose SImode
operations into subregs even though word_mode is HImode?  I've got
HARD_REGNO_NREGS, HARD_REGNO_NREGS_HAS_PADDING, and REGMODE_NATURAL_SIZE set
up as accurately as I can:

#define MSP430_MODE_NREGS(MODE) \
  (PSImode == (MODE)\
   ? 1  \
   : ((GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD))
#define HARD_REGNO_NREGS(REGNO, MODE) MSP430_MODE_NREGS(MODE)
#define HARD_REGNO_MODE_OK(REGNO, MODE) 1
#define MODES_TIEABLE_P(MODE1, MODE2)   \
  ((PSImode != (MODE1) || PSImode == (MODE2))   \
   && (PSImode != (MODE2) || 1 == MSP430_MODE_NREGS (MODE1)))
#define HARD_REGNO_NREGS_HAS_PADDING(REGNO, MODE) (PSImode == (MODE))
#define HARD_REGNO_NREGS_WITH_PADDING(REGNO, MODE) (PSImode == (MODE)
? 2 : MSP430_MODE_NREGS(MODE))
#define REGMODE_NATURAL_SIZE(MODE) (PSImode == (MODE) ?
2*UNITS_PER_WORD : UNITS_PER_WORD)

Thanks.

Peter


gcc enhancements for TI MSP430 and other chips with target-specific address/integer modes

2012-06-06 Thread Peter Bigot
Over the last couple months I've been updating an out-of-tree back end for
the TI MSP430 to support 20-bit functionality.  This is a 16-bit
microcontroller, where certain MCUs support 20-bit address and integer
operations with an extended ISA and registers that can hold slightly more
than one word.

GCC is not particularly ready to support such a chip, at least not with the
architectural features I wanted.  Some of the problems I encountered are
described at:

https://sourceforge.net/apps/mediawiki/mspgcc/index.php?title=Gcc47:20-Bit_Design#mixedmode

My thanks to Bernd Schmidt and Paulo Matos for providing work-in-progress on
their enhancements to address similar issues.  Although neither approach
suited my needs, it was very helpful to see what they'd done.

20-bit capable MSP430 support in gcc is now available out-of-tree; see
http://www.mail-archive.com/mspgcc-users@lists.sourceforge.net/msg10989.html.
The relevance to gcc is that I needed to make a variety of changes to gcc
proper to support this architecture.  These include:

 * [5f4ab69e90] Correction of some uses of Pmode where FUNCTION_MODE was
   appropriate ;

 * [03e6eb6cdb9, others] Correction of some uses of GET_MODE_BITSIZE where
   GET_MODE_PRECISION was required;

 * [ac85f5a18] Addition of target hooks that allow choice of different
   pointer and address modes in the same program, based on type trees.
   Since named address spaces aren't legal in C++, and aren't supported for
   functions, the existing capability could not be used.  The changes
   include a refactoring of the current named address space infrastructure
   to obtain the address space from the type tree;

 * [822bc1e12d] Store the actual pointer and address modes in mem_attrs
   instead of assuming they can be reconstructed from the address space
   alone;

 * [803171c91e; 1eee7ac13b] Addition of target hooks for finer control of
   declaration placement in specific sections;

 * [5f0473808b] Addition of target hooks that permit a target-specific
   integer type to be used for SIZE_TYPE and PTRDIFF_TYPE.  Though this is a
   gross hack, gcc's normal solution of promoting the internal 20-bit
   MODE_PARTIAL_INT PSImode to SImode is unworkable: conversion between
   PSImode and SImode is an extremely expensive operation.

 * [ffb61fc7ed] Allow selection of an exact-match MODE_PARTIAL_INT mode when
   smallest_mode_for_size is asked for MODE_INT satisfying a particular
   precision.

Since none of these issues could be demonstrated in upstream gcc, I only
filed bugzilla reports for the ones that seemed obviously wrong.

At this time, integration of these changes upstream is out of scope, but I
thought other gcc developers might be interested in seeing the patches,
especially if they need to support a similar target.

The MSP430 git repository for gcc can be browsed at
http://mspgcc.git.sourceforge.net/git/gitweb.cgi?p=mspgcc/gcc;a=summary and
the readme on that page points to a description of the branching conventions
used in the repository.  The patches that might be of interest to upstream
gcc are in the commits identified in brackets in the list above.

Thanks to everybody who's helped with this over the past several months.

Peter


Re: gcc doesn't accept specs options anymore

2012-05-07 Thread Peter Bigot
On Mon, May 7, 2012 at 6:08 AM, Christian Bruel  wrote:
>
>
> On 05/07/2012 12:09 PM, Joseph S. Myers wrote:
>> On Mon, 7 May 2012, Christian Bruel wrote:
>>
>>> Making the driver aware about all possible user defined options seems
>>> unpredictable. Was there any justification on removing this
>>> functionality or did I miss a point with the EXTRA_SPECS ?
>>
>> There are several motivations behind requiring all options to be defined
>> in .opt files, including:
>>
>> * For multilibs to be selected based on the semantics of options, using
>> values set in gcc_options structures by the same code as in cc1, rather
>> than by textual matching attempting to replicate semantics, the driver
>> needs to understand the semantics of options as similarly as possible to
>> cc1, rather than treating any options purely textually.
>>
>> * Every option supported by the compiler should be listed in --help (and
>> if the missing help information were all filled it, we could then make it
>> a build failure to have an option without help information).
>
> True but this removes the flexibility for a user or a BSP maintainer to
> define new options, e.g to the linker not the compiler, without access
> to the compiler sources using a --spec= file..
>
>>
>> * Structured option information enables consistency in how options are
>> processed and errors given for unknown options or arguments.
>>
>> * It would be useful for the compiler to be able to export structured
>> information about all its options for use by tools such as IDEs.
>
> If the option is only supported by a BSP, and not by the compiler, I
> don't see how the compiler could report it since it doesn't depend on
> static information known at build time.
> A direction would be to add this information in the user spec rules
>
> *ldruntime:
> + %{foo: -lfoo} %{help: "describe foo "}
>
> I'm not aware about such machinery. maybe an idea of improvement ?
>
>>
>> There is certainly room for more extensibility in option handling -
>> ideally there would not be one big enum with OPT_* values for all options
>> and one header with all the macros and structures, but instead front-end
>> and back-end options would use some form of separate namespace for their
>> options so the language- and target-independent compiler doesn't see the
>> options for other parts of the compiler; that fits into the modularity
>> theme that ideally it would be possible to build multiple front ends and
>> back ends into the compiler at once, or to build front ends and back ends
>> separately from the compiler.  But defining options through use in specs
>> wouldn't be part of that; rather more structured information about each
>> option would need to be provided somehow by a separately built front end
>> or back end.
>>
>> If you want -m options to select arbitrary board support packages (and the
>> existing ability to use -T to name a linker script isn't sufficient), then
>> a -mbsp= option, whose argument is not interpreted by GCC but may be
>> processed by whatever specs you are adding after GCC is installed, would
>> seem better than lots of separate -m options.
>
> I don't like this -mbsp= alternative a lot, seems confusing, not
> elegant, and not general for other uses (could be a runtime
> customization, not bsp).
> What about delimiters, something like --start-specs ... --end-specs ?
>
>>
>> As for options in specs included with GCC: they are all meant to be in the
>> .opt files.  I went through all the specs in all the config/ headers in
>> GCC and added options found to .opt files before disallowing options not
>> included in .opt files, but as there are about 500 such headers it's quite
>> possible I missed some specs-defined options in the process.
>
> yes it looks ok for the GCC specs, the problem is for the user spec
> files. This is a new legacy issue, I thought it was worth to either
> report it, and see if this need/can be fixed.

I think http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49858 is
essentially this issue.  It can probably be closed as "won't fix",
though I notice the spec file format is still documented in the user
manual.

Peter

>
> many thanks
>
> Christian
>
>
>>


Re: making sizeof(void*) different from sizeof(void(*)())

2012-05-02 Thread Peter Bigot
On Wed, May 2, 2012 at 8:08 AM, Paulo J. Matos  wrote:
> On 30/04/12 13:01, Peter Bigot wrote:
>>
>> I would like to see the technical details, if your code is released
>> somewhere.
>>
>
> Hi Peter,
>
> Sorry for the delay.
> The code is not released, however I can send you a patch against GCC 4.6.3
> sources (our GCC 4.7.0 port is not yet stable) of our changes and will also
> try to explain how it works.

Thanks; I'd appreciate it.

>> Without having started it yet, I'm thinking this can be done by
>> modifying build_pointer_type to generalize the
>> TARGET_ADDR_SPACE_POINTER_MODE to TARGET_TYPE_POINTER_MODE, pass it
>> the whole type instead of just the address space field, and moving
>> TARGET_ADDR_SPACE_POINTER_MODE support to the default implementation
>> for that hook. Likewise for build_reference_type.  Then judicious
>> application of attributes to types and decls would allow detection of
>> the situation where a non-standard pointer size is needed.  I'm hoping
>> there aren't too many other places where that work would get undone.
>>

I've had pretty good success with the above approach, involving the
following changes:

* Eliminate some gratuitous passing of function expressions through
  memory_address(), which insists on treating everything as though it was in
  ADDR_SPACE_GENERIC and therefore forces a conversion to Pmode; also fix
  one use of Pmode which probably should have been FUNCTION_MODE back when
  it was added by rms in 1992.

* Provide new TARGET_TYPE_* hooks paralleling TARGET_ADDR_SPACE_* so that
  the appropriate pointer and address modes can examine the whole type tree,
  rather than assuming the address space is sufficient.  This provides
  access to attributes that influence the selection of appropriate mode,
  which I need for both data and function types.

* Cache the desired pointer_mode and address_mode values in struct
  mem_attrs instead of assuming addrspace is sufficient to recalculate them.

All in all, not too painful.  These'll be in the mspgcc git repository
for gcc at 
http://mspgcc.git.sourceforge.net/git/gitweb.cgi?p=mspgcc/gcc;a=summary
in a couple weeks when I do another release.  Dunno whether it's worth
considering them for trunk sometime.

> As you will see, I haven't used anything related to address spaces feature
> in GCC.

Yeah, the fact that address spaces are ignored for function types, and
apparently aren't available in C++, makes them useless for my needs
even though the support infrastructure is very similar to what I
wanted.

Peter


Re: making sizeof(void*) different from sizeof(void(*)())

2012-04-30 Thread Peter Bigot
On Mon, Apr 30, 2012 at 4:50 AM, Robert Dewar  wrote:
> On 4/30/2012 4:16 AM, Paulo J. Matos wrote:
>>
>> Peter,
>>
>> We have a working backend for an Harvard Architecture chip where
>> function pointer and data pointers have necessarily different sizes. We
>> couldn't do this without changing GCC itself in strategic places and
>> adding some extra support in our backend. We haven't used address spaces
>> or any other existing GCC solution.

I would like to see the technical details, if your code is released somewhere.

Without having started it yet, I'm thinking this can be done by
modifying build_pointer_type to generalize the
TARGET_ADDR_SPACE_POINTER_MODE to TARGET_TYPE_POINTER_MODE, pass it
the whole type instead of just the address space field, and moving
TARGET_ADDR_SPACE_POINTER_MODE support to the default implementation
for that hook. Likewise for build_reference_type.  Then judicious
application of attributes to types and decls would allow detection of
the situation where a non-standard pointer size is needed.  I'm hoping
there aren't too many other places where that work would get undone.

>
> Sounds like a useful set of changes to have in the main sources, since
> this is hardly a singular need!

Yes.  Is there an existing bug/enhancement report for this capability?

Peter


Re: making sizeof(void*) different from sizeof(void(*)())

2012-04-29 Thread Peter Bigot
On Sun, Apr 29, 2012 at 7:51 AM, Georg-Johann Lay  wrote:
> Peter Bigot a écrit:
>
>> The MSP430's split address space and ISA make it expensive to place
>> data above the 64 kB boundary, but cheap to place code there.  So I'm
>> looking for a way to use HImode for data pointers, but PSImode for
>> function pointers.  If gcc supports this, it's not obvious how.
>>
>> I get partway there with FUNCTION_MODE and some hacks for the case
>> where the called object is a symbol, but not when it's a
>> pointer-to-function data object.
>
>
> I don't think it's a good solution to use different pointer sizes.
> You will run into all sorts of trouble -- both in the application and
> in GCC.

Seems to be true in gcc at least.  In C, I thought void* and void(*)()
were so fundamentally incompatible that it was perfectly legitimate to
have them be different sizes.  (It was memory models on the DEC Alpha
a decode or more ago that I recall as having that feature.)

Are there really no gcc back-ends that take advantage of this?

> Besides that, named address spaces are for data, not for functions.
> And they are supported for C only, not for C++, Objective-C, etc.

In that case, I'll ignore them and move to using variable and function
attributes.  It just seemed nice that it was possible to adapt the
pointer mode based on the address space; it'd be nicer if it could be
modified based on the full type, though.  TARGET_TYPE_POINTER_MODE
maybe.

>
>> The only candidate solution I've seen (and haven't yet tried) is to
>> somehow assign all functions to be in a special named address space
>> and use TARGET_ADDR_SPACE_POINTER_MODE to override the normal use of
>> ptr_mode in build_pointer_type.
>>
>> I haven't been able to identify an existing back-end that has such a
>> case, though I'd swear I've seen memory models like this for some
>> other processor in the past.
>>
>> Could somebody suggest a workable solution for 4.7.x?
>
>
> AVR has similar problems to solve, see
> http://gcc.gnu.org/ml/gcc/2012-01/msg00365.html
>
> On AVR, the problem only arises for functions outside 128 KiB
> because instructions are always 2-byte aligned. Consequently,
> function pointers hold word-addresses.
>
> An easy solution would be to align function entry points and
> labels of computed and non-local goto to 2^n bytes. That way
> 16-bit pointers can target addresses in 2^n * 64 KiB.
>
> Other approach is linker stubs as mantioned in the link above.

MSP430 has direct support for calling with a 20-bit pointer, so going
through an indirection step is suboptimal.  Also, a memory model
supporting 16-bit data pointers and 20-bit code pointers is standard
for commercial compilers for MSP430, so I'm a bit surprised at the
possibility that gcc wouldn't support it.  I assumed I just wasn't
finding the right tickle-points.

Peter


making sizeof(void*) different from sizeof(void(*)())

2012-04-28 Thread Peter Bigot
The MSP430's split address space and ISA make it expensive to place
data above the 64 kB boundary, but cheap to place code there.  So I'm
looking for a way to use HImode for data pointers, but PSImode for
function pointers.  If gcc supports this, it's not obvious how.

I get partway there with FUNCTION_MODE and some hacks for the case
where the called object is a symbol, but not when it's a
pointer-to-function data object.  As an example, bootstrapping fails
in libgcc/unwind-dw2-fde.c because the fde_compare pointer-to-function
object is stored in a HImode register instead of a PSImode register.

The only candidate solution I've seen (and haven't yet tried) is to
somehow assign all functions to be in a special named address space
and use TARGET_ADDR_SPACE_POINTER_MODE to override the normal use of
ptr_mode in build_pointer_type.

I haven't been able to identify an existing back-end that has such a
case, though I'd swear I've seen memory models like this for some
other processor in the past.

Could somebody suggest a workable solution for 4.7.x?

Thanks.

Peter


Re: locating unsigned type for non-standard precision

2012-04-27 Thread Peter Bigot
On Fri, Apr 27, 2012 at 4:29 AM, Georg-Johann Lay  wrote:
> Richard Guenther wrote:
>> [PR c/51527]
>>
>> I think the fix would be sth like
>>
>> Index: gcc/convert.c
>> ===
>> --- gcc/convert.c       (revision 186871)
>> +++ gcc/convert.c       (working copy)
>> @@ -769,6 +769,7 @@ convert_to_integer (tree type, tree expr
>>                    (Otherwise would recurse infinitely in convert.  */
>>                 if (TYPE_PRECISION (typex) != inprec)
>>                   {
>> +                   tree otypex = typex;
>>                     /* Don't do unsigned arithmetic where signed was wanted,
>>                        or vice versa.
>>                        Exception: if both of the original operands were
>> @@ -806,10 +807,11 @@ convert_to_integer (tree type, tree expr
>>                       typex = unsigned_type_for (typex);
>>                     else
>>                       typex = signed_type_for (typex);
>> -                   return convert (type,
>> -                                   fold_build2 (ex_form, typex,
>> -                                                convert (typex, arg0),
>> -                                                convert (typex, arg1)));
>> +                   if (TYPE_PRECISION (otypex) == TYPE_PRECISION (typex))
>> +                     return convert (type,
>> +                                     fold_build2 (ex_form, typex,
>> +                                                  convert (typex, arg0),
>> +                                                  convert (typex, arg1)));
>>                   }
>>               }
>>           }
>
> Thanks for the patch.
>
> I bootstrapped and regression-tested on i686-pc-linux-gnu.
>
> If it's ok with you I'd go ahead and install it.
>
> And maybe Peter could tell if it also fixes the issue on his platform.

It does (though so did moving the original test down past the last
change to typex).

I've switched mspgcc to use the version you committed.

Peter

>
> Johann


Re: locating unsigned type for non-standard precision

2012-04-24 Thread Peter Bigot
On Tue, Apr 24, 2012 at 9:01 AM, Richard Guenther
 wrote:
> On Tue, Apr 24, 2012 at 3:50 PM, Peter Bigot  wrote:
>> I've run into another issue supporting a 20-bit integer for which I'd
>> appreciate a hint.  With this code:
>>
>>   typedef long int __attribute__((__a20__)) int20_t;
>>   int20_t xi;
>>   int20_t addit () { xi += 0x54321L; }
>>
>> xi ends up in mode PSImode, which is a MODE_PARTIAL_INT with 20 bits of
>> precision and 32-bit width.
>>
>> convert() notices that, because the constant in the add expression is
>> SImode, there's an SImode add being truncated to a PSImode result, and
>> pushes the truncation down into the operands.
>>
>> The problem is this ends up in convert_to_integer, which detects that the
>> signed operation might overflow so calls unsigned_type_for() to get the
>> unsigned variant.
>>
>> Unfortunately, this ends up in c_common_type_for_size(), which knows nothing
>> about PSImode, and returns an unsigned type with 32 bits of precision when
>> asked for one with 20 bits of precision.
>
> That's expected - this function returns a type that is suitable for holding 
> all
> values, not a type that has necessarily matching precision.  If the caller
> wants such type it needs to verify what the function returned.  Which seems
> to me to be the correct fix for this (premature) optimization in
> convert_to_integer.

Thanks; I'll fix it that way and file a bug report to track it.

Peter


locating unsigned type for non-standard precision

2012-04-24 Thread Peter Bigot
I've run into another issue supporting a 20-bit integer for which I'd
appreciate a hint.  With this code:

  typedef long int __attribute__((__a20__)) int20_t;
  int20_t xi;
  int20_t addit () { xi += 0x54321L; }

xi ends up in mode PSImode, which is a MODE_PARTIAL_INT with 20 bits of
precision and 32-bit width.

convert() notices that, because the constant in the add expression is
SImode, there's an SImode add being truncated to a PSImode result, and
pushes the truncation down into the operands.

The problem is this ends up in convert_to_integer, which detects that the
signed operation might overflow so calls unsigned_type_for() to get the
unsigned variant.

Unfortunately, this ends up in c_common_type_for_size(), which knows nothing
about PSImode, and returns an unsigned type with 32 bits of precision when
asked for one with 20 bits of precision.  The expression is rewritten with
the 32-bit constant integer recast to the 32-bit unsigned type (instead
of the 20-bit one it might have used), and infinite recursion results.

Is it proper for c_common_type_for_size() to know about partial int modes
and return the best one available?  Is there a hook that would allow me to
do this customized in my back-end?  Or is there another way to get
unsigned_type_for() to return the "right" type for an unusual integer
precision?

Thanks.

Peter


Re: copyright assignment request: gcc, binutils, gdb

2012-04-11 Thread Peter Bigot
(Resent to right list.)

I'm maintaining an out-of-tree back-end (Texas Instruments MSP430)
based on years of contributions from a variety of people, affecting
binutils, gcc, and gdb.  Whether it can ever be merged (some of the
original contributors have disappeared), I'd like an assignment form
for what I've done so far and in the future so my own contributions
don't prevent this.  Some of this was personal; some under an
employer; some as a contractor.  Thanks.

Peter


Re: Setting precision for a PSImode type

2012-04-11 Thread Peter Bigot
On Wed, Apr 11, 2012 at 11:55 AM, Bernd Schmidt  wrote:
> On 04/11/2012 06:53 PM, Peter Bigot wrote:
>> On Mon, Mar 5, 2012 at 10:38 AM, Bernd Schmidt  
>> wrote:
>>> On 03/05/2012 05:24 PM, Peter Bigot wrote:
>>>> And is there any reason (other than it doesn't seem to have been done
>>>> before) to believe PSImode is the wrong way to support a
>>>> general-purpose 20-bit integral type in gcc?
>>>
>>> If you're using 4.7.0, it should be possible to use FRACTIONAL_INT_MODE
>>> and get reasonable results. However, it hasn't been tested much, since
>>> the final bits of the patch series which would have added 40 bit int
>>> support to the C frontend didn't make it in. See the discussion following
>>>  http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00079.html
>>
>> I stuck with PARTIAL_INT_MODE instead of FRACTIONAL_INT_MODE because
>> in my case the new type should not appear in the widen/narrow
>> hierarchy of MODE_INT,
>
> May I ask why not? Which problems do you run into?

The biggest one is that widening from 20-bit to 32-bit is an extremely
expensive operation: it was a 16-bit ISA, but some newer MCUs support
an extension with 20 bits in each register and a set of new
instructions that must be used to preserve the upper 4 bits.  Getting
bits 19..16 of a 20-bit register down into the low bits of a 16 bit
register requires a five-position rotate-through-carry.  The 20-bit
enhancement to the ISA was really intended only to support a larger
address space; to simplify validation of the machine description I've
chosen to allow it to be used for any integer operation, but I have no
reason to think that'll be common.

Narrowing from 32 to 20 is similarly expensive.

(It may be that the problems were simply that, because those
operations are so complex, my initial implementation doesn't support
them, and MODE_INT caused them to be generated too many times to make
progress.)

>> I've filed two bug reports for cases where BITSIZE needs to be updated
>> to PRECISION.
>
> Patches are best submitted to gcc-patches directly. Do you have a
> copyright assignment?

Not yet, though these are trivial.  I just started that process in a
separate email.

Peter

>
> Bernd


Re: Setting precision for a PSImode type

2012-04-11 Thread Peter Bigot
On Mon, Mar 5, 2012 at 10:38 AM, Bernd Schmidt  wrote:
> On 03/05/2012 05:24 PM, Peter Bigot wrote:
>> And is there any reason (other than it doesn't seem to have been done
>> before) to believe PSImode is the wrong way to support a
>> general-purpose 20-bit integral type in gcc?
>
> If you're using 4.7.0, it should be possible to use FRACTIONAL_INT_MODE
> and get reasonable results. However, it hasn't been tested much, since
> the final bits of the patch series which would have added 40 bit int
> support to the C frontend didn't make it in. See the discussion following
>  http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00079.html

I stuck with PARTIAL_INT_MODE instead of FRACTIONAL_INT_MODE because
in my case the new type should not appear in the widen/narrow
hierarchy of MODE_INT, but the patches you added for this were
critical to the success I've had:  4.7.0 is working very well for me.

I've filed two bug reports for cases where BITSIZE needs to be updated
to PRECISION.  Since my back-end is out of tree I can't provide test
cases, but if you could do so, or could apply the fixes as obvious, it
might help future developers avoid the problems I found.  Thanks.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52919

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52856

Peter


Re: RFC: -Wall by default

2012-04-11 Thread Peter Bigot
On Wed, Apr 11, 2012 at 11:18 AM, Ian Lance Taylor  wrote:
> Andrew Haley  writes:
>
>> On 04/05/2012 01:28 PM, Michael Veksler wrote:
>>
>>> As for specific warnings, I hate that the the code (a&&b || c&&d),
>>> which did not cause a warning on older gcc version now gives a
>>> warning. I would not want it on by default since it forces users to
>>> write too many parentheses in ((a&&b)||(c&&d)) which makes the
>>> expression less readable. What next? Warn about (a*b+c*d) ? I would
>>> hate to write ((a*b)+(c*d)).
>>>
>>> Sure, the precedence of: << vs. +; & vs. == ; & vs. && ; is less
>>> clear and deserves a warning, but: & vs. | ; && vs. || ; are at
>>> least as common as * vs. + that programmers ought to know them.
>>
>> Absolutely so, yes.  I'd consider this a bug in gcc, just as if it
>> warned for arithmetic.
>
> This one is an interesting case, since there are strong arguments on
> both sides.
>
> I enabled the C++ warning about the precedence of && and || (it's been
> in C for many years).  It found real bugs in real code, bugs that had
> existed for years.

Indeed; although I follow GCC conventions while writing code for integration
into GCC, in my own code I'd much rather write ((a*b)+(c*d)) and deal with
any readability issues by adding whitespace.

Making the intent explicit seems safer than making assumptions about the
expertise level of the people who will be reading and maintaining the code,
or adding to the cognitive load of a maintainer who does remember the
precedence rules.

(Not even hinting for a change in GCC style: it's your bikeshed, I just
visit once in a while.)

Peter


Re: Setting precision for a PSImode type

2012-03-26 Thread Peter Bigot
On Mon, Mar 5, 2012 at 10:38 AM, Bernd Schmidt  wrote:
> On 03/05/2012 05:24 PM, Peter Bigot wrote:
>> And is there any reason (other than it doesn't seem to have been done
>> before) to believe PSImode is the wrong way to support a
>> general-purpose 20-bit integral type in gcc?
>
> If you're using 4.7.0, it should be possible to use FRACTIONAL_INT_MODE
> and get reasonable results. However, it hasn't been tested much, since
> the final bits of the patch series which would have added 40 bit int
> support to the C frontend didn't make it in. See the discussion following
>  http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00079.html

Thanks; I've updated to 4.7.0.  In this thread I found:

>On 07/01/11 23:18, Bernd Schmidt wrote:
>>> What is the function of having both PARTIAL_INT_MODE and
>>> FRACTIONAL_INT_MODE?
>>
>> Not having to change all the targets using PARTIAL_INT_MODE immediately
>> to use the better mechanism.
>
> Also, come to think of it, preventing the rest of the compiler from
> trying to use such a mode in case the target only supports some very
> specific operations on it. A port could choose to use PImode, defined in
> machmode.def (and get __int40_t support), or it could add its own
> private PDImode to use in specific situations only.

Another major difference seems to be that PARTIAL_INT_MODE places the
type in MODE_PARTIAL_INT, while FRACTIONAL_INT_MODE places it in
MODE_INT, and allows it to participate in things like
GET_MODE_WIDER_MODE regardless of whether the fractional mode is
actually available on the target.

I'd really wanted to be able to make general-use 20-bit integers
available to the programmer, but it's looking ugly.  Does it still
make sense to use FRACTIONAL_INT_MODE on a target where there's
essentially an (incomplete) superset of instructions that operate on
the standard registers with 20-bits instead of their original 16-bits?
 Those instructions are only available on a subset of the target
chips, and are enabled by a target-specific user option.  genmodes
doesn't seem to support that constraint.

Peter


Re: Setting precision for a PSImode type

2012-03-05 Thread Peter Bigot
On Mon, Mar 5, 2012 at 10:38 AM, Bernd Schmidt  wrote:
> On 03/05/2012 05:24 PM, Peter Bigot wrote:
>> And is there any reason (other than it doesn't seem to have been done
>> before) to believe PSImode is the wrong way to support a
>> general-purpose 20-bit integral type in gcc?
>
> If you're using 4.7.0, it should be possible to use FRACTIONAL_INT_MODE
> and get reasonable results. However, it hasn't been tested much, since
> the final bits of the patch series which would have added 40 bit int
> support to the C frontend didn't make it in. See the discussion following
>  http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00079.html

Thanks for the reference; it very much seems what I'm trying to do
will be breaking new ground.  I'm still using 4.6.x, though now that
4.7.x has forked off trunk I may update if that would make the task
easier.  My preference would be to stick with PSImode, as I think
support for it in the machine description infrastructure will be
necessary (there are a large number of opcodes specific to operation
on 20-bit values).

Peter


Setting precision for a PSImode type

2012-03-05 Thread Peter Bigot
I'm enhancing the out-of-tree msp430 back end to support the 20-bit
extended registers of the 430X CPU.  I'd like to do this using a
native type rather than special-casing its use in addresses.  It seems
the best approach is to use PSImode, since the values are 20 bits when
in a register, and 32 bits when written to memory---exactly the sort
of thing PSImode appears designed to handle.

However, it seems that the implementation isn't complete: few back
ends use PSImode, and AFAICT none support a general purpose integral
builtin type .

The first problem I'm running into is that genmodes.c's
PARTIAL_INT_MODE doesn't permit me to set the bit size---it's locked
at 32 inherited from SImode.  Consequently any attempt to use
GET_MODE_BITSIZE (PSImode) returns the wrong answer, which causes
infinite recursion in convert_to_integer because the builtin type I
had to register so type_for_mode works uses 20 bits.

Is there a better way to support this sort of thing than extending
genmode with something like a PARTIAL_INT_MODE_PRECISION(M,B) macro
that will correctly set the underlying precision field to 20 instead
-1?  Is there a preferred spelling for that macro?

And is there any reason (other than it doesn't seem to have been done
before) to believe PSImode is the wrong way to support a
general-purpose 20-bit integral type in gcc?

Thanks.

Peter


Re: volatile correctness: combine vs. target.md

2011-12-02 Thread Peter Bigot
On Fri, Dec 2, 2011 at 9:26 AM, Dave Korn  wrote:
> On 01/12/2011 21:40, Georg-Johann Lay wrote:
>
>> It's not unusual because:
>>
>> * It's not unusual to write down SFRs as violatile memory like
>>   (*((volatile unsigned int*) 0x1234)) or as a cast to a composite
>>   that reflects the SFRs bit(field)s.
>>
>> * It's not unusual that microcontrollers can access specific parts
>>   of memory -- in particular I/O -- with special instructions.
>
>  Great.  Now I'm going to be hearing Tom Jones in my head for the rest of the
> day!  :)
>
> (ObTopic: Andrew H's definition of volatile is completely right AIUI and the
> case reported by Peter B is definitely a very bad bug.  Volatile operations
> are externally visible and must be executed in source-code order.)

My case was probably a flaw in my machine description.  Like many
microcontrollers (including AVR), the msp430 maps peripherals as
volatile memory references, and supports instructions that operate
directly on that memory.  The original msp430 back-end did a dirty
trick with special predicates that temporarily set the global
"volatile_ok" flag in order to trick gcc into generating acceptable
code (combining a sequence of load/modify/store insns into a single
RMW insn); I found that to be untrustworthy and replaced it with
peepholes.

Richard's comment that moving unrelated volatiles around each other is
acceptable to core gcc, if not C, may have enabled the issue I saw,
but as far as I'm concerned it was a bug in my stuff, even if I was
confused enough to think for a while it might be technically allowed.

Peter

>
>    cheers,
>      DaveK
>


Re: volatile correctness: combine vs. target.md

2011-12-02 Thread Peter Bigot
On Fri, Dec 2, 2011 at 5:46 AM,  wrote:
>
> ...
> >> It's never correct to exchange volatile accesses.
> >
> >That's not true.  volatile accesses to different memory locations
> >have no special dependence.  If it happens that GCC doesn't
> >do this kind of things then this is only because most passes
> >don't thouch volatile stmts at all (thus the reports for sub-optimal
> >code with volatile - we don't even try to do legal transformations).
>
> I'm confused.  Do you mean that in
>        Volatile int *a, *b;
>        Int j, k;
>        J = *a; k = *b;
> it is ok to fetch *b before *a?  I can't think of any device driver writer 
> who would expect that.

While I don't have the reproducing case to hand, I did find a
situation where something like:

  volatile unsigned int counter; // tied to system clock
  volatile unsigned int ioport;
  t0 = counter;
  // stuff with ioport
  t1 = counter;

ended up being:

  //stuff with ioport
  t0 = counter;
  t1 = counter;

This might have been due to misbehavior of a machine role in my
target, but at the time I convinced myself that technically the move
was legal even though it completely invalidated the timing I was
trying to do.

Peter


Re: approaches to carry-flag modelling in RTL

2011-10-29 Thread Peter Bigot
On Sat, Oct 29, 2011 at 10:58 AM, Richard Henderson  wrote:
> On 10/29/2011 05:41 AM, Peter Bigot wrote:
>> It seems cc0 should probably still be preferred for CISC-style
>> architectures like the MSP430.  I'll give that approach a try.
>
> I think that's somewhat unfair.  Take a close look at the RX and
> mn10300 ports -- they're what I would call the most up-to-date
> of the cisc-y ports.

Both RX and mn10300 may be cleaner architectures than MSP430, in that the
machine descriptions only provide SI and DI versions of the insns.  I admit
I'm discouraged that it's necessary to have (1) an addsi3 insn, (2) an
*addsi3_flags insn that differs from it only by adding a CC_REG set in the
RTL template and a reload_completed condition, and (3) an addsi3_flags
expander that replicates the RTL template of *addsi3_flags.

For MSP430, the native operand sizes are QI, HI, and PSI, and I need all
three, plus expanders for the derived SI and DI variants where gcc's
ignorance of machine-dependent carry support causes the default expansions
to be poor.  The same sort of relationships between the instructions and the
flags register exist for add, sub, ior, and, xor, and left and right shift
operations.  Even using iterators, this makes for an ugly machine
description.

The last straw for me is that adding CC_REG-related patterns into the RTL
bleeds into the peephole definitions that I need to eliminate unnecessary
(arguably wrong) register moves that get generated for RMW operations on
memory-mapped peripherals.  Those are already pretty hard to understand.

Using CC_MODE in this case certainly appears much more complicated
than using cc0.  Do you believe the effort required to get it right would be
justified?  Where would the win come from?

Peter


Re: approaches to carry-flag modelling in RTL

2011-10-29 Thread Peter Bigot
On Fri, Oct 28, 2011 at 11:59 AM, Richard Henderson  wrote:
> On 10/28/2011 06:49 AM, Peter Bigot wrote:
>> I'm inclined to follow sparc's lead, but is one or another of the choices
>> more likely to help combine/reload/etc do a better job?
>
> I don't know.
>
> In the case of RX, we don't model CC_REG until after reload, so combine really
> doesn't get a shot at it.
>
> Be careful here.  If you explicitly model the carry flag before reload, you 
> need
> to have an ADD instruction that can avoid any flags modification.  Reload 
> needs
> to generate such instructions in some cases, and needs to be able to insert 
> them
> between any two arbitrary insns.
>
> If you're like sparc (separate add, addcc insns), or i386 (separate add, lea 
> insns),
> then you're fine.  If you're like m68k or RX and have only an add that 
> clobbers
> the flags, then you'll have to delay splitting flags-using patterns until 
> after
> reload is complete.

Interesting.  The MSP430 ISA has add and addc insns, but both affect the
flags.  Most instructions affect the flags.

Based on what I've encountered so far, between having to duplicate many
insns (one with CC_REG, one without), adding splits to convert between them,
and making a hash of the templates for the peepholes that enable efficient
access to volatile RMW peripheral memory, it looks like using CC_MODE is
going to create a maintenance nightmare.

I'd been moving on this path based on the recommendation in the "Condition Code
Status" section of GCC Internals to use CC_MODE for new ports.  (Though the
MSP430 isn't a "new port", in 2003 the original cc0 solution using
NOTICE_UPDATE_CC was converted to something obscene that ends up instead
walking the insn chain calling get_attr_cc during output template generation
to identify the most recent change.  No idea why; the CVS comment just
says "cmp fixes and improvements".)

It seems cc0 should probably still be preferred for CISC-style
architectures like the
MSP430.  I'll give that approach a try.

Thanks.

Peter


approaches to carry-flag modelling in RTL

2011-10-28 Thread Peter Bigot
I'm rewriting a back-end originally based on AVR to eliminate insns for
multi-word operations (output templates like "add\;addc") and to use MODE_CC
instead of an unusual attribute-based approach.  The motivation is that I've
mostly found gcc does a better job than the existing back-end if it's shown
what's actually going on.

Part of this update requires correctly modelling the carry flag, for
plus/minus and for rotate through carry.  As noted in recent email here,
preserving the correct instruction order when expanding multi-word expressions
requires a set/use relation between the word-mode insns rather than a
simple clobber/use relation.

I've found several examples where back-ends model the carry in RTL.

Sparc does:

(plus:SI
  (plus:SI
(match_operand:SI 1 "arith_operand" "%r")
(match_operand:SI 2 "arith_operand" "rI"))
  (ltu:SI (reg:CC_NOOV 100) (const_int 0]

RX does:

(plus:SI
  (plus:SI
(ltu:SI (reg:CC CC_REG) (const_int 0))
(match_operand:SI 1 "register_operand"  "%0,0,0,0,0,0"))
  (match_operand:SI   2 "rx_source_operand"
"r,Sint08,Sint16,Sint24,i,Q")))

stormy16 does:

(plus:HI
  (plus:HI
(match_operand:HI 1 "register_operand" "%0,0,0")
(zero_extend:HI (reg:BI CARRY_REG)))
  (match_operand:HI 2 "xs_hi_nonmemory_operand" "L,Ir,i")))

The variation points are:

(a) where the carry operand appears in the plus expressions;

(b) whether it's expressed as an ltu zero comparison or a zero_extend.

I'm inclined to follow sparc's lead, but is one or another of the choices
more likely to help combine/reload/etc do a better job?

Thanks.

Peter


Re: Expanding instructions with condition codes inter-deps

2011-10-21 Thread Peter Bigot
On Fri, Oct 21, 2011 at 4:41 PM, Richard Henderson  wrote:
> On 10/21/2011 10:15 AM, Paulo J. Matos wrote:
>> So I have implemented the nadd and addc as:
>>
>> (define_insn "negqi2"
>>   [(set (match_operand:QI 0 "register_operand" "=c")
>>         (neg:QI (match_operand:QI 1 "register_operand" "0")))
>>    (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0)))
>>    (clobber (reg:CC RCC))]
>>   ""
>> {
>>     operands[2] = const0_rtx;
>>     return  "nadd\\t%0,%2";
>> })
>
> There are lots of parts of the compiler that don't optimize well when an
> insn has more than one output.  For the normal insn, just clobber the flags;
> don't include a second SET.

In my similar case, I'm implementing a multi-word rotate by using an
arithmetic shift that overflows into carry, followed by a second rotate
that shifts in from carry for the upper word:

(define_insn "rla1"
  [(set (match_operand:INTRegModes 0 "nonimmediate_operand" "+rm")
(ashift: (match_dup 0) (const_int 1)))
   (set (reg:CC_C REGNO_SR) (unspec:CC_C [(match_dup 0)] UNSPEC_ROTC))]
  ""
  { return msp430_output_template (mode, 0, "rla", NULL); }
  )

(define_insn "rlc1"
  [(set (match_operand:INTRegModes 0 "nonimmediate_operand" "+rm")
(unspec: [(match_dup 0)] UNSPEC_RLC))
   (use (reg:CC_C REGNO_SR))
   (set (reg:CC_C REGNO_SR) (unspec:CC_C [(match_dup 0)] UNSPEC_ROTC))]
  ""
  { return msp430_output_template (mode, 0, "rlc", NULL); }
  )

I found that if I used (clobber (reg:CC_C REGNO_SR)) instead of set that the
dependency was not preserved, and the rlc ended up after the rla in the
final instruction stream.  This is consistent with your previous comment in
this thread:

> The thing that's almost certainly missing is that the NAND pattern
> must SET your flags register, not simply clobber it.  Otherwise the
> dependency between the ADDC and the NAND will never be created properly.

Is there a better way to solve this that doesn't introduce multiple outputs in
one insn?

Peter

>
>> (define_insn "addc_internal"
>>   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>>         (plus:QI
>>           (plus:QI
>>             (ltu:QI (reg:CC RCC) (const_int 0))
>>             (match_operand:QI 1 "nonimmediate_operand" "%0"))
>>           (match_operand:QI 2 "general_operand" "cwmi")))
>>    (use (reg:CC_C RCC))
>>    (clobber (reg:CC RCC))]
>>   ""
>>   "addc\\t%0,%f2")
>
> You don't need the USE, because you mention RCC inside the LTU.
>
>> (define_insn "*addc_internal_flags"
>
> Likewise.
>
>> A couple of things to note:
>> * negqi (which generates the nadd x, y equivalent to -x + y) has a
>> set RCC in C mode followed by a clobber. The set in C mode doesn't
>> show up in the _flags variant which is used only for the compare-elim
>> since it doesn't really matter and it already contains a set RCC
>> anyway.
>
> Surely the NADD insn is simply a normal subtract (with reversed operands).
> You shouldn't *need* to implement NEG at all, as the middle-end will let
> NEG expand via MINUS.
>
> Just so you know...
>
>> * is this enough for GCC to understand that anything that clobbers
>> RCC or specifically touches the RCC in C mode shouldn't go in between
>> these two instructions?
>
> Yes.
>
>> Also, do I need to specify in the RCC
>> clobber, exactly which flags are clobbered, or should I use a set
>> instead?
>
> No, the compiler will assume the entire register is changed, no matter
> what CCmode you place there.
>
>> * in the case of using sets, it was easy in the case of the negqi of
>> findind the source of the set RCC, however, it's not so easy for the
>> general case. Is unspec the answer? Is unspec the way of saying:
>> "hey, I am setting RCC in Cmode here, you shouldn't bother about the
>> value that I put there. Just know that RCC is going to be set."
>
> You can often just use (compare x y) as well, assuming that the flags
> are set "appropriately".  GCC doesn't assume anything about the
> contents of the intermediate CCmode object, but does assume that
>
>  (lt (compare x y) (const_int 0))
>
> produces the same value as
>
>  (lt x y)
>
> But, yes, if there's no obvious comparison, then unspec is ok.
>
>
> r~
>


Re: IRA misses register range overlap

2011-09-16 Thread Peter Bigot
On Thu, Sep 15, 2011 at 10:34 AM, Vladimir Makarov  wrote:
> On 09/15/2011 11:16 AM, Peter Bigot wrote:
>>
>> In the msp430 back end, hard registers 4 through 15 are HImode, with
>> adjacent register sequences used for SImode and DImode.  In preparation
>> for
>> a library call, I'm emitting RTL that assigns values directly to reg:SI 4.
>>
>> Despite that, in gcc 4.5.x IRA choses reg:HI 4 as the destination
>> for a pseudo-register for a preceding assignment, and does nothing to
>> preserve the value over the span where the register is part of an SI
>> value.
>> The subsequence:
>>
>>   (insn 2 4 3 2 (set (reg/v:HI 38 [ x ])
>>           (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
>>        (expr_list:REG_DEAD (reg:HI 15 r15 [ x ])
>>           (nil)))
>>
>>   (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>
>>   (note 6 3 10 2 NOTE_INSN_DELETED)
>>
>>   (insn 10 6 11 2 (set (reg:SI 8 r8)
>>           (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]> 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>>   (insn 11 10 12 2 (set (reg:SI 4 r4)
>>           (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>> with:
>>
>>   insn=2, live_throughout: 1, dead_or_set: 15, 38
>>   insn=10, live_throughout: 1, 38, dead_or_set: 8, 9
>>   insn=11, live_throughout: 1, 8, 9, 38, dead_or_set: 4, 5
>>   insn=12, live_throughout: 1, 38, dead_or_set: 4, 5, 6, 7, 8, 9, 10,
>> 11, 12, 13, 14, 15
>>
>> becomes:
>>
>>   (insn 2 4 3 2 (set (reg/v:HI 4 r4 [orig:38 x ] [38])
>>           (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
>>        (nil))
>>
>>   (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>
>>   (note 6 3 10 2 NOTE_INSN_DELETED)
>>
>>   (insn 10 6 11 2 (set (reg:SI 8 r8)
>>           (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]> 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>>   (insn 11 10 12 2 (set (reg:SI 4 r4)
>>           (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>> and the subsequent reference to reg:HI 4 (formerly reg/v:HI 38) has value
>> 33614 instead of the user's parameter.
>>
>> Could somebody suggest where should I look to understand why this is
>> happening and how should it be fixed?
>>
> The best way is to file a bug to http://gcc.gnu.org/bugzilla/.  You should
> submit the test (the smaller the test, the better) and how to reproduce it:
> how to build gcc (configure options) and how to call the built gcc to
> reproduce results.
>
> I think you could look at ira dump file and check that allocno for p38
> conflicting with hard reg 4 *and* 5.  If it is not, the problem is in
> conflict calculation.  Otherwise, it might be IRA hard register assignment
> or in reload (the worst case).

My investigation leads me to believe that IRA does fail to detect the
conflicts, as you suggested.  I'm generating RTL that uses specific general
registers that are not distinguished by register classes or marked in the
insn constraints.  This is to prepare for calls to support routines that
don't follow the standard ABI.  (Perhaps this is the wrong way to solve
that problem, though I believe AVR and perhaps other platforms do this
as well.)

I've attached to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50427 a patch
that augments ira-lives.c (process_single_reg_class_operands) to detect this
and mark the corresponding registers as conflicts.  It works on this case, and
doesn't create any problems that are obvious to me.

I'd appreciate a review of whether this approach is right, or if the conflicts
should be recorded somewhere else.

Peter

>
> But having only info you provided it is very hard to say what is wrong.
>
>


Re: IRA misses register range overlap

2011-09-15 Thread Peter Bigot
On Thu, Sep 15, 2011 at 4:09 PM, Vladimir Makarov  wrote:
> On 09/15/2011 03:06 PM, Peter Bigot wrote:
>>
>> On Thu, Sep 15, 2011 at 10:34 AM, Vladimir Makarov
>>  wrote:
>>>
>>> On 09/15/2011 11:16 AM, Peter Bigot wrote:
>>>>
>>>> In the msp430 back end, hard registers 4 through 15 are HImode, with
>>>> adjacent register sequences used for SImode and DImode.  In preparation
>>>> for
>>>> a library call, I'm emitting RTL that assigns values directly to reg:SI
>>>> 4.
>>>>
>>>> Despite that, in gcc 4.5.x IRA choses reg:HI 4 as the destination
>>>> for a pseudo-register for a preceding assignment, and does nothing to
>>>> preserve the value over the span where the register is part of an SI
>>>> value.
>>>> The subsequence:
>>>>
>>>>   (insn 2 4 3 2 (set (reg/v:HI 38 [ x ])
>>>>           (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
>>>>        (expr_list:REG_DEAD (reg:HI 15 r15 [ x ])
>>>>           (nil)))
>>>>
>>>>   (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>>>
>>>>   (note 6 3 10 2 NOTE_INSN_DELETED)
>>>>
>>>>   (insn 10 6 11 2 (set (reg:SI 8 r8)
>>>>           (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]>>> 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
>>>>        (nil))
>>>>
>>>>   (insn 11 10 12 2 (set (reg:SI 4 r4)
>>>>           (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
>>>>        (nil))
>>>>
>>>> with:
>>>>
>>>>   insn=2, live_throughout: 1, dead_or_set: 15, 38
>>>>   insn=10, live_throughout: 1, 38, dead_or_set: 8, 9
>>>>   insn=11, live_throughout: 1, 8, 9, 38, dead_or_set: 4, 5
>>>>   insn=12, live_throughout: 1, 38, dead_or_set: 4, 5, 6, 7, 8, 9, 10,
>>>> 11, 12, 13, 14, 15
>>>>
>>>> becomes:
>>>>
>>>>   (insn 2 4 3 2 (set (reg/v:HI 4 r4 [orig:38 x ] [38])
>>>>           (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
>>>>        (nil))
>>>>
>>>>   (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>>>
>>>>   (note 6 3 10 2 NOTE_INSN_DELETED)
>>>>
>>>>   (insn 10 6 11 2 (set (reg:SI 8 r8)
>>>>           (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]>>> 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
>>>>        (nil))
>>>>
>>>>   (insn 11 10 12 2 (set (reg:SI 4 r4)
>>>>           (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
>>>>        (nil))
>>>>
>>>> and the subsequent reference to reg:HI 4 (formerly reg/v:HI 38) has
>>>> value
>>>> 33614 instead of the user's parameter.
>>>>
>>>> Could somebody suggest where should I look to understand why this is
>>>> happening and how should it be fixed?
>>>>
>>> The best way is to file a bug to http://gcc.gnu.org/bugzilla/.  You
>>> should
>>> submit the test (the smaller the test, the better) and how to reproduce
>>> it:
>>> how to build gcc (configure options) and how to call the built gcc to
>>> reproduce results.
>>
>> Unfortunately, the former msp430 maintainers never pushed the back-end
>> upstream, so filing a bug on a target that isn't part of gcc is
>> unlikely to get much attention.  It's also pretty specific to the
>> machine description, so I doubt it could be reproduced on another
>> target.
>>
>> I was hoping for more of a "yes, that happens if you don't [missed
>> back-end requirement here]", or even a "no, that shouldn't be
>> happening".
>>
> It should not be happening.  It is a bug.  It should be fixed in RA (IRA or
> reload).  IRA/reload works for many targets where the same situation occurs.
>  So it is hard to say what is wrong without more info.

Based on what you've said I've provided source and the before/after
IRA dump files in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50427.
I'll continue to dig into this; suggestions welcome.

Peter

> Although RA is directed by many machine-dependent macros and one macro might
> return a wrong value (e.g.  number registers needed to hold value of a
> mode).  But it is less probable.


Re: IRA misses register range overlap

2011-09-15 Thread Peter Bigot
On Thu, Sep 15, 2011 at 10:34 AM, Vladimir Makarov  wrote:
> On 09/15/2011 11:16 AM, Peter Bigot wrote:
>>
>> In the msp430 back end, hard registers 4 through 15 are HImode, with
>> adjacent register sequences used for SImode and DImode.  In preparation
>> for
>> a library call, I'm emitting RTL that assigns values directly to reg:SI 4.
>>
>> Despite that, in gcc 4.5.x IRA choses reg:HI 4 as the destination
>> for a pseudo-register for a preceding assignment, and does nothing to
>> preserve the value over the span where the register is part of an SI
>> value.
>> The subsequence:
>>
>>   (insn 2 4 3 2 (set (reg/v:HI 38 [ x ])
>>           (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
>>        (expr_list:REG_DEAD (reg:HI 15 r15 [ x ])
>>           (nil)))
>>
>>   (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>
>>   (note 6 3 10 2 NOTE_INSN_DELETED)
>>
>>   (insn 10 6 11 2 (set (reg:SI 8 r8)
>>           (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]> 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>>   (insn 11 10 12 2 (set (reg:SI 4 r4)
>>           (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>> with:
>>
>>   insn=2, live_throughout: 1, dead_or_set: 15, 38
>>   insn=10, live_throughout: 1, 38, dead_or_set: 8, 9
>>   insn=11, live_throughout: 1, 8, 9, 38, dead_or_set: 4, 5
>>   insn=12, live_throughout: 1, 38, dead_or_set: 4, 5, 6, 7, 8, 9, 10,
>> 11, 12, 13, 14, 15
>>
>> becomes:
>>
>>   (insn 2 4 3 2 (set (reg/v:HI 4 r4 [orig:38 x ] [38])
>>           (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
>>        (nil))
>>
>>   (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>
>>   (note 6 3 10 2 NOTE_INSN_DELETED)
>>
>>   (insn 10 6 11 2 (set (reg:SI 8 r8)
>>           (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]> 0x7f032064f960 seed>) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>>   (insn 11 10 12 2 (set (reg:SI 4 r4)
>>           (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
>>        (nil))
>>
>> and the subsequent reference to reg:HI 4 (formerly reg/v:HI 38) has value
>> 33614 instead of the user's parameter.
>>
>> Could somebody suggest where should I look to understand why this is
>> happening and how should it be fixed?
>>
> The best way is to file a bug to http://gcc.gnu.org/bugzilla/.  You should
> submit the test (the smaller the test, the better) and how to reproduce it:
> how to build gcc (configure options) and how to call the built gcc to
> reproduce results.

Unfortunately, the former msp430 maintainers never pushed the back-end
upstream, so filing a bug on a target that isn't part of gcc is
unlikely to get much attention.  It's also pretty specific to the
machine description, so I doubt it could be reproduced on another
target.

I was hoping for more of a "yes, that happens if you don't [missed
back-end requirement here]", or even a "no, that shouldn't be
happening".

It looks almost like the fact that I'm generating RTL that references
the hard registers directly is ignored by IRA for conflict resolution,
which seems to only occur among the registers that it's responsible
for assigning.  I'll look again through the docs to see if there's
some hints that I'm missing a step.

If anybody else has further suggestions or insights I'd appreciate
them.  Thanks.

Peter

> I think you could look at ira dump file and check that allocno for p38
> conflicting with hard reg 4 *and* 5.  If it is not, the problem is in
> conflict calculation.  Otherwise, it might be IRA hard register assignment
> or in reload (the worst case).
>
> But having only info you provided it is very hard to say what is wrong.
>
>


IRA misses register range overlap

2011-09-15 Thread Peter Bigot
In the msp430 back end, hard registers 4 through 15 are HImode, with
adjacent register sequences used for SImode and DImode.  In preparation for
a library call, I'm emitting RTL that assigns values directly to reg:SI 4.

Despite that, in gcc 4.5.x IRA choses reg:HI 4 as the destination
for a pseudo-register for a preceding assignment, and does nothing to
preserve the value over the span where the register is part of an SI value.
The subsequence:

  (insn 2 4 3 2 (set (reg/v:HI 38 [ x ])
  (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
   (expr_list:REG_DEAD (reg:HI 15 r15 [ x ])
  (nil)))

  (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

  (note 6 3 10 2 NOTE_INSN_DELETED)

  (insn 10 6 11 2 (set (reg:SI 8 r8)
  (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]  ) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
   (nil))

  (insn 11 10 12 2 (set (reg:SI 4 r4)
  (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
   (nil))

with:

  insn=2, live_throughout: 1, dead_or_set: 15, 38
  insn=10, live_throughout: 1, 38, dead_or_set: 8, 9
  insn=11, live_throughout: 1, 8, 9, 38, dead_or_set: 4, 5
  insn=12, live_throughout: 1, 38, dead_or_set: 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15

becomes:

  (insn 2 4 3 2 (set (reg/v:HI 4 r4 [orig:38 x ] [38])
  (reg:HI 15 r15 [ x ])) test.c:28 21 {*movhi2}
   (nil))

  (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

  (note 6 3 10 2 NOTE_INSN_DELETED)

  (insn 10 6 11 2 (set (reg:SI 8 r8)
  (mem/c/i:SI (symbol_ref:HI ("seed") [flags 0x2]  ) [2 seed+0 S4 A16])) test.c:14 24 {*movsi2}
   (nil))

  (insn 11 10 12 2 (set (reg:SI 4 r4)
  (const_int 33614 [0x834e])) test.c:14 24 {*movsi2}
   (nil))

and the subsequent reference to reg:HI 4 (formerly reg/v:HI 38) has value
33614 instead of the user's parameter.

Could somebody suggest where should I look to understand why this is
happening and how should it be fixed?

Thanks.

Peter


Re: Just what are rtx costs?

2011-08-22 Thread Peter Bigot
On Sun, Aug 21, 2011 at 12:01 PM, Georg-Johann Lay  wrote:
>
> Richard Sandiford schrieb:
>>
>> Georg-Johann Lay  writes:
>>
>>> Richard Sandiford schrieb:
>>>
 I've been working on some patches to make insn_rtx_cost take account
 of the cost of SET_DESTs as well as SET_SRCs.  But I'm slowly beginning
 to realise that I don't understand what rtx costs are supposed to 
 represent.

 AIUI the rules have historically been:

  1) Registers have zero cost.

  2) Constants have a cost relative to that of registers.  By extension,
    constants have zero cost if they are as cheap as a register.

  3) With an outer code of SET, actual operations have the cost
    of the associated instruction.  E.g. the cost of a PLUS
    is the cost of an addition instruction.

  4) With other outer codes, actual operations have the cost
    of the combined instruction, if available, or the cost of
    a separate instruction otherwise.  E.g. the cost of a NEG
    inside an AND might be zero on targets that support BIC-like
    instructions, and COSTS_N_INSNS (1) on most others.

 [...]

 But that hardly seems clean either.  Perhaps we should instead make
 the SET_SRC always include the cost of the SET, even for registers,
 constants and the like.  Thoughts?
>>>
>>> IMO a clean approach would be to query the costs of a whole insn (resp. 
>>> it's pattern) rather than the cost of an RTX.  COSTS_N_INSNS already 
>>> indicates that the costs are compared to *insn* costs i.e. cost of the 
>>> whole pattern (modulo clobbers).
>>
>> The problem is that we sometimes want the cost of something that cannot
>> be done using a single instruction.  E.g. some CONST_INTs take several
>> instructions to create on MIPS.  In this case the costs are really
>> measuring the cost of an emit_move_insn sequence, not a single insn.
>>
>> I suppose we could use emit_move_insn to create a temporary sequence
>> and sum the cost of each individual instruction.  But that's potentially
>> expensive.
>
> No, that complexity is not needed.  For (set (reg) (const_int)) the BE can 
> just return the cost of the expanded sequence because it knows how it will be 
> expanded and how much it will cost.  There's no need to really expand the 
> sequence.
>
> That's the way, e.g. AVR backend works: Shifts/mul/div must be expanded 
> because the hardware does not support them natively.  The rtx_cost for such 
> an expression (which are always interpreted as RHS of a (set (reg) ...)) are 
> the sum over the costs of all insns the expander will produce.

One of my problems with this approach is that the logic that's put
into an expander definition preparation statement (or, in the case of
AVR, the function invoked by the insn output statement) gets
replicated abstractly in rtx_costs: both places have long switch
statements on operand mode and const shift value to determine the
instructions that get emitted (in the former) or how many of them
there are (in the latter).  How likely is it the two are kept
consistent over the years?

I'm working on the (not yet pushed upstream) back-end for the TI
MSP430, which has some historical relationship to AVR from about a
decade ago, and the answer to that question is "not very likely".
I've changed the msp430 back-end so that instead of putting all that
logic in the output statement for the insn, it goes into a preparation
statement for a standard expander.  This way the individual insns that
result in (say) a constant shift of 8 bits using xor and bswap are
available for the optimizer and register allocator to improve.

This works pretty well, but still leaves me with problems when it
comes to computing RTX costs, because there seems to be some strength
reduction optimization for multiplication that's asking for the costs
to shift each integer type by 1 to 15 bits, when in fact no such insn
should ever be produced if real code was being generated.  I think
this is an example of the case Richard's describing.

If, in rtx_costs, I could detect an unexpected insn, deduce the
correct expander function, call it, then recurse on the sequence it
generated, I'd get the right answer---though I'd infinitely prefer not
to be asked to calculate the cost of an unexpected insn.  Doing this
expansion would probably be very expensive, though, and with the side
effects that are part of emit_insn I don't know how to safely call
things that invoke it when what gets emitted isn't part of the actual
stream.

>>
>> Also, any change along these lines is similar to the "tie costs to
>> .md patterns" thing that I mentioned at the end of the message.
>> I don't really have time to work on anything so invasive, so the
>> question is really whether we can sensibly change the costs within
>> the current framework.
>>
>>> E.g. the cost of a CONST_INT is meaningless if you don't know what to do 
>>> with the constant. (set (reg:QI) (const_int 0)) might have 

Re: Re: patch: don't issue -Wreorder warnings when order doesn't matter

2011-07-30 Thread Peter Bigot
2011/7/29 Daniel Marjamäki :
> Hello!
>
>> Why doesn't it matter in this case but it matters when the initializer
>> are non-constant?
>
> It doesn't matter because the program will behave the same no matter
> if the initializations are reordered or not. Logically it will behave
> just as the user expects no matter if he expects reordering or not.
>
> When one initializer is non-constant there might be a dependency
> between the two initializations and the wrong order might be a bug.
> I would like to silence such warnings also, but I don't want to try to
> determine if there is dependencies or not.
>
>
>> If you don't want to fix up your code, just compile it with -Wno-reorder.
>
> I don't want to use -Wno-reorder , because then I won't see the real
> problems. Don't get me wrong, I like this check.
>
>
> When gcc generates noise I think it is better to fix gcc than to fix my code.
>
> The only reason I can think of to keep this noise is for purely
> stylistic reasons. Somebody might think it is a stylistic problem to
> initialize members backwards. But then -Wreorder should also warn
> about common assignments and I doubt anybody wants that.

If the initializer is constant, but the member value that's being
initialized has a
non-trivial constructor with a side effect, your patch will inhibit the warning
but the program will not behave the same as if reordering had not happened.

Peter


Re: [Bug driver/49858] lost ability to use driver for external language support?

2011-07-26 Thread Peter Bigot
On Tue, Jul 26, 2011 at 6:31 PM, joseph at codesourcery dot com
 wrote:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49858
>
> --- Comment #1 from joseph at codesourcery dot com  dot com> 2011-07-26 23:31:27 UTC ---
> On Tue, 26 Jul 2011, bigotp at acm dot org wrote:
>
> > Is there a mechanism by which the driver can be informed of options that it
> > should allow in this situation, given that the list of these options is not
> > known at the time the driver is compiled?
>
> No.  By design there is now a structured notion of options shared by the
> driver and cc1 and a single mechanism for option parsing, that
> consistently rejects unknown options rather than having them sometimes
> only inconsistently rejected depending on which phases of compilation are
> run, and that is intended in future to support multilib selection based on
> logical option state rather than option text, and maybe eventually
> disallowing options without --help text.
>
> It was a mistake that specs were ever documented in the user manual; they
> should be considered a purely internal interface between different parts
> of GCC.  It is not expected that you can drop in support for new languages
> without at least part of that support being present when GCC is built
> (GNU/Linux distributors may include the minimal set of files defining
> specs and options for languages such as D and Pascal when they build their
> main GCC packages, for example, so that the driver then supports those
> languages even though the language compilers themselves are built
> separately).

On one hand, I think a regression like this warrants discussion, as this
removes a feature that has been supported by gcc (and arguably documented as
supported) for many years.  Worst, the people who depend on it are folks who
are unlikely to be involved in gcc development and will not be aware of the
change until after 4.7.0 comes out and their systems have to be
rearchitected to work with new vendor-provided gcc installations.  I only
found it now because I'm updating an out-of-tree back-end (TI msp430) and
most of my real-world applications are in TinyOS and depend on nesC.

On the other hand, I'm sympathetic to the position that specs are a matter
of internal implementation feature and that the change is reasonable.  If
that's going to be the resolution, though, the spec files material should be
moved to the internals documentation.

Peter

> (I do not rule out that the plugin mechanism could be extended in future
> to allow driver plugins, that might modify the unknown option handling,
> but the change away from allowing arbitrary options that just happen to be
> matched by some spec, to having a structured notion allowing a meaningful
> enumeration of all supported options and their associated properties, is
> deliberate.  It would also be reasonable to provide a way to export the
> properties of GCC's options for use in external wrappers to know what
> options take arguments.  But processing options unknown to GCC is outside
> the intended uses of the driver at present, although you may be able to
> find a hack where your wrapper hides them within options known to GCC that
> take an argument but where the driver doesn't validate that argument in
> any way; -fplugin-arg-* might be the most sensible option to use.)
>
> --
> Configure bugmail: http://gcc.gnu.org/bugzilla/userprefs.cgi?tab=email
> --- You are receiving this mail because: ---
> You reported the bug.


libgcc helpers __cmpsi2/__ucmpsi2

2011-04-22 Thread Peter Bigot
I'm updating the machine description for a 16-bit microcontroller, and am
trying to eliminate some hand-coded SImode and DImode insns because in
general gcc generates better code automatically.  So far, I've found no case
where I seem to need to implement cbranchsi4 or cbranchdi4.

However, gcc does generate a reference to __ucmpsi2 when compiling a switch
statement with a default clause and a 32-bit integer as the selection
expression.

The mailing list archives suggest that cmpsi2 was at one point a library
routine, but it is not documented in the internals (at least in the 4.5.x
branch where I'm currently working).  libgcc2 provides a presumably trusted
implementation of __ucmpdi2, though except for a switch statement with a
default clause and a 64-bit selection expression I've been unable to find
any way to generate test code that actually uses it
(gcc.c-torture/execute/cmpdi-1.c doesn't.)  I've never been able to get the
signed version __cmpdi2 to show up.

One past message (http://gcc.gnu.org/ml/gcc/2007-08/msg00225.html) suggests
that a similar evocation of __cmpsi2 might be a bug that arose because there
are "two different sizes of signed integers both of which are larger than
word_mode".  Which is the case in the MSP430.

GCC seems fully capable of reducing SImode and DImode comparisons to HImode
in every other situation.  Is generation of this undocumented helper
function worth treating as a bug in the compiler, or is the documentation
incomplete and back-ends (or libgcc2.c) should be providing an
implementation?  If the former, I'll spend some time trying to isolate it;
if the latter, I won't bother.

Peter


Re: constraints for push byte on word-aligned stack

2011-04-05 Thread Peter Bigot
On Tue, Apr 5, 2011 at 2:35 AM, Andreas Schwab  wrote:
> Peter Bigot  writes:
>
>> I have a target that supports a "push.b x" operation that puts a byte onto
>> the stack but pre-decrements the stack pointer by 2 to maintain alignment.
>
> That looks like the same as what m68k does, see PUSH_ROUNDING.

Thanks.  In my existing machine description, PUSH_ROUNDING is already
set.  m68k is one of the three machine descriptions that explicitly
uses pushm1 instructions rather than mov expanders, regardless of
PUSH_ROUNDING.  Perhaps that's the magic; in fact, pushqi1 seems to be
the only the only variant ever implemented.

It's not that push isn't working, it's that I'm surprised that the
tree-to-rtl translation generates a RTL expression that passes
push_operand() but not the "<" constraint, but only in a case where
PUSH_ROUNDING has an effect.  I'll try to look into pushqi1 sometime,
but for now using =X seems to work fine.

Peter


constraints for push byte on word-aligned stack

2011-04-04 Thread Peter Bigot
I have a target that supports a "push.b x" operation that puts a byte onto
the stack but pre-decrements the stack pointer by 2 to maintain alignment.

I have a machine description that includes these two defines:

(define_insn "*pushqi_pre_mod"
[(set (mem:QI (pre_modify:HI (reg:HI 1)
    (plus:HI (reg:HI 1) (const_int -2
    (match_operand:QI 0 "general_operand" "rim"))]
""
"* return msp430_pushqi(insn, operands);"
[(set_attr "length" "2")])

(define_insn "*pushqi"
  [(set (match_operand:QI 1 "push_operand" "=<")
    (match_operand:QI 0 "general_operand" "rim"))]
  ""
  "* return msp430_pushqi(insn, operands);"
  [(set_attr "length" "2")])

I'd hoped to clean this up by removing the apparently redundant first
definition.

I have a test program which normally works fine, but if I comment out the
first define produces:

movqi.c: In function ‘pushqi_r’:
movqi.c:33:1: error: unable to generate reloads for:
(insn 6 3 7 2 movqi.c:33 (set (mem/i:QI (pre_modify:HI (reg/f:HI 1 r1)
    (plus:HI (reg/f:HI 1 r1)
    (const_int -2 [0xfffe]))) [0 S1 A8])
    (reg:QI 15 r15 [ p ])) 13 {*pushqi} (expr_list:REG_DEAD
(reg:QI 15 r15 [ p ])
    (nil)))
movqi.c:33:1: internal compiler error: in find_reloads, at reload.c:3821

Instrumentation shows that push_operand succeeds, but the '<' constraint
fails, as the mem operand with pre_modify does not pass that test (nor does
it pass 'g' or 'm'; only 'X').

I've discovered HAVE_PRE_MODIFY_DISP which, based on its one-sentence
description, appears to be relevant as I'm changing the stack pointer by 2
though the operand size is 1.  But it seems to have no effect, nor does
HAVE_PRE_MODIFY_REG.

Is there a better alternative to make this work than either leaving the
extra instruction in, or replacing the "=<" constraint with "=X"?  Does the
latter have consequences that I should expect to regret?

Peter