Re: Target attribute hooks questions

2015-05-05 Thread Christian Bruel

Hi Kyrill,

You are right, it's not easy to find one's way among all those macros; my
main source of inspiration for ARM was the x86 implementation.

You can have a look at the ARM implementation to start with (on
gcc-patches, under review). It would be best not to diverge too much,
since aarch64 might have some code to share with the arm back end. FYI,
I'm planning to add the fpu/neon attribute extensions.

A few quick answers below; ask if you need more.

Cheers

Christian

On 05/05/2015 03:38 PM, Kyrill Tkachov wrote:
> Hi all,
> 
> I'm looking at implementing target attributes for aarch64 and I have some 
> questions about the hooks involved.
> I haven't looked at this part of the compiler before, so forgive me if some 
> of them seem obvious. I couldn't
> figure it out from the documentation 
> (https://gcc.gnu.org/onlinedocs/gccint/Target-Attributes.html#Target-Attributes)
> 
> * Seems to me that TARGET_OPTION_VALID_ATTRIBUTE_P is the most important one 
> that parses
> the string inside the __attribute__ ((target ("..."))) and sets the 
> target-specific
> flags appropriately. Is that correct?

Yes, it parses the string that goes into DECL_FUNCTION_SPECIFIC_TARGET
(fndecl) and then builds the struct gcc_options that will be switched
between functions. Note that this one must go through the
option_override machinery again, since global options can be affected by
the target options.
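
As a rough illustration of the parsing step described above, here is a
minimal standalone sketch in C. It is not GCC code: struct mode_opts,
parse_target_string and the accepted tokens ("thumb", "arm") are
hypothetical names chosen for the example, and the real hook works on
trees and struct gcc_options rather than on a plain string.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-function option state; it stands in for the
   target-specific part of struct gcc_options.  */
struct mode_opts {
    bool thumb;
};

/* Parse a comma-separated attribute string such as "thumb" or "arm".
   Returns true if every token was recognized, false otherwise.
   On failure *opts is left unchanged.  */
static bool parse_target_string(const char *str, struct mode_opts *opts)
{
    struct mode_opts tmp = *opts;
    const char *p = str;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");

        if (len == 5 && strncmp(p, "thumb", 5) == 0)
            tmp.thumb = true;
        else if (len == 3 && strncmp(p, "arm", 3) == 0)
            tmp.thumb = false;
        else
            return false;   /* unknown token: reject the attribute */

        p += len;
        if (*p == ',')
            p++;            /* skip the separator */
    }

    *opts = tmp;            /* commit only when the whole string is valid */
    return true;
}
```

Working on a temporary copy and committing at the end mirrors the idea
that a rejected attribute should not leave half-applied options behind.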

> 
> * What is TARGET_ATTRIBUTE_TABLE used for? It's supposed to map attributes to 
> handlers?
> Isn't that what TARGET_OPTION_VALID_ATTRIBUTE_P is for?

I think it's different. TARGET_ATTRIBUTE_TABLE describes specific
attributes (e.g. naked, interrupt, ...), while the target attribute allows
passing target flags (e.g. -marm, -mfpu=neon, ...).

> 
> * What is the use of TARGET_OPTION_SAVE and TARGET_OPTION_RESTORE? Is that 
> used during
>   something like LTO when different object files and functions are compiled 
> with different
> flags? Are these functions just supposed to 'backup' various tuning and ISA 
> decisions?
> 

This is to save custom function information that is not restored by
TARGET_SET_CURRENT_FUNCTION. I didn't need it for arm/thumb.


> * Is TARGET_COMP_TYPE_ATTRIBUTES the one that's supposed to handle 
> incompatible attributes
> being specified? (for example incompatible endianness or architecture levels)?

Like TARGET_ATTRIBUTE_TABLE, it is a different mechanism and doesn't
pertain to the target attribute.

Cheers

Christian

> 
> Thanks for any insight,
> Kyrill
> 



Re: interest for ARM/thumb multiversionning ?

2015-04-30 Thread Christian Bruel
To clarify, my use case is slightly different from the x86 one, which
requires a runtime CPU-check builtin. I was more focused on a link-time
problem (so we don't even need to go through a function pointer).

Christian

On 04/30/2015 08:45 AM, Christian Bruel wrote:
> 
> 
> On 04/29/2015 05:36 PM, Ramana Radhakrishnan wrote:
>>
>>
>> On 29/04/2015 09:24, Christian Bruel wrote:
>>> Hi Ramana, Richard
>>>
>>> After playing with the attritute ((target ("[thumb,arm]")), during the
>>> pending review, I added the "default" selector to neutralize
>>> -mflip-thumb for the setjmp/longjmp based tests.
>>>
>>> I was wondering it there would be an interest leverage on this to
>>> implement multiprocessing, like on the x86 ?
>>>
>>
>> You mean multiversioning ? How would the dispatcher work in this case ?
> 
> not sure what you mean, the fonction's name will need to be mangled with
> the target specialization. The dispatching would be made based on the
> caller mode.
> 
> Could it  be also a direction to help LTOization with the proper FPU
> flags (follow bz target/65837) given at link time.
> 
> My concern is that this is limited to C++ for x86. I haven't checked the
> details, just ideas.
> 
> Cheers
> 
> Christian
> 
>> Ramana
>>
>>> something that would allow (from the x86 doc)
>>>
>>> __attribute__ ((target ("default")))
>>> int foo ()
>>> {
>>>asm("...");
>>> return 0;
>>> }
>>>
>>> __attribute__ ((target ("thumb")))
>>> int foo ()
>>> {
>>>asm("...");
>>> }
>>>
>>> int main ()
>>> {
>>> int (*p)() = &foo;
>>> assert ((*p) () == foo ());
>>> return 0;
>>> }
>>>
>>> I had initially not planned to do it, but this is a simple extension of
>>> the attribute target, if someone find a use for this I can implement it
>>> on the fly.
>>>
>>> Best Regards,
>>>
>>> Christian
>>>


Re: interest for ARM/thumb multiversionning ?

2015-04-29 Thread Christian Bruel


On 04/29/2015 05:36 PM, Ramana Radhakrishnan wrote:
> 
> 
> On 29/04/2015 09:24, Christian Bruel wrote:
>> Hi Ramana, Richard
>>
>> After playing with the attritute ((target ("[thumb,arm]")), during the
>> pending review, I added the "default" selector to neutralize
>> -mflip-thumb for the setjmp/longjmp based tests.
>>
>> I was wondering it there would be an interest leverage on this to
>> implement multiprocessing, like on the x86 ?
>>
> 
> You mean multiversioning ? How would the dispatcher work in this case ?

Not sure what you mean; the function's name will need to be mangled with
the target specialization. The dispatching would be done based on the
caller's mode.

Could it also be a direction to help LTO with the proper FPU flags
(following bz target/65837) given at link time?

My concern is that this is limited to C++ on x86. I haven't checked the
details; these are just ideas.

Cheers

Christian

> Ramana
> 
>> something that would allow (from the x86 doc)
>>
>> __attribute__ ((target ("default")))
>> int foo ()
>> {
>>asm("...");
>> return 0;
>> }
>>
>> __attribute__ ((target ("thumb")))
>> int foo ()
>> {
>>asm("...");
>> }
>>
>> int main ()
>> {
>> int (*p)() = &foo;
>> assert ((*p) () == foo ());
>> return 0;
>> }
>>
>> I had initially not planned to do it, but this is a simple extension of
>> the attribute target, if someone find a use for this I can implement it
>> on the fly.
>>
>> Best Regards,
>>
>> Christian
>>


interest for ARM/thumb multiversionning ?

2015-04-29 Thread Christian Bruel
Hi Ramana, Richard

After playing with the attribute ((target ("[thumb,arm]")), during the
pending review, I added the "default" selector to neutralize
-mflip-thumb for the setjmp/longjmp based tests.

I was wondering if there would be interest in leveraging this to
implement multiprocessing, like on x86?

something that would allow (from the x86 doc)

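#include <assert.h>

__attribute__ ((target ("default")))
int foo ()
{
  asm("...");   /* placeholder for the default-mode code */
  return 0;
}

__attribute__ ((target ("thumb")))
int foo ()
{
  asm("...");   /* placeholder for the thumb-mode code */
  return 0;
}

int main ()
{
  int (*p)() = &foo;
  assert ((*p) () == foo ());
  return 0;
}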

I had not initially planned to do it, but this is a simple extension of
the target attribute; if someone finds a use for it, I can implement it
on the fly.

Best Regards,

Christian


Re: GCC 5 Status Report (2015-01-08), Stage 4 to start soon

2015-01-09 Thread Christian Bruel

Hi Ramana,

Any chance of getting the target attribute support for ARM reviewed in
time for stage 4?


Many thanks

Christian

On 01/08/2015 11:32 AM, Jakub Jelinek wrote:

The trunk is still in Stage 3 now, which means it is open for general
bugfixing, but will enter Stage 4 on Friday, 16th, end of day (timezone
of your preference).  Once that happens, only wrong-code fixes, regression
bugfixes and documentation fixes will be allowed, as is normal for
our release branches too.

There are still a few patches that have been posted during Stage 1,
please get them committed into trunk before Stage 4 starts.

Still misleading quality data below - some P3 bugs have not been
re-prioritized.

Quality Data


Priority          #   Change from last report
--------        ---   -----------------------
P1               39   +  24
P2               98   +  15
P3               48   -  84
--------        ---   -----------------------
Total           185   -  45


Previous Report
===

https://gcc.gnu.org/ml/gcc/2014-11/msg00249.html



Re: MULTILIB_OPTIONS and DRIVER_SELF_SPEC

2012-05-21 Thread Christian Bruel


On 05/11/2012 03:16 PM, Paulo J. Matos wrote:
> Hi,
> 
> MULTILIB_OPTIONS containing options defined in DRIVER_SELF_SPEC seemed 
> to be fine in GCC46 but fail in GCC47.
> 
> For example, I have:
> xap.h:
> #define DRIVER_SELF_SPECS   \
>  "%{help:-v} %  "%{mno-args-span-regs-and-mem:-mno-split-args} 
> %  "%{mno-inline-block-copy-mode:-mno-block-copy} 
> %  "%{mpu:-mno-block-copy -mfunction-ptr-pi} % 
> t-xap:
> MULTILIB_OPTIONS= msmall-mode/mpu
> 
> However, while building GCC I get that xgcc cannot understand -mpu:
> Running configure in multilib subdir mpu
> pwd: 
> /home/pm18/p4ws/pm18_binutils/bc/main/result/linux/intermediate/FirmwareGcc47Package/xap-local-xap
> mkdir mpu
> configure: creating cache ./config.cache
> checking build system type... i686-pc-linux-gnu
> checking host system type... xap-local-xap
> checking for --enable-version-specific-runtime-libs... no
> checking for a BSD-compatible install... /usr/bin/install -c
> checking for gawk... gawk
> xgcc: error: unrecognized command line option '-mpu'
> 
> 
> What happened in GCC47 for this to occur?

Options that are not explicitly declared to the compiler before being
used in a spec rule are now rejected. So you probably need to declare it
in your target option file (something like xap.opt).

Cheers

Christian
> 
> Cheers,
> 


Re: gcc doesn't accept specs options anymore

2012-05-07 Thread Christian Bruel


On 05/07/2012 03:11 PM, Christian Bruel wrote:
> 
> 
>> What about a generic name such as -fextension- (or both -fextension- and 
>> -mextension-) for options that GCC itself will ignore, if -mbsp= is 
>> considered inappropriate?  I'd prefer that to delimiting such options with 
>> --start-specs and --end-specs.
>>
> 
> you mean, gcc would ignore options in the -fextension string ?. For
> instance an invocation would be
> 
> gcc -spec=board.spec -foo -fextension-foo ?

As a matter of fact,

gcc -specs=board.spec -fextension-foo

could be enough.



> 
> instead of
> 
> gcc -spec=board.spec --start-specs -foo --end-specs
> 
> OK, both allow to fix the problem, with a minor backward compatibility
> for our BSP integrator that could be handled easily.
> 
> If agreement, are you going to propose something ?, or do you want to
> wait me to propose a patch ?
> 
> Many thanks,
> 
> Christian


Re: gcc doesn't accept specs options anymore

2012-05-07 Thread Christian Bruel


> What about a generic name such as -fextension- (or both -fextension- and 
> -mextension-) for options that GCC itself will ignore, if -mbsp= is 
> considered inappropriate?  I'd prefer that to delimiting such options with 
> --start-specs and --end-specs.
> 

You mean gcc would ignore options in the -fextension- string? For
instance, an invocation would be

gcc -specs=board.spec -foo -fextension-foo ?

instead of

gcc -specs=board.spec --start-specs -foo --end-specs

OK, both would fix the problem, with a minor backward-compatibility
issue for our BSP integrators that could be handled easily.

If we agree, are you going to propose something, or do you want to
wait for me to propose a patch?

Many thanks,

Christian


Re: gcc doesn't accept specs options anymore

2012-05-07 Thread Christian Bruel
> I think http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49858 is
> essentially this issue.  It can probably be closed as "won't fix",
> though I notice the spec file format is still documented in the user
> manual.
> 
> Peter
> 

Yes, same root problem, although BSP design is a different usage (yet
quite common). I wouldn't be in favor of moving all the spec support into
the GCC internals if this deprecates the ‘-specs=’ user option.

many thanks

Christian


Re: gcc doesn't accept specs options anymore

2012-05-07 Thread Christian Bruel


On 05/07/2012 12:09 PM, Joseph S. Myers wrote:
> On Mon, 7 May 2012, Christian Bruel wrote:
> 
>> Making the driver aware about all possible user defined options seems
>> unpredictable. Was there any justification on removing this
>> functionality or did I miss a point with the EXTRA_SPECS ?
> 
> There are several motivations behind requiring all options to be defined 
> in .opt files, including:
> 
> * For multilibs to be selected based on the semantics of options, using 
> values set in gcc_options structures by the same code as in cc1, rather 
> than by textual matching attempting to replicate semantics, the driver 
> needs to understand the semantics of options as similarly as possible to 
> cc1, rather than treating any options purely textually.
> 
> * Every option supported by the compiler should be listed in --help (and 
> if the missing help information were all filled it, we could then make it 
> a build failure to have an option without help information).

True, but this removes the flexibility for a user or a BSP maintainer to
define new options (e.g. for the linker, not the compiler) using a
-specs= file, without access to the compiler sources.

> 
> * Structured option information enables consistency in how options are 
> processed and errors given for unknown options or arguments.
> 
> * It would be useful for the compiler to be able to export structured 
> information about all its options for use by tools such as IDEs.

If the option is only supported by a BSP, and not by the compiler, I
don't see how the compiler could report it, since it doesn't depend on
static information known at build time.
One direction would be to add this information to the user spec rules:

*ldruntime:
+ %{foo: -lfoo} %{help: "describe foo "}

I'm not aware of any such machinery. Maybe an idea for improvement?

> 
> There is certainly room for more extensibility in option handling - 
> ideally there would not be one big enum with OPT_* values for all options 
> and one header with all the macros and structures, but instead front-end 
> and back-end options would use some form of separate namespace for their 
> options so the language- and target-independent compiler doesn't see the 
> options for other parts of the compiler; that fits into the modularity 
> theme that ideally it would be possible to build multiple front ends and 
> back ends into the compiler at once, or to build front ends and back ends 
> separately from the compiler.  But defining options through use in specs 
> wouldn't be part of that; rather more structured information about each 
> option would need to be provided somehow by a separately built front end 
> or back end.
> 
> If you want -m options to select arbitrary board support packages (and the 
> existing ability to use -T to name a linker script isn't sufficient), then 
> a -mbsp= option, whose argument is not interpreted by GCC but may be 
> processed by whatever specs you are adding after GCC is installed, would 
> seem better than lots of separate -m options.

I don't like this -mbsp= alternative much; it seems confusing, inelegant,
and not general enough for other uses (it could be a runtime
customization, not a BSP).
What about delimiters, something like --start-specs ... --end-specs?

> 
> As for options in specs included with GCC: they are all meant to be in the 
> .opt files.  I went through all the specs in all the config/ headers in 
> GCC and added options found to .opt files before disallowing options not 
> included in .opt files, but as there are about 500 such headers it's quite 
> possible I missed some specs-defined options in the process.

Yes, it looks OK for the GCC specs; the problem is with the user spec
files. This is a new legacy issue, and I thought it was worth reporting
it to see whether it needs to be, and can be, fixed.

many thanks

Christian


> 


gcc doesn't accept specs options anymore

2012-05-07 Thread Christian Bruel
Hello,

There are a few EXTRA_SPECS rules that are used to customize target runtime
support. For instance, "ldruntime" is used on superh for board
configurations and to dynamically support different runtime behaviors.

Here is an illustration of this use with a silly reduced spec:

*ldruntime:
+ %{mfoo: -lfoo}

The same kind of example can be found in the x86 cc1_cpu spec rules.

However since revision:

r171307 | jsm28 | 2011-03-22 23:19:01 +0100 (Tue, 22 Mar 2011) | 5 lines

* gcc.c (driver_unknown_option_callback): Only permit and save
unknown -Wno- options.
(driver_wrong_lang_callback): Save options directly instead of via
driver_unknown_option_callback.


using a spec-defined option results in a driver error:

gcc: error: unrecognized command line option '-mfoo'

Making the driver aware of all possible user-defined options seems
unpredictable. Was there any justification for removing this
functionality, or did I miss a point with EXTRA_SPECS?

Any thoughts?

Thanks a lot,

Christian




Re: Discussion: What is unspec_volatile?

2010-11-15 Thread Christian Bruel

On 11/13/2010 08:40 PM, Peter Bergner wrote:

On Sat, 2010-11-13 at 11:27 +0100, Paolo Bonzini wrote:

On 11/12/2010 03:25 PM, H.J. Lu wrote:

IRA may move instructions across an unspec_volatile,


Do you have a testcase?


Are you sure it's IRA and not our old friend update_equiv_regs()
which IRA calls?  http://gcc.gnu.org/PR41171 shows an example
where update_equiv_regs() moves code around.

Peter



I'm having a similar issue on SH4. The machine description inserts
an unspec_volatile when generating a PIC access to the stack_chk_guard
symbol, to prevent the insns from being combined into a mem (R0,Rx)
addressing mode, which leads to an "unable to find a register to spill
in class 'R0_REGS'" spill failure.


Before IRA, the simplified RTL sequence looked like this:

(insn 33 32 34 5 (set (reg:SI 175)
(plus:SI (reg/f:SI 174)
(reg:SI 12 r12)))

(insn 34 33 35 5 (unspec_volatile [
(const_int 0 [0])
] 0)

(insn 35 34 36 5 (set (reg/f:SI 173)
(mem/u/c:SI (reg:SI 175) [0 S4 A32]))

Then during IRA :

(insn 35 56 55 5 (set (reg/f:SI 1 r1 [173])
(mem/u/c:SI (plus:SI (reg/f:SI 1 r1 [174])
(reg:SI 12 r12)) [0 S4 A32]))

So insn 33 has been moved across the unspec_volatile by 
'update_equiv_regs'.


So, back to the original question: is unspec_volatile expected to
prevent this?


The attached patch, conservative and purely illustrative, fixed my
problem, but it clearly needs to be refined because it also prevents the
combination of insns that are not affected by the blockage. I also
suspect that there are other places in the compiler where instructions
could be combined without checking for the unspec_volatile.



Index: ira.c
===
--- ira.c   (revision 166230)
+++ ira.c   (working copy)
@@ -2304,6 +2304,16 @@
         only mark all destinations as having no known equivalence.  */
       if (set == 0)
        {
+         if (GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE)
+           {
+             int i;
+             /* UNSPEC_VOLATILE is considered to use and clobber all hard
+                registers and all of memory.  This blocks insns from being
+                combined across this point.  */
+             for (i = FIRST_PSEUDO_REGISTER; i < reg_equiv_init_size; i++)
+               reg_equiv[i].replace = 0;
+           }
+
          note_stores (PATTERN (insn), no_equiv, NULL);
          continue;
        }


Re: SH optimized software floating point routines

2010-07-23 Thread Christian Bruel

Joern Rennecke wrote:

Quoting Christian Bruel :


Using the ieee-sf.S + this patch
OK


Is this only a proof-of-concept, because you only change the ne[sd]f2  
implementation?  


I also changed the unordered comparison patterns (cmpunsf_i1,
cmpundf_i1). But yes, the other functions that would need the same kind
of check are unordsf2 and all the comparisons (gtsf2, gesf2f...)
for floats and doubles.
But I will only consider those after (and if) we all agree that this
needs to be done instead of keeping the current QNaN-only restriction.


And you go out of your way to only accept a restricted
set of values.  


This holds for the original optimized implementation as well; for example,
I don't think that 0x7f81 was caught. In fact, implementing the isnan
check correctly, without a restricted set of values, makes the original
discussion pointless, since the Q/S encodings are only a subset of all
possible NaN codings, which are those with any fractional part != 0.
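
To make the point concrete, here is a small C sketch of a NaN test on
the raw single-precision bit pattern. It assumes the IEEE 754 binary32
layout (1 sign bit, 8 exponent bits, 23 fraction bits); the helper name
is_nan_bits is made up for the illustration. It deliberately does not
look at the quiet/signaling bit at all.

```c
#include <stdint.h>

/* Return 1 if BITS encodes a NaN in IEEE 754 binary32, 0 otherwise.
   A NaN has an all-ones exponent and any nonzero fraction; whether a
   given fraction bit means "quiet" or "signaling" is irrelevant here.  */
static int is_nan_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xffu;
    uint32_t fraction = bits & 0x007fffffu;

    return exponent == 0xffu && fraction != 0;
}
```

For example, 0x7fc00000 and 0x7f800001 are both reported as NaNs, while
the infinities 0x7f800000 and 0xff800000 are not.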


Plus, the overuse of the arithmetic unit hurts SH4-100 / SH4-200
instruction pairing.

AFAICT you need only one cycle penalty, in the check_nan path:

GLOBAL(nesf2):
        /* If the raw values are unequal, the result is unequal, unless
           both values are +-zero.
           If the raw values are equal, the result is equal, unless
           the values are NaN.  */
        cmp/eq  r4,r5
        mov.l   LOCAL(inf2),r1
        bt/s    LOCAL(check_nan)
        mov     r4,r0
        or      r5,r0
        rts
        add     r0,r0
LOCAL(check_nan):
        add     r0,r0
        cmp/hi  r1,r0
        rts
        movt    r0
        .balign 4
LOCAL(inf2):
        .long 0xff000000

You could even save four bytes by putting the check_nan label into the
delay slot, but I'm not sure if that'll discomfit any branch  
prediction mechanism.


Thanks a lot for this one. It should fix the original problem on the
restricted set of values as well. The fix for the cmpund patterns should
probably have similar checks.
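
For readers who do not speak SH assembly, the logic of the nesf2
routine quoted above can be modelled in C roughly as follows. This is
only a sketch of the technique, not code from the patch: the function
name nesf2_model is invented, and the constant is the single-precision
infinity pattern 0x7f800000 shifted left by one, which is what the
comparison against the doubled value requires.

```c
#include <stdint.h>

/* C model of the "not equal" test on raw binary32 bit patterns.
   Returns nonzero when A and B compare unequal.  */
static uint32_t nesf2_model(uint32_t a, uint32_t b)
{
    const uint32_t inf2 = 0xff000000u;  /* 0x7f800000 doubled */

    if (a == b)
        /* Equal raw values: unequal only when the value is a NaN.
           Doubling drops the sign bit; anything above inf2 has an
           all-ones exponent and a nonzero fraction.  */
        return (uint32_t)(a << 1) > inf2;

    /* Different raw values: equal only when both are +-0, i.e. when
       nothing but the sign bits is set.  */
    return (uint32_t)((a | b) << 1);
}
```

Because the NaN test is an unsigned "greater than doubled infinity"
comparison, it covers every NaN encoding without looking at the Q/S bit.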




Disclaimer: I've not tested this code.

For the DFmode case, what about NaNs denoted by the low word, e.g.
0x7ff0 1 ?

If so, the DFmode code could become something like this:

GLOBAL(nedf2):
        cmp/eq  DBL0L,DBL1L
        mov.l   LOCAL(inf2),r1
        bf      LOCAL(ne)
        cmp/eq  DBL0H,DBL1H
        bt/s    LOCAL(check_nan)
        mov     DBL0H,r0
        or      DBL1H,r0
        add     r0,r0
        rts
        or      DBL0L,r0
LOCAL(check_nan):
        tst     DBL0L,DBL0L
        add     r0,r0
        subc    r1,r0
        mov     #-1,r0
        rts
        negc    r0,r0
LOCAL(ne):
        rts
        mov     #1,r0
        .balign 4
LOCAL(inf2):
        .long 0xffe00000
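
The double-precision variant above follows the same idea, with the low
word folded into the NaN test. A hedged C model, assuming the IEEE 754
binary64 layout and using the invented name nedf2_model, could look like
this; the constant is the high word of the infinity pattern (0x7ff00000)
shifted left by one.

```c
#include <stdint.h>

/* C model of the "not equal" test on raw binary64 bit patterns.
   Returns nonzero when A and B compare unequal.  */
static int nedf2_model(uint64_t a, uint64_t b)
{
    const uint32_t inf2 = 0xffe00000u;  /* 0x7ff00000 doubled */
    uint32_t ah = (uint32_t)(a >> 32), al = (uint32_t)a;
    uint32_t bh = (uint32_t)(b >> 32), bl = (uint32_t)b;

    if (al != bl)
        return 1;                       /* low words differ: unequal */

    if (ah != bh)
        /* Equal only when both are +-0: no bits besides the signs.  */
        return ((uint32_t)((ah | bh) << 1) | al) != 0;

    /* Identical raw values: unequal only for a NaN, i.e. the doubled
       high word is above inf2, or equal to it with a nonzero low word.  */
    uint32_t h2 = (uint32_t)(ah << 1);
    return h2 > inf2 || (h2 == inf2 && al != 0);
}
```

The last case is the one raised in the question about NaNs denoted only
by the low word: the high word alone looks like an infinity, and the
nonzero low word is what makes it a NaN.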


> For an actual patch, you need to use the SL* macros from
> config/sh/lib1funcs.h because the SH1 does not have delayed branches.

OK, thanks



Re: SH optimized software floating point routines

2010-07-22 Thread Christian Bruel

Oops, resending with a small typo fix (a branch became delayed :-().

In case we accept that SNaNs and QNaNs are not mutually exclusive and
that the assembly should mimic the C model, here is a synthetic
illustrative test case:


Compile with
sh-superh-elf-gcc -O2 -mieee -m4-nofpu snan.c snan2.c -g -o l.u ; 
sh-superh-elf-run l.u ; echo $?


Original 4.6 fp-bit C model:
OK

Using the ieee-sf.S implementation:
FAIL

Using the ieee-sf.S + this patch
OK

same for sh4-linux.

Best Regards,

Christian



Christian Bruel wrote:

Christian Bruel wrote:

Hi Kaz,

Kaz Kojima wrote:


BTW, it looks that softfp __unord?f2 routines check signaling NaNs
only.  This makes __builtin_isnan return false for quiet NaNs for
which current fp-bit ones return true when -mieee enabled.  Perhaps
that change of behavior might be OK for software FP.
I use the attached patch to handle the QNaNs in the assembly soft-fp.
It needs to be updated for trunk (and the dates in the changelogs
updated). Will do.


Edited to apply on top of Joern's latest patch. Certainly not optimal,
but it fixes the QNaN checks for builtins and inlined unordered
comparisons for -mieee or -fno-finite-math-only.


Best Regards

Christian



diff '--exclude=.svn' '--exclude=*.rej' '--exclude=*~' -ubrN 
gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-df.S 
gnu_trunk/gcc/gcc/config/sh/ieee-754-df.S
--- gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-df.S   2010-07-21 
18:04:17.94995 +0200
+++ gnu_trunk/gcc/gcc/config/sh/ieee-754-df.S   2010-07-21 18:09:10.602376000 
+0200
@@ -92,11 +92,12 @@
HIDDEN_FUNC(GLOBAL(nedf2))
 GLOBAL(nedf2):
cmp/eq  DBL0L,DBL1L
-   mov.l   LOCAL(c_DF_NAN_MASK),r1
-   bf LOCAL(ne)
+   bf.s    LOCAL(ne)
+   mov #1,r0
cmp/eq  DBL0H,DBL1H
+   mov.l   LOCAL(c_DF_NAN_MASK),r1
+   bt.s    LOCAL(check_nan)
not DBL0H,r0
-   bt  LOCAL(check_nan)
mov DBL0H,r0
or  DBL1H,r0
add r0,r0
@@ -104,11 +105,17 @@
or  DBL0L,r0
 LOCAL(check_nan):
tst r1,r0
-   rts
+   bt.s    LOCAL(nan)
+   mov #12,r2
+   shll16  r2
+   xor r2,r1
+   tst r1,r0
+LOCAL(nan):
movt    r0
 LOCAL(ne):
rts
-   mov #1,r0
+   nop
+   
.balign 4
 LOCAL(c_DF_NAN_MASK):
.long DF_NAN_MASK
diff '--exclude=.svn' '--exclude=*.rej' '--exclude=*~' -ubrN 
gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-sf.S 
gnu_trunk/gcc/gcc/config/sh/ieee-754-sf.S
--- gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-sf.S   2010-07-22 
14:21:50.606831000 +0200
+++ gnu_trunk/gcc/gcc/config/sh/ieee-754-sf.S   2010-07-22 15:30:17.928097000 
+0200
@@ -58,6 +58,12 @@
add r0,r0
 LOCAL(check_nan):
tst r1,r0
+   bt.s    LOCAL(nan)
+   mov #96,r2
+   shll16  r2
+   xor r2,r1
+   tst r1,r0
+ LOCAL(nan):
rts
movt    r0
.balign 4
diff '--exclude=.svn' '--exclude=*.rej' '--exclude=*~' -ubrN 
gnu_trunk.ref/gcc/gcc/config/sh/sh.md gnu_trunk/gcc/gcc/config/sh/sh.md
--- gnu_trunk.ref/gcc/gcc/config/sh/sh.md   2010-07-21 18:06:25.978547000 
+0200
+++ gnu_trunk/gcc/gcc/config/sh/sh.md   2010-07-22 09:13:12.599669000 +0200
@@ -10262,6 +10262,7 @@
(clobber (reg:SI T_REG))
(clobber (reg:SI PR_REG))
(clobber (reg:SI R1_REG))
+   (clobber (reg:SI R2_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
   "TARGET_SH1 && ! TARGET_SH2E"
   "jsr @%1%#"
@@ -10337,13 +10338,18 @@
 
 (define_insn "cmpunsf_i1"
   [(set (reg:SI T_REG)
-   (unordered:SI (match_operand:SF 0 "arith_reg_operand" "r,r")
- (match_operand:SF 1 "arith_reg_operand" "r,r")))
-   (use (match_operand:SI 2 "arith_reg_operand" "r,r"))
-   (clobber (match_scratch:SI 3 "=0,&r"))]
+   (unordered:SI (match_operand:SF 0 "arith_reg_operand" "r")
+ (match_operand:SF 1 "arith_reg_operand" "r")))
+ (use (match_operand:SI 2 "arith_reg_operand" "r"))
+ (clobber (match_scratch:SI 3 "=&r"))]
   "TARGET_SH1 && ! TARGET_SH2E"
-  "not\t%0,%3\;tst\t%2,%3\;not\t%1,%3\;bt\t0f\;tst\t%2,%3\;0:"
-  [(set_attr "length" "10")])
+"not\t%0,%3\;tst\t%2,%3\;bt.s\t0f
+\tnot\t%1,%3\;tst\t%2,%3\;bt.s\t0f
+\tmov\t#96,%3\;shll16\t%3\;xor\t%3,%2
+\tnot\t%0,%3\;tst\t%2,%3\;bt.s\t0f
+\tnot\t%1,%3\;tst\t%2,%3
+ 0:"
+[(set_attr "length" "28")])
 
 ;; ??? This is a lot of code with a lot of branches; a library function
 ;; might be better.
@@ -11069,6 +11075,7 @@
(clobber (reg:SI T_REG))
(clobber (reg:SI PR_REG))
(clobber (reg:SI R1_REG))
+   

Re: SH optimized software floating point routines

2010-07-22 Thread Christian Bruel

Joern Rennecke wrote:

Quoting Christian Bruel :


Edited to apply on top of latest Joern's patch. Certainly not optimal
but it fixes the QNaNs checks for builtins and inlined unordered
comparisons for -mieee or -fno-finite-math-only.


You are still on the wrong track; as I said in my earlier message, we
should not emit the library call for SH4 in the first place.


>

Please try the attached patch instead.



Hello, sorry for the mails that crossed.

I think we are dealing with two different problems here that have the
same root. The original one was about the undefined __unorddf2/__unordsf2
regression, for which you said that the library functions should not be
called. I agree, and my patch is not exclusive with yours in this regard.


I was dealing with functional issues in the SNaN bit checking in the
cmpun_ patterns (in addition to the floating-point comparison
functions), which are exposed by the regression test that I provided (for
-m4-nofpu -mieee).


About the other part of your answer, not supporting SNaNs in
fp-bit.c: it is a possibility that I didn't consider in my fix. This
restriction is quite a surprise to me because it is not what I infer
from the implementation of fp-bit.c's isnan function, which checks for
both CLASS_SNAN and CLASS_QNAN.


See for example the result of

static int misnanf(float v)
{
  return (v != v);
}

called with either a QNaN or a SNaN. IMO the assembly model should have
the same semantics as the C model, which is not the case today.
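
A way to try this out on a host with IEEE 754 single-precision floats is
sketched below. The helper float_from_bits is an addition for the
example (it is not part of the original test), and the bit patterns are
just two NaN encodings with different top fraction bits; which of them
counts as quiet or signaling depends on the target convention.

```c
#include <stdint.h>
#include <string.h>

static int misnanf(float v)
{
    return (v != v);
}

/* Reinterpret a raw 32-bit pattern as a float.  memcpy avoids the
   aliasing problems of a pointer cast.  Assumes float is 32 bits.  */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With default floating-point options, misnanf (float_from_bits
(0x7fc00000)) and misnanf (float_from_bits (0x7f800001)) should both
return 1, since a NaN of either kind never compares equal to itself.
Options such as -ffast-math allow the compiler to fold the comparison
away, so the check is only meaningful without them.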


Using -fsignaling-nans and possibly putting #ifdef __SUPPORT_SNAN__
around the check doesn't change anything, since the same call is made
to the floating-point comparison function, which really needs to check
for both formats. If you are concerned about the extra cycles needed in
the nesf2f implementation (which are nothing anyway compared to the C
model), we could certainly provide a specialized one just for
-fsignaling-nans.


Best Regards

Christian


Re: SH optimized software floating point routines

2010-07-22 Thread Christian Bruel

Christian Bruel wrote:

Hi Kaz,

Kaz Kojima wrote:



BTW, it looks that softfp __unord?f2 routines check signaling NaNs
only.  This makes __builtin_isnan return false for quiet NaNs for
which current fp-bit ones return true when -mieee enabled.  Perhaps
that change of behavior might be OK for software FP.


I use the attached patch to handle the QNaNs in the assembly soft-fp.
It needs to be updated for trunk (and the dates in the changelogs
updated). Will do.


Edited to apply on top of Joern's latest patch. Certainly not optimal,
but it fixes the QNaN checks for builtins and inlined unordered
comparisons for -mieee or -fno-finite-math-only.


Best Regards

Christian


2010-07-22  Christian Bruel  

* gcc.dg/builtins-nan.c: New test.

2010-07-22  Christian Bruel  

* config/sh/ieee-754-df.S (nedf2f): Don't check Qbit for NaNs.
* config/sh/ieee-754-sf.S (nesf2f): Likewise.
* config/sh/sh.md (cmpunsf_i1, cmpundf_i1): Likewise. 
(cmpnesf_i1, cmpnedf_i1): Clobber R2.

diff '--exclude=.svn' '--exclude=*.rej' '--exclude=*~' -ubrN 
gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-df.S 
gnu_trunk/gcc/gcc/config/sh/ieee-754-df.S
--- gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-df.S   2010-07-21 
18:04:17.0 +0200
+++ gnu_trunk/gcc/gcc/config/sh/ieee-754-df.S   2010-07-21 18:09:10.0 
+0200
@@ -92,11 +92,12 @@
HIDDEN_FUNC(GLOBAL(nedf2))
 GLOBAL(nedf2):
cmp/eq  DBL0L,DBL1L
-   mov.l   LOCAL(c_DF_NAN_MASK),r1
-   bf LOCAL(ne)
+   bf.s    LOCAL(ne)
+   mov #1,r0
cmp/eq  DBL0H,DBL1H
+   mov.l   LOCAL(c_DF_NAN_MASK),r1
+   bt.s    LOCAL(check_nan)
not DBL0H,r0
-   bt  LOCAL(check_nan)
mov DBL0H,r0
or  DBL1H,r0
add r0,r0
@@ -104,11 +105,17 @@
or  DBL0L,r0
 LOCAL(check_nan):
tst r1,r0
-   rts
+   bt.s    LOCAL(nan)
+   mov #12,r2
+   shll16  r2
+   xor r2,r1
+   tst r1,r0
+LOCAL(nan):
movt    r0
 LOCAL(ne):
rts
-   mov #1,r0
+   nop
+   
.balign 4
 LOCAL(c_DF_NAN_MASK):
.long DF_NAN_MASK
diff '--exclude=.svn' '--exclude=*.rej' '--exclude=*~' -ubrN 
gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-sf.S 
gnu_trunk/gcc/gcc/config/sh/ieee-754-sf.S
--- gnu_trunk.ref/gcc/gcc/config/sh/ieee-754-sf.S   2010-07-21 
18:04:18.0 +0200
+++ gnu_trunk/gcc/gcc/config/sh/ieee-754-sf.S   2010-07-21 18:09:10.0 
+0200
@@ -51,13 +51,19 @@
cmp/eq  r4,r5
mov.l   LOCAL(c_SF_NAN_MASK),r1
not r4,r0
-   bt  LOCAL(check_nan)
+   bt.s    LOCAL(check_nan)
mov r4,r0
or  r5,r0
rts
add r0,r0
 LOCAL(check_nan):
tst r1,r0
+   bt.s    LOCAL(nan)
+   mov #96,r2
+   shll16  r2
+   xor r2,r1
+   tst r1,r0
+ LOCAL(nan):
rts
movt    r0
.balign 4
diff '--exclude=.svn' '--exclude=*.rej' '--exclude=*~' -ubrN 
gnu_trunk.ref/gcc/gcc/config/sh/sh.md gnu_trunk/gcc/gcc/config/sh/sh.md
--- gnu_trunk.ref/gcc/gcc/config/sh/sh.md   2010-07-21 18:06:25.0 
+0200
+++ gnu_trunk/gcc/gcc/config/sh/sh.md   2010-07-22 09:13:12.0 +0200
@@ -10262,6 +10262,7 @@
(clobber (reg:SI T_REG))
(clobber (reg:SI PR_REG))
(clobber (reg:SI R1_REG))
+   (clobber (reg:SI R2_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
   "TARGET_SH1 && ! TARGET_SH2E"
   "jsr @%1%#"
@@ -10337,13 +10338,18 @@
 
 (define_insn "cmpunsf_i1"
   [(set (reg:SI T_REG)
-   (unordered:SI (match_operand:SF 0 "arith_reg_operand" "r,r")
- (match_operand:SF 1 "arith_reg_operand" "r,r")))
-   (use (match_operand:SI 2 "arith_reg_operand" "r,r"))
-   (clobber (match_scratch:SI 3 "=0,&r"))]
+   (unordered:SI (match_operand:SF 0 "arith_reg_operand" "r")
+ (match_operand:SF 1 "arith_reg_operand" "r")))
+ (use (match_operand:SI 2 "arith_reg_operand" "r"))
+ (clobber (match_scratch:SI 3 "=&r"))]
   "TARGET_SH1 && ! TARGET_SH2E"
-  "not\t%0,%3\;tst\t%2,%3\;not\t%1,%3\;bt\t0f\;tst\t%2,%3\;0:"
-  [(set_attr "length" "10")])
+"not\t%0,%3\;tst\t%2,%3\;bt.s\t0f
+\tnot\t%1,%3\;tst\t%2,%3\;bt.s\t0f
+\tmov\t#96,%3\;shll16\t%3\;xor\t%3,%2
+\tnot\t%0,%3\;tst\t%2,%3\;bt.s\t0f
+\tnot\t%1,%3\;tst\t%2,%3
+ 0:"
+[(set_attr "length" "28")])
 
 ;; ??? This is a lot of code with a lot of branches; a library function
 ;; might be better.
@@ -11069,6 +11075,7 @@
(clobber (reg:SI T_REG))
(clobber (reg:SI

Re: SH optimized software floating point routines

2010-07-21 Thread Christian Bruel

Hi Kaz,

Kaz Kojima wrote:



BTW, it looks that softfp __unord?f2 routines check signaling NaNs
only.  This makes __builtin_isnan return false for quiet NaNs for
which current fp-bit ones return true when -mieee enabled.  Perhaps
that change of behavior might be OK for software FP.


I use the attached patch to handle QNaNs in the assembly soft-fp.
It needs to be updated for trunk (and the dates in the changelogs updated). Will do.
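
As an illustration of the point under discussion, here is a minimal C++ sketch of a NaN test on raw single-precision bit patterns that does not depend on the quiet/signaling bit. It is not the code from the attached patch; the function names are hypothetical, and only the standard IEEE 754 binary32 layout (8 exponent bits, 23 mantissa bits) is assumed.

```cpp
#include <cstdint>

// True when the 32-bit pattern encodes any NaN: exponent field all ones
// and a non-zero mantissa. The top mantissa bit, which distinguishes quiet
// from signaling NaNs, is deliberately not inspected.
bool is_nan_bits(std::uint32_t bits) {
    const std::uint32_t exponent_mask = 0x7f800000u;
    const std::uint32_t mantissa_mask = 0x007fffffu;
    return (bits & exponent_mask) == exponent_mask &&
           (bits & mantissa_mask) != 0;
}

// Unordered comparison (hypothetical helper): true if at least one
// operand is a NaN of either kind.
bool unordered_bits(std::uint32_t a, std::uint32_t b) {
    return is_nan_bits(a) || is_nan_bits(b);
}
```

A check that only matches patterns with the top mantissa bit set would miss half of the NaN encodings, which is the behavior change Kaz points out for __builtin_isnan.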


Cheers

Christian

2010-04-20  Christian Bruel  

* gcc.dg/builtins-nan.c: New test.

2010-04-20  Christian Bruel  

* config/sh/ieee-754-df.S (nedf2f): Don't check Qbit for NaNs.
* config/sh/ieee-754-sf.S (nesf2f): Likewise.
* config/sh/sh.md (cmpunsf_i1, cmpundf_i1): Likewise. Clobber R2.





Index: gcc/config/sh/ieee-754-df.S
===
--- gcc/config/sh/ieee-754-df.S (revision 1352)
+++ gcc/config/sh/ieee-754-df.S (revision 1373)
@@ -88,11 +88,12 @@
HIDDEN_FUNC(GLOBAL(nedf2f))
 GLOBAL(nedf2f):
cmp/eq  DBL0L,DBL1L
+   bf.s    LOCAL(ne)
+   mov #1,r0
+   cmp/eq  DBL0H,DBL1H
mov.l   LOCAL(c_DF_NAN_MASK),r1
-   bf LOCAL(ne)
-   cmp/eq  DBL0H,DBL1H
-   not DBL0H,r0
-   bt  LOCAL(check_nan)
+   bt.s    LOCAL(check_nan)
+   not DBL0H,r0
mov DBL0H,r0
or  DBL1H,r0
add r0,r0
@@ -100,11 +101,17 @@
or  DBL0L,r0
 LOCAL(check_nan):
tst r1,r0
-   rts
+   bt.s    LOCAL(nan)
+   mov #12,r2
+   shll16  r2
+   xor r2,r1
+   tst r1,r0
+LOCAL(nan):
movt    r0
 LOCAL(ne):
rts
-   mov #1,r0
+   nop
+   
.balign 4
 LOCAL(c_DF_NAN_MASK):
.long DF_NAN_MASK

Index: gcc/config/sh/ieee-754-sf.S
===
--- gcc/config/sh/ieee-754-sf.S (revision 1352)
+++ gcc/config/sh/ieee-754-sf.S (revision 1373)
@@ -55,19 +55,27 @@
   the values are NaN.  */
cmp/eq  r4,r5
mov.l   LOCAL(c_SF_NAN_MASK),r1
+   bt.s    LOCAL(check_nan)
not r4,r0
-   bt  LOCAL(check_nan)
mov r4,r0
or  r5,r0
rts
add r0,r0
 LOCAL(check_nan):
tst r1,r0
+   bt.s    LOCAL(nan)
+   mov #96,r2
+   shll16  r2
+   xor r2,r1
+   tst r1,r0   
+LOCAL(nan):
rts
movt    r0
+   
.balign 4
 LOCAL(c_SF_NAN_MASK):
.long SF_NAN_MASK
+LOCAL(c_SF_SNAN_MASK):
ENDFUNC(GLOBAL(nesf2f))
 #endif /* L_nesf2f */
 
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md (revision 1352)
+++ gcc/config/sh/sh.md (revision 1373)
@@ -11182,6 +11182,7 @@
 (clobber (reg:SI T_REG))
 (clobber (reg:SI PR_REG))
 (clobber (reg:SI R1_REG))
+(clobber (reg:SI R2_REG))
 (use (match_operand:SI 1 "arith_reg_operand" "r"))]
"TARGET_SH1 && ! TARGET_SH2E"
"jsr @%1%#"
@@ -11257,13 +11258,18 @@
 
  (define_insn "cmpunsf_i1"
[(set (reg:SI T_REG)
-   (unordered:SI (match_operand:SF 0 "arith_reg_operand" "r,r")
- (match_operand:SF 1 "arith_reg_operand" "r,r")))
-(use (match_operand:SI 2 "arith_reg_operand" "r,r"))
-(clobber (match_scratch:SI 3 "=0,&r"))]
+   (unordered:SI (match_operand:SF 0 "arith_reg_operand" "r")
+ (match_operand:SF 1 "arith_reg_operand" "r")))
+(use (match_operand:SI 2 "arith_reg_operand" "r"))
+(clobber (match_scratch:SI 3 "=&r"))]
"TARGET_SH1 && ! TARGET_SH2E"
-   "not\t%0,%3\;tst\t%2,%3\;not\t%1,%3\;bt\t0f\;tst\t%2,%3\;0:"
-   [(set_attr "length" "10")])
+   "not\t%0,%3\;tst\t%2,%3\;bt.s\t0f
+\tnot\t%1,%3\;tst\t%2,%3\;bt.s\t0f
+\tmov\t#96,%3\;shll16\t%3\;xor\t%3,%2
+\tnot\t%0,%3\;tst\t%2,%3\;bt.s\t0f
+\tnot\t%1,%3\;tst\t%2,%3
+0:"
+   [(set_attr "length" "28")])
 
  ;; ??? This is a lot of code with a lot of branches; a library function
  ;; might be better.
@@ -11967,6 +11973,7 @@
(clobber (reg:SI T_REG))
(clobber (reg:SI PR_REG))
(clobber (reg:SI R1_REG))
+   (clobber (reg:SI R2_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
   "TARGET_SH1_SOFTFP"
   "jsr @%1%#"
@@ -12008,13 +12015,18 @@
 

fno-branch-count-reg misleading documentation woes

2008-05-28 Thread Christian BRUEL

hello,

The documentation for -fno-branch-count-reg explains that a
dec-and-test-branch instruction is replaced by an equivalent sequence of
instructions that decrement a register, compare it against 0, and branch
(see the use of the word *instead*).


This is not really true, since this option first of all disables the loop
reversal transformation (loop-init.c::gate_rtl_doloop). As such, the
generated code will not necessarily have a reversed, decrementing loop count,
and the register-testing sequence will not necessarily be a
decrement-test-branch sequence.


Another comment: with -fbranch-count-reg the instruction is not necessarily a
"decrement and branch" but could also be a "decrement and compare", as
on the SH.
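
To make the two loop shapes concrete, here is a source-level C++ sketch. It is only an illustration: the actual transformation happens on RTL inside the compiler, and the function names are hypothetical. The first function keeps a count-up induction variable; the second has the reversed form, where a dedicated counter is decremented and tested against zero, which is the shape a "decrement and branch" or "decrement and compare" instruction can implement.

```cpp
#include <cstddef>

// Count-up form: the index is compared against n on every iteration.
long sum_count_up(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Reversed form: a separate counter runs down to zero, so the loop test
// is "decrement, compare with 0, branch".
long sum_count_down(const int* data, std::size_t n) {
    long total = 0;
    const int* p = data;
    for (std::size_t remaining = n; remaining != 0; --remaining) {
        total += *p++;
    }
    return total;
}
```

Both functions compute the same result; the point above is that with -fno-branch-count-reg the compiler may not produce the second shape at all, rather than producing it with separate instructions.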


Would a rephrasing like the following be more accurate?

thanks.

-c

Index: invoke.texi
===
--- invoke.texi (revision 135611)
+++ invoke.texi (working copy)
@@ -5420,10 +5420,7 @@

 @item -fno-branch-count-reg
 @opindex fno-branch-count-reg
-Do not use ``decrement and branch'' instructions on a count register,
-but instead generate a sequence of instructions that decrement a
-register, compare it against zero, then branch based upon the result.
-This option is only meaningful on architectures that support such
+Do not use ``decrement and branch/compare'' instructions on a count register. By setting this flag, loop inversion will be disabled. This option is only meaningful on architectures that support such
 instructions, which include x86, PowerPC, IA-64 and S/390.

 The default is @option{-fbranch-count-reg}.


Re: -mfmovd enabled by default for SH2A but not for SH4

2008-02-25 Thread Christian BRUEL

Hello,

Looks like you are mixing ABIs. What is your fpscr setting?

From my understanding, if the fpscr PR bit is set to 0, the 64-bit
operation behaves as two 32-bit operations (paired single precision), so I
don't think you get an address error here.


The well-defined behavior of the fmov instruction depends on the
endianness and on the SZ/PR bit settings in the fpscr register. My guess is
that the default gcc choice of a 32-bit fmov instruction is the one that
best matches all sh4 configurations (SZ/PR=1 is even undefined for some
cores).
Changing the default would be possible if you change your ABI or have
another multilib setting for startup files. But the current situation is
that it is usually left to users to explicitly provide their own fpscr
setting when they want to change the fmov size and alignment.
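
The address error discussed in this thread is tied to 64-bit accesses that do not sit on an 8-byte ("double longword") boundary. As a rough sketch, assuming nothing about the SH toolchain itself, the following C++ shows how such an alignment condition can be checked and requested in source; the helper and struct names are hypothetical.

```cpp
#include <cstdint>

// True when the address is a multiple of 8, i.e. on a double-longword
// boundary.
bool is_double_longword_aligned(const void* address) {
    return reinterpret_cast<std::uintptr_t>(address) % 8 == 0;
}

// Requests 8-byte alignment explicitly, independent of the ABI default
// for double.
struct alignas(8) AlignedDouble {
    double value;
};
```

Whether a misaligned access actually faults depends on the core and on the fpscr settings described above; the check only makes the alignment assumption visible.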


Cheers,

Christian


Naveen H.S. wrote:

> Hi,
>
> > Have you got this error on the real SH2A-FPU hardware?
>
> Yes, we got this error on SH72513 (SH2A) hardware. When the same code
> is run on the simulator, the "address error" occurs on encountering the
> "fmov.d" instruction.
>
> > couldn't find any description for 8-byte alignment restriction for
> > double data on memory in my SH2A manual
>
> Please refer to section 3.3, "address errors", in the SH2A software
> manual at the following link:
>
> http://documentation.renesas.com/eng/products/mpumcu/rej09b0051_sh2a.pdf
>
> It is mentioned that "Double longword data accessed from other than
> double longword boundary" results in an address error.
>
> Regards,
> Naveen.H.S.
> KPIT Cummins Infosystems Ltd,
> Pune (INDIA)





Re: C++: operator new and disabled exceptions

2007-09-28 Thread Christian BRUEL

hello,

there is a difference between calling new and new (std::nothrow) from an
-fno-exceptions context:


- new (std::nothrow) returns 0 on failure
- plain new throws std::bad_alloc, which ends up in
  std::terminate() or abort()


So there is a possible difference in behavior if an -fno-exceptions
application relies on std::terminate().


So if you patch gcc to use the nothrow form internally, you would need to
compile all your applications, libraries and runtimes with
-fcheck-new.
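
For reference, a minimal C++ sketch of the nothrow form and of the null check it forces on callers. The Widget type and both helper names are hypothetical, chosen only for illustration.

```cpp
#include <new>

struct Widget {
    int value;
};

// Uses the non-throwing allocation function: on failure the expression
// yields a null pointer instead of throwing std::bad_alloc.
Widget* make_widget_nothrow(int value) {
    return new (std::nothrow) Widget{value};
}

// Caller-side pattern: the result must be checked before it is used.
int read_or_default(const Widget* widget, int fallback) {
    return widget != nullptr ? widget->value : fallback;
}
```

Code written against plain new usually has no such check, which is why silently switching the allocation function changes what happens on an out-of-memory condition.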


Christian

Christophe LYON wrote:

> Hello,
>
> I have already asked this question on gcc-help (see
> http://gcc.gnu.org/ml/gcc-help/2007-09/msg00328.html), but I would like
> advice from GCC developers.
>
> Basically, when I compile with -fno-exceptions, I wonder why the G++
> compiler still generates calls to the standard new operator (the one
> that throws bad_alloc when it runs out of memory), rather than
> new(nothrow) (_ZnwjRKSt9nothrow_t)?
>
> In addition, do you think I can patch my GCC such that it calls
> new(nothrow) when compiling with -fno-exceptions, or is it a bad idea?
> (compatibility issues, ...)
>
> Thanks for your recommendation,
>
> Christophe.





-fstrict-overflow example 4.2 status

2007-07-11 Thread Christian BRUEL

hello,

The example provided with the -fstrict-overflow description on the
http://gcc.gnu.org/gcc-4.2/changes.html page is not optimized as described.


Is it only a documentation bug? The example is optimized as expected on
the trunk.
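
For readers unfamiliar with the option, here is a small C++ sketch of the kind of expression -fstrict-overflow is about. It is an assumption chosen for illustration, not the example from the changes page, and the function name is hypothetical.

```cpp
// Signed overflow is undefined in C and C++, so a compiler that assumes
// it never happens may fold this comparison to `true` for every input.
// For a == INT_MAX the addition overflows and the behavior is undefined.
bool next_is_greater(int a) {
    return a + 1 > a;
}
```

Whether a given compiler version actually performs the fold is exactly the kind of thing that has to be checked in the generated code, as in the question above.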


Regards,

-c