------- Comment #19 from dominiq at lps dot ens dot fr  2008-04-03 15:04 -------
(In reply to comment #17)
> I tried the patch referenced in comment #16 on powerpc64-linux.  It
> allows tests pr11832.c and pr33009 to pass, but adding
> -frtl-abstract-sequences to all tests causes several tests to timeout on
> compilation, and gcc.dg/pr32912-1.c fails at execution for -m32.  I built
> the C tests from SPEC CPU2000 with "-O2 -frtl-abstract-sequences" using
> the patched compiler and ran them with the short test input.  A couple
> took a very long time to compile (try, for example, the file toke.c in
> test perlbmk) and four failed at execution time: vpr and crafty for -m32,
> gcc and gap for -m64.

I am not sure I understand: do you mean that the failures were
introduced by the patch? That is, did you run the same tests without the
patch? I don't know how '-frtl-abstract-sequences' works, but I would not
be surprised if its cost were quadratic in the size of the problem.

The gcc manual says:

...
-O3
Optimize yet more.  -O3 turns on all optimizations specified by -O2 and
also turns on the -finline-functions, -funswitch-loops,
-fpredictive-commoning, -fgcse-after-reload and -ftree-vectorize options.
...
-frtl-abstract-sequences
It is a size optimization method.  This option is to find identical
sequences of code, which can be turned into pseudo-procedures and then
replace all occurrences with calls to the newly created subroutine.  It is
kind of an opposite of -finline-functions.  This optimization runs at RTL
level.
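
To make the manual's description concrete, here is a toy sketch of the idea
in Python. It is NOT how GCC implements the pass: the instruction strings,
the fixed window length and the name 'seq0' are all made up for
illustration, and real RTL abstraction has to deal with registers, labels
and call overhead that this ignores.

```python
from collections import defaultdict

def abstract_sequences(insns, length):
    """Toy procedural abstraction on a list of instruction strings.

    Replaces the most frequent repeated window of `length` instructions
    with a call to one shared pseudo-procedure (hypothetical name: seq0).
    Returns (rewritten instruction list, procedure body or None).
    """
    # Record the start positions of every window of `length` instructions.
    positions = defaultdict(list)
    for i in range(len(insns) - length + 1):
        positions[tuple(insns[i:i + length])].append(i)

    # Keep the window with the most non-overlapping occurrences.
    best, best_starts = None, []
    for window, starts in positions.items():
        chosen, last_end = [], 0
        for s in starts:
            if s >= last_end:
                chosen.append(s)
                last_end = s + length
        if len(chosen) > len(best_starts):
            best, best_starts = window, chosen

    # Nothing is repeated: leave the code alone.
    if len(best_starts) < 2:
        return list(insns), None

    # Rewrite the stream, replacing each chosen occurrence with a call.
    starts = set(best_starts)
    out, i = [], 0
    while i < len(insns):
        if i in starts:
            out.append("call seq0")
            i += length
        else:
            out.append(insns[i])
            i += 1
    return out, list(best) + ["ret"]

code = ["a", "b", "c", "x", "a", "b", "c", "y"]
rewritten, body = abstract_sequences(code, 3)
print(rewritten)
print(body)
```

In this small example the rewritten code plus the new procedure body is no
shorter than the original, which hints at why the gain in executable size
can be small.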

Although I have no doubt that examples can be crafted that show a speed
increase with '-O3 -frtl-abstract-sequences', the net result of this
combination will probably be a (much) longer compilation time and a poor
overall result.  It seems to me that '-frtl-abstract-sequences' is
intended more as a complement to the '-Os' option, so I did the following
comparison on the polyhedron benchmark (I don't have access to SPEC).  Three
tests failed to compile, and two (aermod and fatigue, starred in the first
table below) showed a decrease of less than 10% in executable size at the
cost of more than a factor of 3 in compile time.

Core2Duo 2.16 GHz, i686-apple-darwin9,
gcc version 4.4.0 20080403 (experimental) (GCC)
+ patches (including the patch referenced in comment #16):

================================================================================
Polyhedron Benchmark Validator
Copyright (C) Polyhedron Software Ltd - 2004 - All rights reserved

================================================================================

                     -Os                    -Os -frtl-...           -m64 -O3 ...

  Benchmark   Compile  Executable       Compile  Executable       Compile  Executable
       Name    (secs)     (bytes)        (secs)     (bytes)        (secs)     (bytes)
  ---------   -------  ----------       -------  ----------       -------  ----------
         ac      0.92       25748          1.82       25748          4.32       46616
     aermod*    80.06      980996        298.68      931844        106.97     1225304
        air      4.24       56048          8.41       56048          6.97       73200
   capacita      1.86       35232          1.91       35232          3.78       64520
    channel      0.85       25888          1.91           0          2.61       42752
      doduc      8.47      117396          9.11      117396         15.18      183600
    fatigue*     2.94       55556         18.51       51460          6.18       76696
    gas_dyn      2.20      646272          2.91      642176          6.39      700392
     induct      8.50      131152         29.73           0         13.41      160672
      linpk      0.78       17508          0.66       17508          1.58       38400
       mdbx      2.52       47760          2.64       47760          3.92       68856
         nf      0.90       21804          0.93       21804         24.34      153240
    protein      3.46       56228          6.49           0         11.10      118240
     rnflow      4.07       55936          4.79       55936         11.31      167240
   test_fpu      3.29       46996          3.89       46996         10.35      154176
       tfft      0.62       17704          0.62       17704          1.42       26392
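
The trade-off between size and compile time in the table above can be
checked with a few lines of Python. This is only a sketch: the numbers are
copied by hand from the '-Os' and '-Os -frtl-...' columns for the three
tests whose executable size changed, and the helper name 'tradeoff' is made
up.

```python
# (name, -Os compile secs, -Os bytes, -Os -frtl-... compile secs, bytes),
# copied from the table for the tests whose executable size changed.
rows = [
    ("aermod",  80.06, 980996, 298.68, 931844),
    ("fatigue",  2.94,  55556,  18.51,  51460),
    ("gas_dyn",  2.20, 646272,   2.91, 642176),
]

def tradeoff(base_secs, base_bytes, seq_secs, seq_bytes):
    """Return (size reduction in percent, compile-time ratio)."""
    reduction = 100.0 * (base_bytes - seq_bytes) / base_bytes
    slowdown = seq_secs / base_secs
    return reduction, slowdown

results = {name: tradeoff(*values) for name, *values in rows}

for name, (reduction, slowdown) in results.items():
    print(f"{name:>8}: size -{reduction:.2f}%, compile time x{slowdown:.2f}")
```

Only aermod and fatigue combine a size reduction with a compile time more
than three times longer; gas_dyn shrinks by well under 1% with a much
smaller compile-time penalty.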

  Benchmark   Ave Run  Number   Estim   Ave Run  Number   Estim   Ave Run  Number   Estim
       Name    (secs) Repeats   Err %    (secs) Repeats   Err %    (secs) Repeats   Err %
  ---------   ------- -------  ------   ------- -------  ------   ------- -------  ------
         ac     23.40       2  0.1197     23.45       2  0.0149     12.59       2  0.0318
     aermod     44.30       2  0.0508     44.76       2  0.1251     29.77       2  0.1411
        air     14.43       2  0.0139     15.06       5  1.2305      8.57       3  0.1683
   capacita     82.95       2  0.0645     82.25       2  0.1088     55.79       2  0.1434
    channel      7.34       5  0.1510     -1.00       2  0.1088      2.41       5  1.6186
      doduc     61.77       2  0.0559     61.66       2  0.0592     42.85       2  0.0198
    fatigue     18.19       2  0.0825     18.09       4  0.1745     10.65       2  0.1502
    gas_dyn     29.75       3  0.1419     29.90       3  0.1623     10.21       2  0.1322
     induct     94.22       2  0.0085     -1.00       3  0.1623     61.04       2  0.0549
      linpk     28.37       2  0.0599     31.07       3  0.1955     28.23       2  0.0797
       mdbx     17.47       2  0.0544     17.54       2  0.0171     15.14       2  0.0429
         nf     35.04       2  0.0357     35.03       5  0.0733     32.39       2  0.0232
    protein     57.37       2  0.0148     -1.00       5  0.0733     45.70       2  0.0088
     rnflow     54.89       2  0.0173     54.87       2  0.0802     37.10       2  0.0526
   test_fpu     22.33       2  0.0224     22.40       2  0.1407     12.73       2  0.1688
       tfft      3.30       2  0.0909      3.31       2  0.0905      2.93       5  0.0907

Geom Mean  Time =    27.63s                                              17.86s
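
For reference, the geometric means can be recomputed from the run times
listed above. The sketch below assumes the validator uses the plain
geometric mean of the 'Ave Run' column; the list and function names are
made up, and the values are copied by hand from the '-Os' and
'-m64 -O3 ...' columns.

```python
import math

# 'Ave Run' times in seconds, in table order (ac ... tfft).
os_times = [23.40, 44.30, 14.43, 82.95, 7.34, 61.77, 18.19, 29.75,
            94.22, 28.37, 17.47, 35.04, 57.37, 54.89, 22.33, 3.30]
o3_times = [12.59, 29.77, 8.57, 55.79, 2.41, 42.85, 10.65, 10.21,
            61.04, 28.23, 15.14, 32.39, 45.70, 37.10, 12.73, 2.93]

def geometric_mean(values):
    """Geometric mean computed through logarithms to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(f"-Os          : {geometric_mean(os_times):.2f}s")
print(f"-m64 -O3 ... : {geometric_mean(o3_times):.2f}s")
```

The '-Os -frtl-...' column contains -1.00 entries for the tests that did
not build, so a geometric mean cannot be computed for it in the same way.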

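where '-frtl-...' stands for '-frtl-abstract-sequences' and '-m64 -O3 ...'
stands for '-m64 -O3 -ffast-math -funroll-loops -ftree-loop-linear
-fomit-frame-pointer -finline-limit=600 --param min-vect-loop-bound=2'.
The three tests that failed to compile (channel, induct and protein) appear
with an executable size of 0 and a run time of -1.00; the errors are: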

[ibook-dhum] lin/test% gfc -Os -frtl-abstract-sequences channel.f90
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccM5Aq1n.s:531:non-relocatable subtraction expression, "L43" minus "L00000000002$pb"
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccM5Aq1n.s:531:symbol: "L00000000002$pb" can't be undefined in a subtraction expression
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccM5Aq1n.s:509:non-relocatable subtraction expression, "L42" minus "L00000000002$pb"
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccM5Aq1n.s:509:symbol: "L00000000002$pb" can't be undefined in a subtraction expression
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccM5Aq1n.s:unknown:Undefined local symbol L00000000002$pb
[ibook-dhum] lin/test% gfc -Os -frtl-abstract-sequences induct.f90
induct.f90: In function 'gen_resq_mesh':
induct.f90:3704: internal compiler error: in compensate_edge, at reg-stack.c:2759
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
[ibook-dhum] lin/test% gfc -Os -frtl-abstract-sequences protein.f90
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccs13LFN.s:7048:non-relocatable subtraction expression, "L729" minus "L00000000010$pb"
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccs13LFN.s:7048:symbol: "L00000000010$pb" can't be undefined in a subtraction expression
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccs13LFN.s:6755:non-relocatable subtraction expression, "L730" minus "L00000000010$pb"
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccs13LFN.s:6755:symbol: "L00000000010$pb" can't be undefined in a subtraction expression
/var/folders/iU/iUj3xngxGYe3MPCc0TZUcE+++TI/-Tmp-//ccs13LFN.s:unknown:Undefined local symbol L00000000010$pb

On a G5 1.8 GHz, powerpc-apple-darwin9, gcc version 4.4.0 20080403
(experimental) (GCC)
+ patches (including the patch referenced in comment #16), channel.f90 and
protein.f90 compile, but induct.f90 gives an ICE:

[karma] lin/test% gfc -Os -frtl-abstract-sequences induct.f90
induct.f90: In function 'convert_lower_case':
induct.f90:719: error: unrecognizable insn:
(jump_insn 45 19 40 3 (return) -1 (nil))
induct.f90:719: internal compiler error: in extract_insn, at recog.c:1983
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

Note that all the tests fail on both platforms with gfortran 4.3.0 and
'-Os -frtl-abstract-sequences':

[ibook-dhum] lin/test% gfortran -w -Os -frtl-abstract-sequences ac.f90
ac.f90: In function 'suscep':
ac.f90:761: error: unrecognizable insn:
(insn 114 0 0 (set (reg:SI 0 ax)
        (symbol_ref:SI ("*L8") [flags 0x2])) -1 (nil))
ac.f90:761: internal compiler error: in insn_default_length, at insn-attrtab.c:1339
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.
...



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33642
