Hey all, apologies for the long delay. I haven't had time until recently to
look into this further.
>>> The zero extract now matching against other modes would generate a
>>> test + branch rather than the combined instruction which led to the
>>> code size regression. I've updated the patch so that tbnz etc. matches GPI
>>> and that brings code size down to <0.2% in spec2017 and <0.4% in spec2006.
>>
>>That's looking better indeed. I notice there are still differences, eg.
>>tbz/tbnz counts are significantly different in perlbench, with ~350 missed
>>cases overall (mostly tbz reg, #7).
>>
>>There are also more uses of uxtw, ubfiz, sbfiz - for example I see cases
>>like this in namd:
>>
>>  42c7dc:       13007400        sbfx    w0, w0, #0, #30
>>  42c7e0:       937c7c00        sbfiz   x0, x0, #4, #32
>>
>>So it would be a good idea to check any benchmarks where there is still a
>>non-trivial codesize difference. You can get a quick idea what is happening
>>by grepping for instructions like this:
>>
>>grep -c sbfiz out1.txt out2.txt
>>out1.txt:872
>>out2.txt:934
>>
>>grep -c tbnz out1.txt out2.txt
>>out1.txt:5189
>>out2.txt:4989

That's really good insight, Wilco! I took a look at the tbnz/tbz case in
perlbench. We lose the match because allowing SI mode on extv/extzv causes
subst in combine.c to generate:

(lshiftrt:SI (reg:SI 107 [ _16 ])
    (const_int 7 [0x7]))
(nil)

Instead of:

(and:DI (lshiftrt:DI (subreg:DI (reg:SI 107 [ _16 ]) 0)
        (const_int 7 [0x7]))
    (const_int 1 [0x1]))

The latter form is picked up in make_compound_operation_int and transformed
into a zero_extract, while the new form is left alone. An lshiftrt generally
can't be reduced to a bit test, but in this case it can because we have
known-zero-bits information for the operand. Given that, and after looking
around try_combine, the best place to detect this pattern seems to be the
second-chance code after the first failure of recog_for_combine, which is what
this patch does. I think this is the right place for the fix, since changing
subst or make_compound_operation_int leads to significantly more diffs.
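
To illustrate why the known-zero bits matter, here is a minimal sketch in
Python (not GCC code). It assumes, purely for illustration, that only the low
8 bits of the register can be nonzero (mask 0xff); the actual mask for reg 107
is not shown in the dump. The function names are hypothetical.

```python
def is_single_bit_test(nonzero_mask: int, shift: int) -> bool:
    """True if (x >> shift) can only be 0 or 1 for every x within nonzero_mask.

    In that case "(x >> shift) != 0" tests exactly bit `shift` of x.
    """
    return (nonzero_mask >> shift) <= 1


def brute_force_equivalent(nonzero_mask: int, shift: int) -> bool:
    """Check the equivalence exhaustively for all x whose set bits lie in the mask."""
    for x in range(nonzero_mask + 1):
        if x & ~nonzero_mask:
            continue  # x has a bit set outside the assumed nonzero mask
        shifted_nonzero = (x >> shift) != 0
        bit_set = bool(x & (1 << shift))
        if shifted_nonzero != bit_set:
            return False
    return True


# Assumed mask 0xff, shift 7 (as in the lshiftrt by 7 above): a pure bit test.
print(is_single_bit_test(0xFF, 7), brute_force_equivalent(0xFF, 7))
# With one more possibly-nonzero bit the shift is no longer a single-bit test.
print(is_single_bit_test(0x1FF, 7), brute_force_equivalent(0x1FF, 7))
```

With the wider mask, a value such as 0x100 makes the shifted result nonzero
while bit 7 is clear, which is why the reduction is only valid when the upper
bits are known to be zero.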

After this change the total number of tbnz/tbz instructions lines up nearly
identically with the baseline, which is good, and overall size is within 0.1%
on SPEC 2017 and SPEC 2006. However, looking further at ubfiz, there's a
pretty large increase in certain benchmarks. I looked into SPEC 2017 blender,
where we fail to combine this pattern:

Trying 653 -> 654:
  653: r512:SI=r94:SI 0>>0x8
      REG_DEAD r94:SI
  654: r513:DI=zero_extend(r512:SI)
      REG_DEAD r512:SI
Failed to match this instruction:
(set (reg:DI 513)
    (zero_extract:DI (reg:SI 94 [ bswapdst_4 ])
        (const_int 8 [0x8])
        (const_int 8 [0x8])))

Where previously we combined it like this:

Trying 653 -> 654:
  653: r512:SI=r94:SI 0>>0x8
      REG_DEAD r94:SI
  654: r513:DI=zero_extend(r512:SI)
      REG_DEAD r512:SI
Successfully matched this instruction:
(set (reg:DI 513)
    (zero_extract:DI (subreg:DI (reg:SI 94 [ bswapdst_4 ]) 0) // subreg used
        (const_int 8 [0x8])
        (const_int 8 [0x8])))

Here's where I'm at an impasse. The code that generates the modes in
get_best_reg_extraction_insn now looks at the inner mode of SI, since extzvsi
is valid, and generates a non-subreg use. However, the MD pattern expects all
modes to be DI or all to be SI, not a mix. I think a fix could canonicalize
these extracts to the same mode, but I am unsure whether a mode-mismatched
extract RTX is valid in general, which would make this a fairly large change.
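
For reference, the value that both dumps compute can be sketched in Python as
below. This is only an arithmetic illustration, not GCC code. It assumes the
source register has no bits set above bit 15 (mask 0xffff); that is a guess
based on the 8-bit extract at position 8, since the dump does not show the
known-zero bits. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shift_then_extend(r94: int) -> int:
    """Model insns 653 and 654: SImode logical shift right by 8, then zero_extend."""
    r512 = (r94 & MASK32) >> 8  # 653: r512:SI = r94:SI 0>> 8
    return r512                 # 654: zero_extend to DImode keeps the value


def zero_extract(value: int, size: int, pos: int) -> int:
    """Model (zero_extract value size pos) with pos counted from the LSB."""
    return (value >> pos) & ((1 << size) - 1)


# Under the assumed 16-bit mask, the two forms agree for every input.
assert all(shift_then_extend(x) == zero_extract(x, 8, 8) for x in range(0x10000))

# Without that assumption they differ, e.g. for 0x123456.
print(hex(shift_then_extend(0x123456)), hex(zero_extract(0x123456, 8, 8)))
```

The arithmetic is the same whether the source operand is written as
(reg:SI 94) or as (subreg:DI (reg:SI 94) 0); only the mode of the operand in
the RTX differs, which is what the MD pattern is sensitive to.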

The latest patch with the fix for tbnz/tbz is attached, along with the SPEC
size numbers and the SPEC 2017 instruction counts for reference.
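
Per-mnemonic counts like the attached ones can be gathered with something
along the lines of the sketch below. It is not the script used for the
attached numbers. It assumes objdump-style disassembly lines of the form
"address: encoding mnemonic operands", as in the namd snippet quoted above,
and the function names are hypothetical. Unlike grep -c, it compares the whole
mnemonic rather than a substring.

```python
import re
from collections import Counter

# Matches lines such as "  42c7dc:  13007400  sbfx  w0, w0, #0, #30"
# and captures the mnemonic. Label lines ("... <func>:") do not match.
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\s+[0-9a-f]{8}\s+([a-z][a-z0-9.]*)")

DEFAULT_MNEMONICS = ("tbnz", "tbz", "sbfiz", "ubfiz", "uxtw", "sxtw")


def count_mnemonics(lines, wanted=DEFAULT_MNEMONICS):
    """Count how often each wanted mnemonic appears in the disassembly lines."""
    counts = Counter()
    for line in lines:
        match = INSN_RE.match(line)
        if match and match.group(1) in wanted:
            counts[match.group(1)] += 1
    return counts


def diff_counts(base_lines, diff_lines, wanted=DEFAULT_MNEMONICS):
    """Return {mnemonic: (base, diff, diff - base)} for two disassemblies."""
    base = count_mnemonics(base_lines, wanted)
    diff = count_mnemonics(diff_lines, wanted)
    return {m: (base[m], diff[m], diff[m] - base[m]) for m in wanted}
```

Usage would be along the lines of
diff_counts(open("out1.txt"), open("out2.txt")), where out1.txt and out2.txt
are the disassembly dumps of the base and patched binaries.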

>>> Can you send me the necessary documents to make that happen? Thanks!
>>
>>That's something you need to sort out with the fsf. There is a mailing list 
>>for this:
>>mailto:ass...@gnu.org.

I haven't had any response from my previous mail there. Should I add one of you 
to the CC or mail someone specifically to get traction?

Best,
Modi

Attachment: pr86901.patch
Description: pr86901.patch

SPEC CPU2006 section sizes. Paths are relative to {base,diff}/benchspec/CPU2006/;
binaries carry the suffix .gcc9-base or .gcc9-diff.

                                                        base                                  diff
binary                                       text     data      bss    total       text     data      bss    total  % increase
400.perlbench/exe/perlbench_base          1038264   243802    12472  1294538    1038344   243714    12472  1294530       0.01%
401.bzip2/exe/bzip2_base                    72024    16030     4328    92382      72016    16030     4328    92374      -0.01%
403.gcc/exe/gcc_base                      2651976   816398   749792  4218166    2652096   816558   749792  4218446       0.00%
429.mcf/exe/mcf_base                         8232     4803    11912    24947       8232     4803    11912    24947       0.00%
433.milc/exe/milc_base                     105912    31228    37176   174316     105904    31228    37176   174308      -0.01%
444.namd/exe/namd_base                     205752    25512      496   231760     204792    25288      496   230576      -0.47%
445.gobmk/exe/gobmk_base                   757960  2917634  2328968  6004562     757952  2917634  2328968  6004554       0.00%
447.dealII/exe/dealII_base                2180808   880678     3528  3065014    2181248   880670     3528  3065446       0.02%
450.soplex/exe/soplex_base                 345832    79234     1584   426650     345936    79234     1584   426754       0.03%
453.povray/exe/povray_base                 779768   225837   161496  1167101     779904   225965   161496  1167365       0.02%
456.hmmer/exe/hmmer_base                   249528    69146    81944   400618     249536    69146    81944   400626       0.00%
458.sjeng/exe/sjeng_base                   125080    38096  2576288  2739464     125080    38096  2576288  2739464       0.00%
462.libquantum/exe/libquantum_base          30452    11213       96    41761      30420    11221       96    41737      -0.11%
464.h264ref/exe/h264ref_base               567320   127533   371064  1065917     567304   127533   371064  1065901       0.00%
470.lbm/exe/lbm_base                         8376     4584       24    12984       8368     4584       24    12976      -0.10%
471.omnetpp/exe/omnetpp_base               484536   216255    14528   715319     484528   216255    14528   715311       0.00%
473.astar/exe/astar_base                    33256    10075     5152    48483      33248    10075     5152    48475      -0.02%
482.sphinx3/exe/sphinx_livepretend_base    155608    53889    32888   242385     155600    53889    32888   242377      -0.01%
483.xalancbmk/exe/Xalan_base              2930312  1698488    11544  4640344    2930304  1698488    11544  4640336       0.00%
998.specrand/exe/specrand_base               1016     1728        8     2752       1016     1728        8     2752       0.00%
999.specrand/exe/specrand_base               1016     1728        8     2752       1016     1728        8     2752       0.00%
total (text)                             12733028                              12732844                                  0.00%

SPEC CPU2017 instruction counts. Rows are the diff binaries under
diff/benchspec/CPU/ (suffix .diff-64); delta = diff - base.

                                                   tbnz                tbz                sbfiz               ubfiz                uxtw                sxtw
binary                                        base  diff delta    base  diff delta    base  diff delta    base  diff delta    base  diff delta    base  diff delta
500.perlbench_r/exe/perlbench_r_base          4394  4396     2    4442  4444     2     215   215     0     322   323     1     328   328     0    3217  3234    17
502.gcc_r/exe/cpugcc_r_base                   7755  7833    78    6645  6728    83    1235  1236     1    2801  3018   217   10271 10271     0    9770  9794    24
505.mcf_r/exe/mcf_r_base                        19    19     0      12    12     0       1     1     0       0     0     0       0     0     0      25    25     0
508.namd_r/exe/namd_r_base                     148   148     0      65    65     0     724   786    62    3302  3475   173     735   735     0    5057  5121    64
510.parest_r/exe/parest_r_base                1295  1295     0    1447  1448     1    1694  1694     0    5775  5899   124    7367  7368     1   11068 11072     4
511.povray_r/exe/imagevalidate_511_base          5     5     0       8     8     0       3     3     0      25    27     2       2     2     0      13    13     0
511.povray_r/exe/povray_r_base                 312   312     0     438   438     0     245   245     0     140   154    14     165   165     0     982   987     5
519.lbm_r/exe/lbm_r_base                         3     3     0       3     3     0       5     5     0       0     0     0       0     0     0       8     8     0
520.omnetpp_r/exe/omnetpp_r_base               690   690     0     361   361     0     336   336     0      26    26     0     179   179     0    1216  1216     0
523.xalancbmk_r/exe/cpuxalan_r_base            446   451     5     905   905     0     138   138     0    1108  1249   141    3946  3946     0     893   893     0
526.blender_r/exe/blender_r_base              6750  6761    11    7849  7855     6    2125  2125     0     813   952   139    1341  1340    -1    8339  8358    19
526.blender_r/exe/imagevalidate_526_base         5     5     0       8     8     0       3     3     0      25    27     2       2     2     0      13    13     0
531.deepsjeng_r/exe/deepsjeng_r_base            43    43     0      13    13     0      18    18     0      19    23     4       9     9     0     414   424    10
538.imagick_r/exe/imagevalidate_538_base         5     5     0       8     8     0       3     3     0      25    27     2       2     2     0      13    13     0
538.imagick_r/exe/imagick_r_base               340   340     0     497   497     0       2     2     0     682   682     0      75    75     0     157   158     1
541.leela_r/exe/leela_r_base                    24    24     0      26    26     0      13    13     0       6    10     4      44    44     0     595   595     0
544.nab_r/exe/nab_r_base                        56    56     0      70    70     0      92    92     0      36    47    11      10    10     0     454   454     0
557.xz_r/exe/xz_r_base                          22    22     0       9     9     0       1     1     0     115   118     3     251   251     0      28    28     0
997.specrand_fr/exe/specrand_fr_base             0     0     0       0     0     0       0     0     0       0     0     0       0     0     0       4     4     0
999.specrand_ir/exe/specrand_ir_base             0     0     0       0     0     0       0     0     0       0     0     0       0     0     0       4     4     0

SPEC CPU2017 section sizes. Paths are relative to {base,diff}/benchspec/CPU/;
binaries carry the suffix .base-64 or .diff-64.

                                                          base                                  diff
binary                                         text     data      bss    total       text     data      bss    total  % increase
500.perlbench_r/exe/perlbench_r_base        1775240   561457     8760  2345457    1775432   561305     8760  2345497       0.01%
502.gcc_r/exe/cpugcc_r_base                 7755224  2191963  1146680 11093867    7754488  2192051  1146680 11093219      -0.01%
505.mcf_r/exe/mcf_r_base                      23064     6391      728    30183      23064     6391      728    30183       0.00%
508.namd_r/exe/namd_r_base                   677176    30046      960   708182     677816    30046      960   708822       0.09%
510.parest_r/exe/parest_r_base              7082456  1704125    16128  8802709    7084584  1704145    16128  8804857       0.03%
511.povray_r/exe/imagevalidate_511_base       14072     9141       24    23237      14072     9141       24    23237       0.00%
511.povray_r/exe/povray_r_base               789000   246142   188784  1223926     789032   246134   188784  1223950       0.00%
519.lbm_r/exe/lbm_r_base                      10648     4744       24    15416      10640     4744       24    15408      -0.08%
520.omnetpp_r/exe/omnetpp_r_base            1505352   744909    46536  2296797    1505344   744925    46536  2296805       0.00%
523.xalancbmk_r/exe/cpuxalan_r_base         3992744  1933641    14752  5941137    3992736  1933641    14752  5941129       0.00%
526.blender_r/exe/blender_r_base            7994116  7841208   417344 16252668    7994276  7841400   417344 16253020       0.00%
526.blender_r/exe/imagevalidate_526_base      14072     9141       24    23237      14072     9141       24    23237       0.00%
531.deepsjeng_r/exe/deepsjeng_r_base          76264    16488 12138272 12231024      76296    16488 12138272 12231056       0.04%
538.imagick_r/exe/imagevalidate_538_base      14072     9080       24    23176      14072     9080       24    23176       0.00%
538.imagick_r/exe/imagick_r_base            1772984   408975     5520  2187479    1772152   409119     5520  2186791      -0.05%
541.leela_r/exe/leela_r_base                 174616    46829    30032   251477     174624    46829    30032   251485       0.00%
544.nab_r/exe/nab_r_base                     166616    39807   381952   588375     166616    39807   381952   588375       0.00%
557.xz_r/exe/xz_r_base                       131672    69274    17576   218522     131680    69274    17576   218530       0.01%
997.specrand_fr/exe/specrand_fr_base           2440     2361     5008     9809       2440     2361     5008     9809       0.00%
999.specrand_ir/exe/specrand_ir_base           2440     2361     5008     9809       2440     2361     5008     9809       0.00%
total (text)                               33974268                              33975876                                  0.00%
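
The "% increase" column in the size tables appears to be the relative change
of the text column, (diff - base) / base. That is inferred from the numbers
rather than stated anywhere, so treat the sketch below as an assumption; the
function name is hypothetical.

```python
def pct_increase(base_text: int, diff_text: int) -> float:
    """Relative change of the text section size, in percent."""
    return (diff_text - base_text) / base_text * 100.0


# Spot checks against rows from the tables (text column only).
print(f"{pct_increase(205752, 204792):.2f}%")  # 444.namd in SPEC 2006
print(f"{pct_increase(677176, 677816):.2f}%")  # 508.namd_r in SPEC 2017
print(f"{pct_increase(30452, 30420):.2f}%")    # 462.libquantum in SPEC 2006
```

Rows whose change is smaller than 0.005% in magnitude show up as 0.00% in the
tables regardless of sign.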
