On 30 March 2012 20:29, Kenneth Zadeck <zad...@naturalbridge.com> wrote:
> ramana
>
> i get the same failure on the trunk without my patch.
>
In which case I apologise and will file a bug report separately. I
should really have checked :( .

Ramana

>
> kenny
>
> On 03/30/2012 07:36 AM, Ramana Radhakrishnan wrote:
>>
>> Hi
>>
>>
>>> I have tested this on an x86_64 with both the force lowering on and off
>>> and neither cause any regressions as well as extensive testing on my
>>> port.
>>>
>> So, just out of curiosity, I decided to run this through a
>> cross-build and noticed the following ICE with eglibc. I haven't had
>> the time to debug this further but it does appear as though it could
>> do with some more testing on some more ports and this probably needs
>> some tuning as you say.
>>
>> $>
>> /work/cross-build/fsf/arm-none-linux-gnueabi/tools-lowersubregchanges-patched/bin/arm-none-linux-gnueabi-gcc
>> -c -O2 ./besttry.c -mfloat-abi=soft -march=armv5te
>> ./besttry.c: In function ‘_IO_new_file_write’:
>> ./besttry.c:36:1: internal compiler error: in get_loop_body, at
>> cfgloop.c:831
>>
>> $> cat besttry.c
>> __extension__ typedef int __ssize_t;
>> extern __thread int __libc_errno __attribute__ ((tls_model
>> ("initial-exec")));
>> struct _IO_FILE {
>> int _fileno;
>> int _flags2;
>> };
>> typedef struct _IO_FILE _IO_FILE;
>> _IO_new_file_write (f,
>> data,
>> n)
>> _IO_FILE *f;
>> {
>> __ssize_t to_do = n;
>> while (to_do> 0)
>> {
>> __ssize_t count =
>> (__builtin_expect (f->_flags2& 2, 0) ?
>>
>> ({ unsigned int _sys_result = ({ register int _a1 asm ("r0"), _nr asm
>> ("r7");
>> int _a3tmp = (int) ((to_do));
>> int _a2tmp = (int) ((data));
>> register int _a2 asm ("a2") = _a2tmp;
>> register int _a3 asm ("a3") = _a3tmp; _nr = ((0 + 4));
>> asm volatile ("swi 0x0 @ syscall " "SYS_ify(write)" : "=r"
>> (_a1) : "r" (_nr) , "r" (_a1), "r" (_a2), "r" (_a3) : "memory"); _a1;
>> });
>> if (__builtin_expect (((unsigned int) (_sys_result)>= 0xfffff001u),
>> 0))
>> { (__libc_errno = ((-(_sys_result))));
>> _sys_result = (unsigned int) -1; }
>> (int) _sys_result; })
>> : __write (f->_fileno, data, to_do));
>> if (count< 0)
>> {
>> break;
>> }
>> to_do -= count;
>> }
>> }
>>
>>
>> Ramana
>>
>> On 29 March 2012 22:10, Kenneth Zadeck <zad...@naturalbridge.com> wrote:
>>>
>>> This patch takes a different approach to fixing PR52543 than does the
>>> patch in
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00641.html
>>>
>>> This patch transforms the lower-subreg pass(es) from unconditionally
>>> splitting wide moves, zero extensions, and shifts, so that it now takes
>>> into account the target specific costs and only does the
>>> transformations if it is profitable.
>>>
>>> Unconditional splitting is a problem that not only occurs on the AVR
>>> but is also a problem on the ARM NEON and my private port. Furthermore,
>>> it is a problem that is likely to occur on most modern larger machines
>>> since these machines are more likely to have fast instructions for
>>> moving things that are larger than word mode.
>>>
>>> At compiler initialization time, each mode that is larger than a word
>>> mode is examined to determine if the cost of moving a value of that
>>> mode is less expensive than inserting the proper number of word-sized
>>> moves. If it is cheaper to split it up, a bit is set to allow moves of
>>> that mode to be lowered.
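
The initialization-time comparison described in the quoted paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: `struct mode_cost`, `profitable_to_split` and `compute_modes_to_split` are hypothetical names, the costs are plain integers rather than values obtained from the target's cost hooks, and treating a tie as "do not split" is my assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one multiword mode.  In the real pass the
   costs would come from the target's cost interface; here they are
   plain numbers supplied by the caller.  */
struct mode_cost
{
  const char *name;       /* mode name, for illustration only */
  unsigned size_words;    /* size of the mode in words, > 1 */
  int wide_move_cost;     /* cost of one move done in this mode */
};

/* Return true if SIZE_WORDS word-sized moves are strictly cheaper than
   a single move in the wide mode.  A tie keeps the wide move; that
   choice is an assumption of this sketch.  */
static bool
profitable_to_split (const struct mode_cost *m, int word_move_cost)
{
  assert (m->size_words > 1);
  return word_move_cost * (int) m->size_words < m->wide_move_cost;
}

/* Build a bitmask with bit I set when moves in MODES[I] may be lowered.
   N must not exceed the number of bits in unsigned int.  */
static unsigned
compute_modes_to_split (const struct mode_cost *modes, unsigned n,
                        int word_move_cost)
{
  unsigned mask = 0;
  for (unsigned i = 0; i < n; i++)
    if (profitable_to_split (&modes[i], word_move_cost))
      mask |= 1u << i;
  return mask;
}
```

With a word move cost of 2, a two-word mode whose wide move costs 8 would have its bit set (4 < 8), while a four-word mode whose wide move costs 4 would not (8 is not less than 4).
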
>>>
>>> A similar analysis is made for the zero extensions and shifts, except
>>> that lower-subreg had been (and still is) limited to only breaking up
>>> these operations if the target size was twice the size of word mode.
>>> Also, if the analysis determines that there are no profitable
>>> transformations, the pass exits quickly without doing any analysis.
>>>
>>> It is quite likely that most ports will have to be adjusted after this
>>> patch is accepted. For instance, the analysis discovers that there are
>>> no profitable transformations to be performed on the x86-64. Since
>>> this is not my platform, I have no idea if these are the correct
>>> settings. But the pass uses the standard insn_rtx_cost interface and
>>> it is the port maintainer's responsibility to not lie to the
>>> optimization passes, so this extra work in stage one should be
>>> acceptable.
>>>
>>> I do know from a private conversation with Richard Sandiford that MIPS
>>> patches are likely forthcoming.
>>>
>>> There is preprocessor-controlled code that prints out the cost
>>> analysis. Only a summary of this can go in the subregs dump file
>>> because the analysis is called from backend_init_target and so the
>>> dump file is not available. But it is very useful to define LOG_COSTS
>>> when adjusting your port.
>>>
>>> There is also preprocessor code that forces all of the lowering
>>> operations to be marked as profitable. This is useful in debugging the
>>> new logic.
>>>
>>> Both of these preprocessor symbols are documented at the top of the
>>> pass.
>>>
>>> I have tested this on an x86_64 with both the force lowering on and off
>>> and neither cause any regressions as well as extensive testing on my
>>> port.
>>>
>>> Ok to commit?
>>>
>>> Kenny
>>>
>>> 2012-03-29  Kenneth Zadeck  <zad...@naturalbridge.com>
>>>
>>> * toplev.c (backend_init_target): Call initializer for lower-subreg
>>> pass.
>>>
>>> * lower-subreg.c (move_modes_to_split, splitting_ashift,
>>> splitting_lshiftrt, splitting_zext, splitting_some_shifts,
>>> twice_word_mode, something_to_do, word_mode_move_cost,
>>> move_zero_cost): New static vars.
>>> (compute_move_cost, profitable_shift_p, init_lower_subreg): New
>>> functions.
>>> (find_pseudo_copy, resolve_simple_move): Added code to only split
>>> based on costs.
>>> (find_decomposable_subregs): Added code to mark as decomposable
>>> moves that are not profitable.
>>> (find_decomposable_shift_zext): Added code to only decompose
>>> shifts and zext if profitable.
>>> (resolve_shift_zext): Added comment.
>>> (decompose_multiword_subregs): Dump list of profitable
>>> transformations. Add code to skip non profitable transformations.
>>>
>>> * rtl.h (init_lower_subreg): Added declaration.
>>>
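
For readers adjusting a port, the two preprocessor switches mentioned in the patch summary might look something like the sketch below. It is a guess at the general shape, not the code in lower-subreg.c: LOG_COSTS is the symbol named in the message, but FORCE_ALL_LOWERING is a placeholder (the message does not name the forcing symbol), and `decide_lowering` with its integer costs is invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* FORCE_ALL_LOWERING is a hypothetical name.  When it is defined at
   build time, every transformation is treated as profitable, which
   exercises the lowering code regardless of the target's costs.  */
#ifdef FORCE_ALL_LOWERING
# define FORCE_P 1
#else
# define FORCE_P 0
#endif

/* Decide whether the operation described by WHAT should be lowered,
   given the cost of the split sequence and of the unsplit operation.
   When LOG_COSTS is defined, each decision is printed to stderr,
   since no dump file is open at initialization time.  */
static bool
decide_lowering (const char *what, int split_cost, int whole_cost)
{
  assert (what != NULL);
  bool profitable = FORCE_P || split_cost < whole_cost;
#ifdef LOG_COSTS
  fprintf (stderr, "%s: split cost %d, unsplit cost %d -> %s\n",
           what, split_cost, whole_cost, profitable ? "lower" : "keep");
#endif
  return profitable;
}
```

Building with `-DLOG_COSTS` would print one line per decision; building with `-DFORCE_ALL_LOWERING` would make every call return true.
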