Hi,

Continuing the investigation of the bootstrap failures, I found another problem (besides the unknown-alignment problem described above): size_needed and epilogue_size_needed get mixed up when we generate an epilogue loop that also uses SSE moves but is not unrolled. That is probably the reason for the failures we saw.
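
To illustrate the kind of bookkeeping involved, here is a minimal sketch. It is not the code from i386.c: the struct, the function split_copy and its parameters are hypothetical, and the meanings given to size_needed and epilogue_size_needed (bytes per main-loop iteration and bytes per epilogue-loop iteration) are assumptions made for the example only.

```c
#include <stddef.h>

/* Hypothetical model, NOT GCC code: how a block of `count` bytes could be
   split between an unrolled main loop, a non-unrolled epilogue loop that
   uses the same wide moves, and a scalar tail.  */
struct split
{
  size_t main_bytes;     /* handled by the unrolled main loop */
  size_t epilogue_bytes; /* handled by the non-unrolled epilogue loop */
  size_t tail_bytes;     /* left over for narrower moves */
};

static struct split
split_copy (size_t count, size_t move_size, size_t unroll)
{
  struct split s;
  /* Assumed meaning: bytes consumed by one main-loop iteration.  */
  size_t size_needed = move_size * unroll;
  /* Assumed meaning: bytes consumed by one epilogue-loop iteration.  */
  size_t epilogue_size_needed = move_size;
  size_t rest;

  s.main_bytes = count - count % size_needed;
  rest = count - s.main_bytes;
  s.epilogue_bytes = rest - rest % epilogue_size_needed;
  s.tail_bytes = rest - s.epilogue_bytes;
  return s;
}
```

With 16-byte moves unrolled 4 times, split_copy (100, 16, 4) gives 64 bytes to the main loop, 32 to the epilogue loop and 4 to the tail. If the two sizes were confused, the epilogue would be sized for the wrong chunk and some bytes could be skipped or handled twice.
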
Please check the attached patch, though full testing isn't finished yet. Bootstraps seem to be OK, as does the arrayarg.f90 test (with sse_loop enabled).

On 19 November 2011 05:38, Jan Hubicka <hubi...@ucw.cz> wrote:
>> Given that x86 memset/memcpy is still broken, I think we should revert
>> it for now.
>
> Well, looking into the code, the SSE alignment issue needs work - the
> alignment test merely tests whether some alignment is known, not whether
> 16-byte alignment is known; that is the cause of the failures in the 32-bit
> bootstrap.  I originally convinced myself that this is safe since we shoot
> for unaligned load/stores anyway.
>
> I've committed the following patch that disables SSE codegen and unbreaks
> the Atom bootstrap.  This seems more sensible to me given that the patch
> accumulated some good improvements on the non-SSE path as well, and we could
> return to the SSE alignment issues incrementally.  There is still a failure
> in the Fortran testcase that I am convinced is a previously latent issue.
>
> I will be offline tomorrow.  If there are further serious problems, just
> feel free to revert the changes and we could look into them for next stage1.
>
> Honza
>
>       * i386.c (atom_cost): Disable SSE loop until alignment issues are
>       fixed.
> Index: i386.c
> ===================================================================
> --- i386.c    (revision 181479)
> +++ i386.c    (working copy)
> @@ -1783,18 +1783,18 @@ struct processor_costs atom_cost = {
>    /* stringop_algs for memcpy.
>       SSE loops works best on Atom, but fall back into non-SSE unrolled loop variant
>       if that fails.  */
> -  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> -    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
> -   {{libcall, {{2048, sse_loop}, {2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
> -    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
> +  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
> +   {{libcall, {{2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
> +    {libcall, {{2048, unrolled_loop},
>                {-1, libcall}}}}},
>
>    /* stringop_algs for memset.  */
> -  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> -    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
> -   {{libcall, {{1024, sse_loop}, {1024, unrolled_loop},  /* Unknown alignment.  */
> +  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
> +    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
> +   {{libcall, {{1024, unrolled_loop},  /* Unknown alignment.  */
>                {-1, libcall}}},
> -    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
> +    {libcall, {{2048, unrolled_loop},
>                {-1, libcall}}}}},
>    1,                                    /* scalar_stmt_cost.  */
>    1,                                    /* scalar load_cost.  */

--
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.
memfunc_epilogue_loops.patch
Description: Binary data