>> Result from http://www.suse.de/~gcctest/c++bench/polyhedron/
>> -ffast-math -funroll-loops -O3 -ftree-vectorize -march= ??? (opteron I
>> think).
>> 14.59s -> 21.06s (44% slower)
>
I will look into it right now, but at first glance it does not look like
this benchmark is built with the cost model enabled.

Thanks,

>
>> Result on for my AMD Athlon64 4800+,
>> http://physik.fu-berlin.de/~tburnus/gcc-trunk/benchmark/
>> Yesterday: 2007-09-10-r128322
>> Today: 2007-09-11-r128363
>
>The problem was introduced by r128353 [1]:
>
>2007-09-10  Harsha Jagasia  <[EMAIL PROTECTED]>
>            Jan Sjodin  <[EMAIL PROTECTED]>
>
>        * tree-vect-analyze.c (vect_analyze_operations): Change
>        comparison of loop iterations with threshold to less than
>        or equal to instead of less than.  Reduce
>        min_scalar_loop_bound by one.
>        * tree-vect-transform.c (vect_estimate_min_profitable_iters):
>        Change prologue and epilogue iterations estimate to vf/2,
>        when unknown at compile-time.  Change versioning guard
>        cost to taken_branch_cost.  If peeling for alignment is
>        unknown at compile-time, change peel guard costs to one
>        taken branch and one not-taken branch per peeled loop.
>        If peeling for alignment is known but number of scalar loop
>        iterations is unknown at compile-time, change peel guard
>        costs to one taken branch per peeled loop.  Change the cost
>        model equation to consider vector iterations as the loop
>        iterations less the prologue and epilogue iterations.
>        Change outside vector cost check to less than or equal to
>        zero instead of equal to zero.
>        (vect_do_peeling_for_loop_bound): Reduce
>        min_scalar_loop_bound by one.
>        * tree-vectorizer.h: Add TARG_COND_TAKEN_BRANCH_COST and
>        TARG_COND_NOT_TAKEN_BRANCH_COST.
>        ...
>
>[1]: http://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00347.html
>
>Uros.
>
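
For reference, here is a rough sketch of the kind of profitability check the ChangeLog entry above describes. It is not the code from tree-vect-transform.c. The struct, the function names and the way all guard and peeling overhead is lumped into a single "outside" cost are assumptions made for illustration only. It follows three points from the entry: prologue and epilogue iterations are estimated as vf/2 when unknown, vector iterations are the loop iterations less the prologue and epilogue iterations, and the runtime comparison against the threshold is "less than or equal to".

```c
#include <stdbool.h>

/* Hypothetical cost inputs; the field names are illustrative, not GCC's. */
struct vect_costs {
    int scalar_iter_cost;     /* cost of one scalar loop iteration */
    int vector_iter_cost;     /* cost of one vector loop iteration */
    int vector_outside_cost;  /* assumed lump sum: guards, peeling overhead */
    int vf;                   /* vectorization factor */
};

/* Estimate for prologue or epilogue iterations when the real count is
   unknown at compile time: vf/2, as the ChangeLog entry states. */
static int estimate_peel_iters(int vf)
{
    return vf / 2;
}

/* Return true if the vector version is estimated to be cheaper than the
   scalar loop for NITERS iterations. */
static bool vector_profitable(const struct vect_costs *c, int niters)
{
    int prologue = estimate_peel_iters(c->vf);
    int epilogue = estimate_peel_iters(c->vf);

    /* Iterations left for the vector loop: the loop iterations less the
       prologue and epilogue iterations. */
    int remaining = niters - prologue - epilogue;
    if (remaining < 0)
        remaining = 0;

    int scalar_cost = c->scalar_iter_cost * niters;
    int vector_cost = c->vector_iter_cost * (remaining / c->vf)
                      + c->vector_outside_cost;
    return scalar_cost > vector_cost;
}

/* Smallest iteration count up to LIMIT for which the vector version is
   estimated profitable, or -1 if there is none. */
static int min_profitable_iters(const struct vect_costs *c, int limit)
{
    for (int n = 1; n <= limit; n++)
        if (vector_profitable(c, n))
            return n;
    return -1;
}

/* Runtime guard with the "less than or equal to" comparison: fall back to
   the scalar loop when NITERS does not exceed THRESHOLD. */
static bool use_scalar_loop(int niters, int threshold)
{
    return niters <= threshold;
}
```

My reading of "Reduce min_scalar_loop_bound by one" is that the threshold passed to the guard would be one less than the minimum profitable iteration count, so that the pair of changes keeps the same cut-over point. With made-up costs such as { 4, 6, 20, 4 }, min_profitable_iters() gives 6, and a threshold of 5 sends loops of 5 or fewer iterations to the scalar version.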