> MacBook Pro, 10.6.6, Core 2 Duo
>
>                      ProtoContext  ProtoTransform  ProtoLambda  Loop
> GCC 4.2.1 (Apple)  : 5.3565438     5.3721942       126.38458    1.3657978
> GCC 4.4.5          : 1.8878364     1.8845548       70.056237    0.942303
> GCC 4.5.2          : 1.8840608     1.889619        1.2806688    1.0589558
> GCC 4.6.0 (2/5/11) : 1.8854768     1.8834438       1.278347     1.2345208
> CLANG 2.9 (125472) : 5.455976      5.4627628       3.825104     1.2330524
>
> Now, removing the ((noinline)), gives (in the same order)
>
> GCC 4.2.1 (Apple)  : 4.1448478     5.3795842       126.53211    1.3215378
> GCC 4.4.5          : 1.2505956     1.2500816       69.409665    0.7198288
> GCC 4.5.2          : 0.596143      0.7213138       0.71969283   0.7211534
> GCC 4.6.0 (2/5/11) : 1.2942638     1.4324828       0.646147     0.6632324
> CLANG 2.9 (125472) : 1.2975226     1.2966478       1.3849834    1.2452362
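
To make the two setups concrete, here is a minimal sketch of the kind of loop being timed. It is not the benchmark code from this thread: the names BENCH_NOINLINE, eval_assign, eval_accumulate and run_steps are made up for illustration, and I am assuming GCC or Clang for the attribute spelling. With the attribute in place the kernel stays a real call on every step; without it the compiler is free to inline the kernel and, when the inputs never change, skip work it can prove is redundant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// GCC and Clang spell the attribute this way; on other compilers the macro
// expands to nothing (an assumption of this sketch).
#if defined(__GNUC__)
#  define BENCH_NOINLINE __attribute__((noinline))
#else
#  define BENCH_NOINLINE
#endif

// Plain assignment: every step writes the same values into x3, so an
// inlined version of this kernel is loop-invariant across steps.
BENCH_NOINLINE void eval_assign(std::vector<double>& x3,
                                const std::vector<double>& x1,
                                const std::vector<double>& x2)
{
    for (std::size_t i = 0; i < x3.size(); ++i)
        x3[i] = x1[i] + 2.0 * x2[i];
}

// Accumulating variant: each step reads the x3 left by the previous step,
// so the steps are no longer independent of each other.
BENCH_NOINLINE void eval_accumulate(std::vector<double>& x3,
                                    const std::vector<double>& x1,
                                    const std::vector<double>& x2)
{
    for (std::size_t i = 0; i < x3.size(); ++i)
        x3[i] += x1[i] + 2.0 * x2[i];
}

// Hypothetical driver: applies a kernel num_of_steps times to vectors of
// length n and returns the first element of the result.
template <class Kernel>
double run_steps(Kernel kernel, std::size_t n, int num_of_steps)
{
    assert(n > 0);
    std::vector<double> x1(n, 1.0), x2(n, 2.0), x3(n, 0.0);
    for (int step = 0; step < num_of_steps; ++step)
        kernel(x3, x1, x2);
    return x3[0];
}
```

Wrapping a call such as run_steps(eval_assign, n, num_of_steps) in a std::chrono::steady_clock measurement, once with the attribute and once with BENCH_NOINLINE defined empty, would give the two variants to compare. The returned value should also be used (printed, for example) so the whole computation is not discarded.
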
Interesting results. I ran a similar test on loops (for, while, with and without pointers) and saw similar results: the outcome depends heavily on the compiler. I think the ordering of the numbers above would change drastically for a small expression such as x3 = x1 + 2.0 * x2.

> I'm not sure how meaningful this second set of numbers is. If the evaluation
> functions are inlined, the compiler can realize that evaluating them
> num_of_steps times is unnecessary since the data isn't changing between
> iterations. It then (I believe) optimizes out certain parts of the loop in
> certain cases.

Maybe it would be better to evaluate something with the plus-assign operator, x3 += x1 + 2.0 * x2, so that each iteration depends on the result of the previous one.

_______________________________________________
proto mailing list
http://lists.boost.org/mailman/listinfo.cgi/proto
