On Feb 20, 2011, at 4:43 AM, Joel Falcou wrote:

> On 20/02/11 12:41, Eric Niebler wrote:
>> On 2/20/2011 6:40 PM, Joel Falcou wrote:
>>> On 20/02/11 12:31, Karsten Ahnert wrote:
>>>> It is amazing that the proto expression is faster then the naive one.
>>>> The compiler must really love the way proto evaluates an expression.
>>> I still dont really know why. Usual speed-up in our use cases here is
>>> like ranging from 10 to 50%.
>> That's weird.
>>
> Well, for me it's weird in the good way so I dont complain. Old version
> of nt2 had cases where
> we were thrice as fast as same vector+iterator based code ...
> _______________________________________________
> proto mailing list
> [email protected]
> http://lists.boost.org/mailman/listinfo.cgi/proto

To explore the issue further I modified the originally posted test code
(see http://pastebin.com/1Vr9BkPP). The modifications include a
transform-based evaluator, a lambda-expression-based example, and some
attributes to keep the evaluation functions from being inlined.

First, the numbers (averaged over 5 iterations of the main loop). All
compilation was done with -O3 against Boost 1.45.

MacBook Pro, 10.6.6, Core 2 Duo

                     ProtoContext  ProtoTransform  ProtoLambda  Loop
GCC 4.2.1 (Apple) :  5.3565438     5.3721942       126.38458    1.3657978
GCC 4.4.5         :  1.8878364     1.8845548       70.056237    0.942303
GCC 4.5.2         :  1.8840608     1.889619        1.2806688    1.0589558
GCC 4.6.0 (2/5/11):  1.8854768     1.8834438       1.278347     1.2345208
CLANG 2.9 (125472):  5.455976      5.4627628       3.825104     1.2330524

Now, removing the ((noinline)) attributes gives (in the same order):

                     ProtoContext  ProtoTransform  ProtoLambda  Loop
GCC 4.2.1 (Apple) :  4.1448478     5.3795842       126.53211    1.3215378
GCC 4.4.5         :  1.2505956     1.2500816       69.409665    0.7198288
GCC 4.5.2         :  0.596143      0.7213138       0.71969283   0.7211534
GCC 4.6.0 (2/5/11):  1.2942638     1.4324828       0.646147     0.6632324
CLANG 2.9 (125472):  1.2975226     1.2966478       1.3849834    1.2452362

I'm not sure how meaningful this second set of numbers is. If the
evaluation functions are inlined, the compiler can realize that
evaluating them num_of_steps times is unnecessary, since the data isn't
changing between iterations. It then (I believe) optimizes out certain
parts of the loop in certain cases.

A lot of the additional code came from Eric's cpp-next articles.

Nate
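
For readers without access to the pastebin, here is a minimal sketch of the inlining concern described above. It is not the posted test code: the function names (eval_noinline, eval_inline, time_noinline, time_inline) and the expression a + 2*b are hypothetical stand-ins, and it assumes a GCC or Clang compiler that accepts __attribute__((noinline)) and C++11. Both timing loops re-evaluate the same expression on unchanged inputs; only the first one forces a real call on every step.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

typedef std::vector<double> vec;

// Hypothetical stand-in for one evaluation function from the test.
// noinline asks GCC/Clang to keep this as a real call in the caller's loop.
__attribute__((noinline))
void eval_noinline(vec& out, const vec& a, const vec& b)
{
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = a[i] + 2.0 * b[i];
}

// Same body, but the optimizer is free to inline it into the step loop.
inline void eval_inline(vec& out, const vec& a, const vec& b)
{
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = a[i] + 2.0 * b[i];
}

// Evaluates num_of_steps times on unchanged inputs; returns elapsed seconds.
double time_noinline(std::size_t num_of_steps, vec& out, const vec& a, const vec& b)
{
    auto start = std::chrono::steady_clock::now();
    for (std::size_t s = 0; s < num_of_steps; ++s)
        eval_noinline(out, a, b);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Same loop with the inlinable version. Because a and b never change
// between steps, the optimizer may be able to drop redundant work here.
double time_inline(std::size_t num_of_steps, vec& out, const vec& a, const vec& b)
{
    auto start = std::chrono::steady_clock::now();
    for (std::size_t s = 0; s < num_of_steps; ++s)
        eval_inline(out, a, b);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling both timing functions with the same vectors and step count, compiled with -O3, gives two numbers to compare. Whether the inlined loop is actually shortened depends on the compiler and version, so any gap between the two is a hint about what the optimizer did, not a measure of the expression's cost.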
