Hi,

A while ago, I recommended setting up domains so that Proto expressions hold their children by value, except for terminals, which should be held either by reference or by value depending on their lvalue-ness. This avoids dangling-reference problems when storing expressions or using 'auto'.
I also said there was no overhead to doing this in the case of Boost.SIMD.
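
To make that storage policy concrete, here is a minimal sketch in plain C++14 that does not use Proto itself. The names terminal, plus_node, lit, make_plus and stored_expression_demo are hypothetical and only illustrate the idea: interior nodes are copied by value, lvalue terminals are captured by reference, and rvalue terminals are moved into the tree.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical terminal wrapper: T is `X&` for lvalues and `X` for rvalues.
template <class T>
struct terminal {
    T value;
};

// Interior node: both children are stored by value, so copying the node
// copies the whole tree (lvalue terminals still refer to the original object).
template <class L, class R>
struct plus_node {
    L lhs;
    R rhs;
};

// Wrap an operand: lvalues are captured by reference, rvalues are moved in.
template <class T>
terminal<T> lit(T&& t) {
    return terminal<T>{std::forward<T>(t)};
}

// Build an interior node that owns copies of its children.
template <class L, class R>
plus_node<L, R> make_plus(L l, R r) {
    return plus_node<L, R>{std::move(l), std::move(r)};
}

// Evaluate a terminal (returns a copy of the referenced or stored value).
template <class T>
auto eval(const terminal<T>& t) {
    return t.value;
}

// Evaluate an interior node recursively.
template <class L, class R>
auto eval(const plus_node<L, R>& n) {
    return eval(n.lhs) + eval(n.rhs);
}

// Storing the expression with 'auto' is safe here: the inner plus_node and
// the literal 3 are owned by the tree, and `a` outlives the expression.
int stored_expression_demo() {
    int a = 2;
    auto e = make_plus(make_plus(lit(a), lit(3)), lit(a));
    a = 10;          // lvalue terminals see the updated value
    return eval(e);  // 10 + 3 + 10
}
```

The expression stays valid after the statement that built it because no node refers to a temporary.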

After more analysis with more complex code, it appears that there is indeed an overhead to doing this: it confuses the compiler's alias analysis, which then fails to perform some optimizations that it would normally perform.

For example, an expression like this:
r = a*b + a*b;

will no longer be optimized to
tmp = a*b;
r = tmp + tmp;

If terminals are held by reference, the compiler can also emit extra loads, which it doesn't do if the terminal is held by value or if all children are held by reference.
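
For anyone who wants to look at the generated code, a reduced test case could be shaped roughly like the sketch below. It is plain C++ rather than Proto, and mul_node, add_node, term_ref, term_val, sum_of_products and hand_optimized are made-up names. I am not claiming this reduced form reproduces the missed optimization; it only shows the storage configurations being compared.

```cpp
#include <cassert>

// Hypothetical node types (not Proto's own); children are stored by value.
template <class L, class R> struct mul_node { L l; R r; };
template <class L, class R> struct add_node { L l; R r; };

struct term_ref { const float& v; };  // terminal held by reference
struct term_val { float v; };         // terminal held by value

inline float eval(term_ref t) { return t.v; }
inline float eval(term_val t) { return t.v; }

template <class L, class R>
float eval(const mul_node<L, R>& e) { return eval(e.l) * eval(e.r); }

template <class L, class R>
float eval(const add_node<L, R>& e) { return eval(e.l) + eval(e.r); }

// Builds a*b + a*b with terminals of type T, then evaluates the tree.
template <class T>
float sum_of_products(const float& a, const float& b) {
    using M = mul_node<T, T>;
    add_node<M, M> e{M{T{a}, T{b}}, M{T{a}, T{b}}};
    return eval(e);
}

// The form one would hope the optimizer reduces the tree evaluation to.
inline float hand_optimized(float a, float b) {
    float tmp = a * b;
    return tmp + tmp;
}
```

Compiling sum_of_products<term_ref>, sum_of_products<term_val> and hand_optimized with optimizations on and comparing the assembly is one way to check for the redundant multiply or the extra loads.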

It is a bit surprising that this affects compiler optimizations in this way, but it is reproducible on both Clang and GCC, with all versions I have access to.

Therefore, to avoid performance issues, I'm considering moving to always using references (with the default domain behaviour), and relying on BOOST_FORCEINLINE to make it work as expected. Of course, this has the caveat that if forced inlining is disabled (or doesn't work), you'll get segmentation faults.
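
As a rough illustration of the all-references approach, here is a sketch in plain C++. SKETCH_FORCEINLINE is a stand-in I made up for BOOST_FORCEINLINE (real code would take the macro from <boost/config.hpp>), and scalar, mul_ref, add_ref and safe_use are hypothetical names. It shows only the case that is safe by the language rules, where the whole tree is built and evaluated within one full expression.

```cpp
#include <cassert>

// Stand-in for BOOST_FORCEINLINE, modelled on common compiler extensions.
#if defined(_MSC_VER)
#  define SKETCH_FORCEINLINE __forceinline
#elif defined(__GNUC__)
#  define SKETCH_FORCEINLINE inline __attribute__((__always_inline__))
#else
#  define SKETCH_FORCEINLINE inline
#endif

struct scalar { float v; };  // hypothetical terminal type

// Nodes hold references to their children, including temporary nodes.
template <class L, class R> struct mul_ref { const L& l; const R& r; };
template <class L, class R> struct add_ref { const L& l; const R& r; };

SKETCH_FORCEINLINE float eval(const scalar& s) { return s.v; }

template <class L, class R>
SKETCH_FORCEINLINE float eval(const mul_ref<L, R>& e) {
    return eval(e.l) * eval(e.r);
}

template <class L, class R>
SKETCH_FORCEINLINE float eval(const add_ref<L, R>& e) {
    return eval(e.l) + eval(e.r);
}

SKETCH_FORCEINLINE mul_ref<scalar, scalar>
operator*(const scalar& a, const scalar& b) {
    return {a, b};
}

template <class L1, class R1, class L2, class R2>
SKETCH_FORCEINLINE add_ref<mul_ref<L1, R1>, mul_ref<L2, R2>>
operator+(const mul_ref<L1, R1>& a, const mul_ref<L2, R2>& b) {
    return {a, b};
}

// Safe: every temporary node lives until the end of the full expression.
float safe_use(const scalar& a, const scalar& b) {
    return eval(a * b + a * b);
}

// Hazard (left as a comment on purpose): the mul_ref temporaries are
// destroyed at the end of the first statement, so e refers to dead objects.
//   auto e = a * b + a * b;
//   float r = eval(e);  // undefined behaviour
```

The commented-out pattern at the end is the one that by-value children protect against, so it has to be avoided once everything is held by reference.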
_______________________________________________
proto mailing list
proto@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/proto
