https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111588
--- Comment #5 from Mathias Stearn <redbeard0531 at gmail dot com> ---
Mea culpa. The difference between boost and std was due to the code that
fast-paths shared_ptrs that aren't actually shared:
https://github.com/gcc-mirror/gcc/blob/be34a8b538c0f04b11a428bd1a9340eb19dec13f/libstdc%2B%2B-v3/include/bits/shared_ptr_base.h#L324-L362

I still think that optimization is a good idea, even if it looks bad in this
specific microbenchmark. When it is disabled, boost and std have the same
perf, even with the single-threaded check in place.

That said, I'd still love an opt-out. For now, I'll just propose that we add a
do-nothing background thread in our benchmark main() to avoid misleading
results.
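
A minimal sketch of the do-nothing background thread idea, assuming the goal is for the runtime to see the process as multi-threaded for the whole benchmark run. The names IdleBackgroundThread and run_with_idle_thread are hypothetical and not taken from any benchmark harness. Whether a thread that has already exited still counts is implementation-dependent, so the sketch keeps the thread parked until the benchmarks finish.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper: owns one idle thread for as long as the object lives.
// The thread does no work; it only blocks until the destructor releases it.
class IdleBackgroundThread {
public:
    IdleBackgroundThread()
        : worker_([f = stop_.get_future()] { f.wait(); }) {
        assert(worker_.joinable());
    }

    ~IdleBackgroundThread() {
        stop_.set_value();  // wake the parked thread
        worker_.join();     // and wait for it to exit
    }

    IdleBackgroundThread(const IdleBackgroundThread&) = delete;
    IdleBackgroundThread& operator=(const IdleBackgroundThread&) = delete;

private:
    std::promise<void> stop_;  // declared first so it exists before worker_ starts
    std::thread worker_;
};

// Hypothetical wrapper: runs `body` while the idle thread is alive and
// returns whatever `body` returns.
template <typename Fn>
auto run_with_idle_thread(Fn&& body) -> decltype(body()) {
    IdleBackgroundThread keep_alive;
    return std::forward<Fn>(body)();
}
```

A benchmark main() could then wrap its existing body, for example `return run_with_idle_thread([&] { return run_all_benchmarks(argc, argv); });`, where run_all_benchmarks stands in for whatever the harness already calls. Older toolchains may need -pthread when compiling and linking.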