Would it be possible to trigger lazy registration if the version is read as a zero? This would not introduce any additional atomic instructions on the fast path.
Yes, that is possible. The main problem is the transition from lazy to non-lazy mode when the first exception is thrown. We must somehow stop the world for that without introducing an additional mutex. But I have thought about that some more, and that is possible too: we can encode a magic value as the version during the transition, which causes the other threads to block until the transition is finished. A bit ugly, but manageable. I will implement that in a few days.
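To make the idea concrete, here is a rough sketch of how such a scheme could look. This is not the actual patch: the names (LazyTable, acquire_version, materialize, kInTransition) and the choice of UINT64_MAX as the magic value are all assumptions for illustration. A version of zero means "still lazy", the magic value means "transition in progress", and anything else is a normal version.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical names and values throughout; an illustration only.
constexpr std::uint64_t kLazy = 0;                   // lazy mode, nothing materialized yet
constexpr std::uint64_t kInTransition = UINT64_MAX;  // assumed magic value: others must wait
constexpr std::uint64_t kFirstActive = 1;            // first regular version

struct LazyTable {
    std::atomic<std::uint64_t> version{kLazy};
    int materialize_calls = 0;  // written only by the thread that wins the CAS

    // Stand-in for the expensive one-time work done when leaving lazy mode.
    void materialize() { ++materialize_calls; }

    // Returns a version that is neither kLazy nor kInTransition.
    std::uint64_t acquire_version() {
        for (;;) {
            std::uint64_t v = version.load(std::memory_order_acquire);
            if (v != kLazy && v != kInTransition)
                return v;  // fast path: a single load, no read-modify-write
            if (v == kLazy) {
                std::uint64_t expected = kLazy;
                // Only one thread wins this exchange and performs the transition.
                if (version.compare_exchange_strong(expected, kInTransition,
                                                    std::memory_order_acq_rel,
                                                    std::memory_order_acquire)) {
                    materialize();
                    // Publishing a regular version releases the waiting threads.
                    version.store(kFirstActive, std::memory_order_release);
                    return kFirstActive;
                }
            }
            // Either we lost the race or a transition is in progress: wait and retry.
            std::this_thread::yield();
        }
    }
};
```

Once the transition is done, every caller takes the first branch, so the only cost on the fast path is the load that is needed anyway. The compare-and-swap and the waiting loop are only reached while the version is zero or the magic value.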
Independent of that, I think we should improve the sort logic, as we still have to sort even in lazy mode, at the latest when the first exception is thrown. I have sent a patch that significantly improves that step.
Best, Thomas