On Tue, 22 Jun 2021 at 14:21, Matthias Kretz wrote: > On Tuesday, 22 June 2021 14:51:26 CEST Jonathan Wakely wrote: > > With your suggestion > > to also drop std::tuple the number of parameters decides which > > function we call. And we don't instantiate std::tuple. And we can also > > get rid of the __try_to_lock function, which was only used to deduce > > the lock type rather than use tuple_element to get it. That's much > > nicer. > > 👍 > > > > How about optimizing a likely common case where all lockables have the > > > same > > > type? In that case we don't require recursion and can manage stack > usage > > > much > > > simpler: > > The stack usage is bounded by the number of mutexes being locked, > > which is unlikely to get large, but we can do that. > > Right. I meant simpler, because it takes a while of staring at the > recursive > implementation to understand how it works. :) > > > We can do it for try_lock too: > > > > template<typename _L1, typename _L2, typename... _L3> > > int > > try_lock(_L1& __l1, _L2& __l2, _L3&... __l3) > > { > > #if __cplusplus >= 201703L > > if constexpr (is_same_v<_L1, _L2> > > && (is_same_v<_L1, _L3> && ...)) > > { > > constexpr int _Np = 2 + sizeof...(_L3); > > unique_lock<_L1> __locks[_Np] = { > > {__l1, try_to_lock}, {__l2, try_to_lock}, {__l3, > > try_to_lock}... }; > > This does a try_lock on all lockabes even if any of them fails. I think > that's > not only more expensive but also non-conforming. I think you need to defer > locking and then loop from beginning to end to break the loop on the first > unsuccessful try_lock. >
Oops, good point. I'll add a test for that too. Here's the fixed code: template<typename _L0, typename... _Lockables> inline int __try_lock_impl(_L0& __l0, _Lockables&... __lockables) { #if __cplusplus >= 201703L if constexpr ((is_same_v<_L0, _Lockables> && ...)) { constexpr int _Np = 1 + sizeof...(_Lockables); unique_lock<_L0> __locks[_Np] = { {__l0, defer_lock}, {__lockables, defer_lock}... }; for (int __i = 0; __i < _Np; ++__i) if (!__locks[__i].try_lock()) { const int __failed = __i; while (__i--) __locks[__i].unlock(); return __i; } for (auto& __l : __locks) __l.release(); return -1; } else #endif > > for (int __i = 0; __i < _Np; ++__i) > > if (!__locks[__i]) > > return __i; > > for (auto& __l : __locks) > > __l.release(); > > return -1; > > } > > else > > #endif > > return __detail::__try_lock_impl(__l1, __l2, __l3...); > > } > > > > > if constexpr ((is_same_v<_L0, _L1> && ...)) > > > { > > > constexpr int _Np = 1 + sizeof...(_L1); > > > std::array<unique_lock<_L0>, _Np> __locks = { > > > {__l0, defer_lock}, {__l1, defer_lock}... > > > }; > > > int __first = 0; > > > do { > > > __locks[__first].lock(); > > > for (int __j = 1; __j < _Np; ++__j) > > > { > > > const int __idx = (__first + __j) % _Np; > > > if (!__locks[__idx].try_lock()) > > > { > > > for (int __k = __idx; __k != __first; > > > __k = __k == 1 ? _Np : __k - 1) > > > __locks[__k - 1].unlock(); > > > > This loop doesn't work if any try_lock fails when first==0, because > > the loop termination condition is never reached. > > Uh yes. Which is the same reason why the __j loop doesn't start from > __first + > 1. > > > I find this a bit easier to understand than the loop above, and > > correct (I think): > > > > for (int __k = __j; __k != 0; --__k) > > __locks[(__first + __k - 1) % _Np].unlock(); > > Yes, if only we had a wrapping integer type that wraps at an arbitrary N. > Like > unsigned int but with parameter, like: > > for (__wrapping_uint<_Np> __k = __idx; __k != __first; --__k) > __locks[__k - 1].unlock(); > > This is the loop I wanted to write, except --__k is simpler to write and > __k - > 1 would also wrap around to _Np - 1 for __k == 0. But if this is the only > place it's not important enough to abstract. > We might be able to use __wrapping_uint in std::seed_seq::generate too, and maybe some other places in <random>. But we can add that later if we decide it's worth it. > > [...] > > @@ -620,15 +632,45 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > * @post All arguments are locked. > > * > > * All arguments are locked via a sequence of calls to lock(), > > try_lock() - * and unlock(). If the call exits via an exception any > > locks that were - * obtained will be released. > > + * and unlock(). If this function exits via an exception any locks > that > > + * were obtained will be released. > > */ > > template<typename _L1, typename _L2, typename... _L3> > > void > > lock(_L1& __l1, _L2& __l2, _L3&... __l3) > > { > > - int __i = 0; > > - __detail::__lock_impl(__i, 0, __l1, __l2, __l3...); > > +#if __cplusplus >= 201703L > > I also considered moving it down here. Makes sense unless you want to call > __detail::__lock_impl from other functions. And if we want to make it work > for > pre-C++11 we could do > > using __homogeneous > = __and_<is_same<_L1, _L2>, is_same<_L1, _L3>...>; > int __i = 0; > __detail::__lock_impl(__homogeneous(), __i, 0, __l1, __l2, __l3...); > > We don't need tag dispatching, we could just do: if _GLIBCXX17_CONSTEXPR (homogeneous::value) ... else ... because both branches are valid for the homogeneous case, i.e. we aren't using if-constexpr to avoid invalid instantiations. So for pre-C++17 it would still compile both branches (and instantiate the recursive function templates) but would use the iterative code at run time. But given that the default -std option is gnu++17 now, I'm OK with the iterative version only being used for C++17.