On 26/10/2023 12:43, Jonathan Wakely wrote:
On 26/10/23 07:18 +0200, François Dumont wrote:
libstdc++: [_Hashtable] Use RAII type to manage rehash functor state
Replace usage of __try/__catch with a RAII type to restore rehash
functor
state when needed.
Generally I really like replacing try-catch with RAII but I have some
questions below.
libstdc++-v3/ChangeLog:
* include/bits/hashtable_policy.h (_RehashStateGuard): New.
(_Insert_base<>::_M_insert_range(_IIt, _IIt, const
_NodeGet&, false_type)):
Adapt.
* include/bits/hashtable.h (__rehash_guard_t): New.
(__rehash_state): Remove.
(_M_rehash): Remove.
(_M_rehash_aux): Rename into _M_rehash.
(_M_assign_elements, _M_insert_unique_node,
_M_insert_multi_node): Adapt.
(rehash): Adapt.
Tested under Linux x64.
Ok to commit ?
François
diff --git a/libstdc++-v3/include/bits/hashtable.h
b/libstdc++-v3/include/bits/hashtable.h
index 0857448f7ed..64071ac1fb2 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -234,6 +234,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_RehashPolicy, _Traits>;
using __enable_default_ctor
= _Hashtable_enable_default_ctor<_Equal, _Hash, _Alloc>;
+ using __rehash_guard_t
+ = __detail::_RehashStateGuard<_RehashPolicy>;
public:
typedef _Key key_type;
@@ -264,7 +266,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
private:
using __rehash_type = _RehashPolicy;
- using __rehash_state = typename __rehash_type::_State;
using __unique_keys = typename __traits_type::__unique_keys;
@@ -1200,14 +1201,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
private:
// Helper rehash method used when keys are unique.
- void _M_rehash_aux(size_type __bkt_count, true_type __uks);
+ void _M_rehash(size_type __bkt_count, true_type __uks);
// Helper rehash method used when keys can be non-unique.
- void _M_rehash_aux(size_type __bkt_count, false_type __uks);
-
- // Unconditionally change size of bucket array to n, restore
- // hash policy state to __state on exception.
- void _M_rehash(size_type __bkt_count, const __rehash_state&
__state);
+ void _M_rehash(size_type __bkt_count, false_type __uks);
};
// Definitions of class template _Hashtable's out-of-line member
functions.
@@ -1337,7 +1334,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
__buckets_ptr __former_buckets = nullptr;
std::size_t __former_bucket_count = _M_bucket_count;
- const __rehash_state& __former_state = _M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
if (_M_bucket_count != __ht._M_bucket_count)
{
@@ -1359,6 +1356,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_assign(std::forward<_Ht>(__ht), __roan);
if (__former_buckets)
_M_deallocate_buckets(__former_buckets,
__former_bucket_count);
+ __rehash_guard._M_reset = false;
I find this confusing. Usually "reset" means that something is
cleared, so won't take an action in the destructor. e.g. if you use
std::unique_ptr::reset() then the object is destroyed immediately, and
then nothing happens in the destructor. Here it's the opposite,
_M_reset=true means that it _sould_ do something in the destructor.
The problem is the ambiguity between "reset the state in the
destructor later" and "reset the object to an empty state now".
If the member was called _M_guarded then it might be clearer.
_M_guarded=true means the guard is active, and will restore the state
later. Or _M_active, or _M_armed, or even _M_reset_in_dtor. Any of
those names avoids the confusion with the semantics of
std::unique_ptr::reset() and similar well-known APIs.
Or what I usually do is store a pointer to the guarded object in the
RAII guard type, and then just null the pointer to disarm the guard.
That means you don't need a separate bool member variable. If
_RehashStateGuard::_M_rehash_policy was called _M_guarded_obj and was
a _RehashPolicy* instead of _RehashPolicy& then disarming it would be:
__rehash_guard._M_guarded_obj = nullptr;
This seems clear to me, as it says that the guard no longer has
anything to guard, so won't do anything in the destructor.
Looks clearer to me too.
I started with a _RehashStateScope and a _M_commit() method, it could
have been worst :-)
}
__catch(...)
{
@@ -1366,7 +1364,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
// Restore previous buckets.
_M_deallocate_buckets();
- _M_rehash_policy._M_reset(__former_state);
_M_buckets = __former_buckets;
_M_bucket_count = __former_bucket_count;
}
@@ -2142,17 +2139,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_ptr __node, size_type __n_elt)
-> iterator
{
- const __rehash_state& __saved_state =
_M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
std::pair<bool, std::size_t> __do_rehash
= _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count,
__n_elt);
if (__do_rehash.first)
{
- _M_rehash(__do_rehash.second, __saved_state);
+ _M_rehash(__do_rehash.second, true_type{});
__bkt = _M_bucket_index(__code);
}
+ __rehash_guard._M_reset = false;
this->_M_store_code(*__node, __code);
This changes behaviour. Previously the try-catch was inside the call to
_M_rehash. Now we're starting to guard before calling _M_need_rehash,
We were capturing _M_rehash_policy state before the call to
_M_need_rehash so it's not different.
We really need to instantiate the guard before we call _M_need_rehash
cause it is the method changing the _M_rehash_policy state.
and we run the destructor unconditionally even if we never call
_M_rehash.
Here we only want to reset rehash state on exception.
Note that with the 2 rehash policy implementations we provide it doesn't
matter if we reset or not when we do not call _M_rehash because their
state changes only when rehash is needed. But some user's fancy rehash
policy might update their state even if no rehash is requested and in
this case it is better to reset this state only on exception.
I think that's OK, because _M_need_rehash is noexcept (but
it's confusing because we have a const overload of _M_need_rehash
which is noexcept(false) and a non-const overload which is
noexcept(true) ... should they both be noexcept?)
That's another subject but yes, _M_need_rehash should be non-const and
noexcept on both rehash policy implementations. I considered doing such
a thing but won't it impact abi as _Prime_rehash_policy._M_need_rehash
symbols is coming from the lib ?
Can we do this instead:
- const __rehash_state& __saved_state = _M_rehash_policy._M_state();
std::pair<bool, std::size_t> __do_rehash
= _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count,
__n_elt);
if (__do_rehash.first)
{
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
Clearly no, we need to capture state before _M_need_rehash. Like it was
done with the __saved_state above.
- _M_rehash(__do_rehash.second, __saved_state);
+ _M_rehash(__do_rehash.second, true_type{});
+ __rehash_guard._M_reset = false;
__bkt = _M_bucket_index(__code);
}
this->_M_store_code(*__node, __code);
This only creates a guard object if we're actually going to rehash, so
we don't run its destructor (and check if we need t restore anything)
when no rehash is needed.
N.B. _M_bucket_index is also confusing. There are two overloads of it,
one is noexcept and one isn't. The one that we call here is
noexcept(false), so that call could throw ... is it OK that we don't
restore the old state if that happens? Should that be guarded too? (I
have no idea).
Also, I see that the noexcept(true) overload of _M_bucket_index calls
__hash_code_base::_M_bucket_index, which is noexcept(/* condition */),
so the caller says it cannot throw, but it calls something that
sometimes throws. Is that correct?
It can be noexcept because usage of hash code cache depends on either
hash functor is noexcept or not.
// Always insert at the beginning of the bucket.
@@ -2172,13 +2170,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__hash_code __code, __node_ptr __node)
-> iterator
{
- const __rehash_state& __saved_state =
_M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
std::pair<bool, std::size_t> __do_rehash
= _M_rehash_policy._M_need_rehash(_M_bucket_count,
_M_element_count, 1);
if (__do_rehash.first)
- _M_rehash(__do_rehash.second, __saved_state);
+ _M_rehash(__do_rehash.second, false_type{});
+ __rehash_guard._M_reset = false;
Again, we're creating the guard object in a larger scope than needed.
this->_M_store_code(*__node, __code);
const key_type& __k = _ExtractKey{}(__node->_M_v());
size_type __bkt = _M_bucket_index(__code);
@@ -2509,39 +2508,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
rehash(size_type __bkt_count)
{
- const __rehash_state& __saved_state =
_M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
__bkt_count
= std::max(_M_rehash_policy._M_bkt_for_elements(_M_element_count
+ 1),
__bkt_count);
__bkt_count = _M_rehash_policy._M_next_bkt(__bkt_count);
if (__bkt_count != _M_bucket_count)
- _M_rehash(__bkt_count, __saved_state);
- else
- // No rehash, restore previous state to keep it consistent with
- // container state.
- _M_rehash_policy._M_reset(__saved_state);
- }
-
- template<typename _Key, typename _Value, typename _Alloc,
- typename _ExtractKey, typename _Equal,
- typename _Hash, typename _RangeHash, typename _Unused,
- typename _RehashPolicy, typename _Traits>
- void
- _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
- _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_rehash(size_type __bkt_count, const __rehash_state& __state)
- {
- __try
- {
- _M_rehash_aux(__bkt_count, __unique_keys{});
- }
- __catch(...)
{
- // A failure here means that buckets allocation failed. We only
- // have to restore hash policy previous state.
- _M_rehash_policy._M_reset(__state);
- __throw_exception_again;
+ _M_rehash(__bkt_count, __unique_keys{});
+ __rehash_guard._M_reset = false;
And a larger scope again.
}
}
@@ -2553,7 +2529,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_rehash_aux(size_type __bkt_count, true_type /* __uks */)
+ _M_rehash(size_type __bkt_count, true_type /* __uks */)
{
__buckets_ptr __new_buckets = _M_allocate_buckets(__bkt_count);
__node_ptr __p = _M_begin();
@@ -2596,7 +2572,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_rehash_aux(size_type __bkt_count, false_type /* __uks */)
+ _M_rehash(size_type __bkt_count, false_type /* __uks */)
{
__buckets_ptr __new_buckets = _M_allocate_buckets(__bkt_count);
__node_ptr __p = _M_begin();
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h
b/libstdc++-v3/include/bits/hashtable_policy.h
index 5d162463dc3..8b9626b1575 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -715,6 +715,26 @@ namespace __detail
std::size_t _M_next_resize;
};
+ template<typename _RehashPolicy>
+ struct _RehashStateGuard
+ {
+ _RehashPolicy& _M_rehash_policy;
+ typename _RehashPolicy::_State _M_prev_state;
+ bool _M_reset = true;
This could be:
+ _RehashPolicy* _M_guarded_obj;
+ typename _RehashPolicy::_State _M_prev_state;
+
+ _RehashStateGuard(_RehashPolicy& __policy)
+ : _M_rehash_policy(__policy)
+ , _M_prev_state(__policy._M_state())
+ _RehashStateGuard(_RehashPolicy& __policy)
+ : _M_guarded_obj(std::__addressof(__policy))
+ , _M_prev_state(__policy._M_state())
+ { }
+ _RehashStateGuard(const _RehashStateGuard&) = delete;
+
+ ~_RehashStateGuard()
+ {
+ if (_M_reset)
+ _M_rehash_policy._M_reset(_M_prev_state);
+ if (_M_guarded_obj)
+ _M_guarded_obj->_M_reset(_M_prev_state);
+ }
+ };
+
// Base classes for std::_Hashtable. We define these base classes
// because in some cases we want to do different things depending on
// the value of a policy class. In some cases the policy class
@@ -1007,7 +1027,7 @@ namespace __detail
const _NodeGetter& __node_gen, false_type __uks)
{
using __rehash_type = typename __hashtable::__rehash_type;
- using __rehash_state = typename __hashtable::__rehash_state;
+ using __rehash_guard_t = typename __hashtable::__rehash_guard_t;
using pair_type = std::pair<bool, std::size_t>;
size_type __n_elt = __detail::__distance_fw(__first, __last);
@@ -1016,14 +1036,15 @@ namespace __detail
__hashtable& __h = _M_conjure_hashtable();
__rehash_type& __rehash = __h._M_rehash_policy;
- const __rehash_state& __saved_state = __rehash._M_state();
+ __rehash_guard_t __rehash_guard(__rehash);
pair_type __do_rehash = __rehash._M_need_rehash(__h._M_bucket_count,
__h._M_element_count,
__n_elt);
if (__do_rehash.first)
- __h._M_rehash(__do_rehash.second, __saved_state);
+ __h._M_rehash(__do_rehash.second, __uks);
+ __rehash_guard._M_reset = false;
A larger scope again.
Note that there is only one place where we reset rehash policy state
even if no exception took place. It is in the std rehash method when the
user requests a rehash that has no effect.
Here is the updated patch, ok to commit ?
diff --git a/libstdc++-v3/include/bits/hashtable.h
b/libstdc++-v3/include/bits/hashtable.h
index 0857448f7ed..89132430f3e 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -234,6 +234,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_RehashPolicy, _Traits>;
using __enable_default_ctor
= _Hashtable_enable_default_ctor<_Equal, _Hash, _Alloc>;
+ using __rehash_guard_t
+ = __detail::_RehashStateGuard<_RehashPolicy>;
public:
typedef _Key key_type;
@@ -264,7 +266,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
private:
using __rehash_type = _RehashPolicy;
- using __rehash_state = typename __rehash_type::_State;
using __unique_keys = typename __traits_type::__unique_keys;
@@ -1200,14 +1201,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
private:
// Helper rehash method used when keys are unique.
- void _M_rehash_aux(size_type __bkt_count, true_type __uks);
+ void _M_rehash(size_type __bkt_count, true_type __uks);
// Helper rehash method used when keys can be non-unique.
- void _M_rehash_aux(size_type __bkt_count, false_type __uks);
-
- // Unconditionally change size of bucket array to n, restore
- // hash policy state to __state on exception.
- void _M_rehash(size_type __bkt_count, const __rehash_state& __state);
+ void _M_rehash(size_type __bkt_count, false_type __uks);
};
// Definitions of class template _Hashtable's out-of-line member functions.
@@ -1337,7 +1334,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
__buckets_ptr __former_buckets = nullptr;
std::size_t __former_bucket_count = _M_bucket_count;
- const __rehash_state& __former_state = _M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
if (_M_bucket_count != __ht._M_bucket_count)
{
@@ -1359,6 +1356,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_assign(std::forward<_Ht>(__ht), __roan);
if (__former_buckets)
_M_deallocate_buckets(__former_buckets, __former_bucket_count);
+ __rehash_guard._M_guarded_obj = nullptr;
}
__catch(...)
{
@@ -1366,7 +1364,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
// Restore previous buckets.
_M_deallocate_buckets();
- _M_rehash_policy._M_reset(__former_state);
_M_buckets = __former_buckets;
_M_bucket_count = __former_bucket_count;
}
@@ -2142,17 +2139,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__node_ptr __node, size_type __n_elt)
-> iterator
{
- const __rehash_state& __saved_state = _M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
std::pair<bool, std::size_t> __do_rehash
= _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count,
__n_elt);
if (__do_rehash.first)
{
- _M_rehash(__do_rehash.second, __saved_state);
+ _M_rehash(__do_rehash.second, true_type{});
__bkt = _M_bucket_index(__code);
}
+ __rehash_guard._M_guarded_obj = nullptr;
this->_M_store_code(*__node, __code);
// Always insert at the beginning of the bucket.
@@ -2172,13 +2170,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__hash_code __code, __node_ptr __node)
-> iterator
{
- const __rehash_state& __saved_state = _M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
std::pair<bool, std::size_t> __do_rehash
= _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count, 1);
if (__do_rehash.first)
- _M_rehash(__do_rehash.second, __saved_state);
+ _M_rehash(__do_rehash.second, false_type{});
+ __rehash_guard._M_guarded_obj = nullptr;
this->_M_store_code(*__node, __code);
const key_type& __k = _ExtractKey{}(__node->_M_v());
size_type __bkt = _M_bucket_index(__code);
@@ -2509,39 +2508,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
rehash(size_type __bkt_count)
{
- const __rehash_state& __saved_state = _M_rehash_policy._M_state();
+ __rehash_guard_t __rehash_guard(_M_rehash_policy);
__bkt_count
= std::max(_M_rehash_policy._M_bkt_for_elements(_M_element_count + 1),
__bkt_count);
__bkt_count = _M_rehash_policy._M_next_bkt(__bkt_count);
if (__bkt_count != _M_bucket_count)
- _M_rehash(__bkt_count, __saved_state);
- else
- // No rehash, restore previous state to keep it consistent with
- // container state.
- _M_rehash_policy._M_reset(__saved_state);
- }
-
- template<typename _Key, typename _Value, typename _Alloc,
- typename _ExtractKey, typename _Equal,
- typename _Hash, typename _RangeHash, typename _Unused,
- typename _RehashPolicy, typename _Traits>
- void
- _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
- _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_rehash(size_type __bkt_count, const __rehash_state& __state)
- {
- __try
- {
- _M_rehash_aux(__bkt_count, __unique_keys{});
- }
- __catch(...)
{
- // A failure here means that buckets allocation failed. We only
- // have to restore hash policy previous state.
- _M_rehash_policy._M_reset(__state);
- __throw_exception_again;
+ _M_rehash(__bkt_count, __unique_keys{});
+ __rehash_guard._M_guarded_obj = nullptr;
}
}
@@ -2553,7 +2529,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_rehash_aux(size_type __bkt_count, true_type /* __uks */)
+ _M_rehash(size_type __bkt_count, true_type /* __uks */)
{
__buckets_ptr __new_buckets = _M_allocate_buckets(__bkt_count);
__node_ptr __p = _M_begin();
@@ -2596,7 +2572,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
void
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
- _M_rehash_aux(size_type __bkt_count, false_type /* __uks */)
+ _M_rehash(size_type __bkt_count, false_type /* __uks */)
{
__buckets_ptr __new_buckets = _M_allocate_buckets(__bkt_count);
__node_ptr __p = _M_begin();
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h
b/libstdc++-v3/include/bits/hashtable_policy.h
index 5d162463dc3..5b97b24e156 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -715,6 +715,25 @@ namespace __detail
std::size_t _M_next_resize;
};
+ template<typename _RehashPolicy>
+ struct _RehashStateGuard
+ {
+ _RehashPolicy* _M_guarded_obj;
+ typename _RehashPolicy::_State _M_prev_state;
+
+ _RehashStateGuard(_RehashPolicy& __policy)
+ : _M_guarded_obj(std::__addressof(__policy))
+ , _M_prev_state(__policy._M_state())
+ { }
+ _RehashStateGuard(const _RehashStateGuard&) = delete;
+
+ ~_RehashStateGuard()
+ {
+ if (_M_guarded_obj)
+ _M_guarded_obj->_M_reset(_M_prev_state);
+ }
+ };
+
// Base classes for std::_Hashtable. We define these base classes
// because in some cases we want to do different things depending on
// the value of a policy class. In some cases the policy class
@@ -1006,24 +1025,24 @@ namespace __detail
_M_insert_range(_InputIterator __first, _InputIterator __last,
const _NodeGetter& __node_gen, false_type __uks)
{
- using __rehash_type = typename __hashtable::__rehash_type;
- using __rehash_state = typename __hashtable::__rehash_state;
- using pair_type = std::pair<bool, std::size_t>;
+ using __rehash_guard_t = typename __hashtable::__rehash_guard_t;
+ using __pair_type = std::pair<bool, std::size_t>;
size_type __n_elt = __detail::__distance_fw(__first, __last);
if (__n_elt == 0)
return;
__hashtable& __h = _M_conjure_hashtable();
- __rehash_type& __rehash = __h._M_rehash_policy;
- const __rehash_state& __saved_state = __rehash._M_state();
- pair_type __do_rehash = __rehash._M_need_rehash(__h._M_bucket_count,
- __h._M_element_count,
- __n_elt);
+ __rehash_guard_t __rehash_guard(__h._M_rehash_policy);
+ __pair_type __do_rehash
+ = __h._M_rehash_policy._M_need_rehash(__h._M_bucket_count,
+ __h._M_element_count,
+ __n_elt);
if (__do_rehash.first)
- __h._M_rehash(__do_rehash.second, __saved_state);
+ __h._M_rehash(__do_rehash.second, __uks);
+ __rehash_guard._M_guarded_obj = nullptr;
for (; __first != __last; ++__first)
__h._M_insert(*__first, __node_gen, __uks);
}