Hi

Looks like a great finding to me, this is indeed a useless check, thanks!

Have you any figures on the performance enhancement ? It might help to get proper approval as gcc is currently in dev stage 4 that is to say only bug fixes normally.

François

On 17/01/2024 09:11, Huanghui Nie wrote:

Hi.

When I implemented a hash table with reference to the C++ STL, I found that when the hash table in the C++ STL deletes elements, if the first element deleted is the begin element, the before begin node is repeatedly assigned. This creates unnecessary performance overhead.


First, let’s see the code implementation:

In _M_remove_bucket_begin, _M_before_begin._M_nxt is assigned when &_M_before_begin == _M_buckets[__bkt]. That also means _M_buckets[__bkt]->_M_nxt is assigned under some conditions.

_M_remove_bucket_begin is called by _M_erase and _M_extract_node:

 1. Case _M_erase a range: _M_remove_bucket_begin is called in a for
    loop when __is_bucket_begin is true. And if __is_bucket_begin is
    true and &_M_before_begin == _M_buckets[__bkt], __prev_n must be
    &_M_before_begin. __prev_n->_M_nxt is always assigned in _M_erase.
    That means _M_before_begin._M_nxt is always assigned, if
    _M_remove_bucket_begin is called and &_M_before_begin ==
    _M_buckets[__bkt]. So there’s no need to assign
    _M_before_begin._M_nxt in _M_remove_bucket_begin.
 2. Other cases: _M_remove_bucket_begin is called when __prev_n ==
    _M_buckets[__bkt]. And __prev_n->_M_nxt is always assigned in
    _M_erase and _M_before_begin. That means _M_buckets[__bkt]->_M_nxt
    is always assigned. So there's no need to assign
    _M_buckets[__bkt]->_M_nxt in _M_remove_bucket_begin.

In summary, there’s no need to check &_M_before_begin == _M_buckets[__bkt] and assign _M_before_begin._M_nxt in _M_remove_bucket_begin.


Then let’s see the responsibility of each method:

The hash table in the C++ STL is composed of hash buckets and a node list. The update of the node list is responsible for _M_erase and _M_extract_node method. _M_remove_bucket_begin method only needs to update the hash buckets. The update of _M_before_begin belongs to the update of the node list. So _M_remove_bucket_begin doesn’t need to update _M_before_begin.


Existing tests listed below cover this change:

23_containers/unordered_set/allocator/copy.cc

23_containers/unordered_set/allocator/copy_assign.cc

23_containers/unordered_set/allocator/move.cc

23_containers/unordered_set/allocator/move_assign.cc

23_containers/unordered_set/allocator/swap.cc

23_containers/unordered_set/erase/1.cc

23_containers/unordered_set/erase/24061-set.cc

23_containers/unordered_set/modifiers/extract.cc

23_containers/unordered_set/operations/count.cc

23_containers/unordered_set/requirements/exception/basic.cc

23_containers/unordered_map/allocator/copy.cc

23_containers/unordered_map/allocator/copy_assign.cc

23_containers/unordered_map/allocator/move.cc

23_containers/unordered_map/allocator/move_assign.cc

23_containers/unordered_map/allocator/swap.cc

23_containers/unordered_map/erase/1.cc

23_containers/unordered_map/erase/24061-map.cc

23_containers/unordered_map/modifiers/extract.cc

23_containers/unordered_map/modifiers/move_assign.cc

23_containers/unordered_map/operations/count.cc

23_containers/unordered_map/requirements/exception/basic.cc


Regression tested on x86_64-pc-linux-gnu. Is it OK to commit?


---

ChangeLog:


libstdc++: hashtable: No need to update before begin node in _M_remove_bucket_begin


2024-01-16Huanghui Nie<nnn...@gmail.com>


gcc/

* libstdc++-v3/include/bits/hashtable.h


---


diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h

index b48610036fa..6056639e663 100644

--- a/libstdc++-v3/include/bits/hashtable.h

+++ b/libstdc++-v3/include/bits/hashtable.h

@@ -872,13 +872,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

      if (!__next_n || __next_bkt != __bkt)

        {

          // Bucket is now empty

-         // First update next bucket if any

+         // Update next bucket if any

          if (__next_n)

            _M_buckets[__next_bkt] = _M_buckets[__bkt];

-         // Second update before begin node if necessary

-         if (&_M_before_begin == _M_buckets[__bkt])

-           _M_before_begin._M_nxt = __next_n;

          _M_buckets[__bkt] = nullptr;

        }

    }

Reply via email to