https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=96088

--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
For the test cases in comment 3 I get these numbers today:

$ time ./test_erase-libstdc++

real    0m0.062s
user    0m0.046s
sys     0m0.015s

$ time ./test_erase-libc++

real    0m0.064s
user    0m0.041s
sys     0m0.023s

$ time ./test_clear-libstdc++

real    0m0.413s
user    0m0.400s
sys     0m0.012s

$ time ./test_clear-libc++

real    0m0.462s
user    0m0.441s
sys     0m0.019s


So clear() is about 6 times slower than erase(begin(), end()), not 30 times
slower. And libc++ is still a little slower than libstdc++.

We could add the optimization to do nothing if the container is already empty:

--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -2716,6 +2716,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
               _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
     clear() noexcept
     {
+      if (_M_element_count == 0)
+       return;
+
       this->_M_deallocate_nodes(_M_begin());
       std::fill_n(_M_buckets, _M_bucket_count, nullptr);
       _M_element_count = 0;

But that won't help for the almost-empty case.

We could look at making _M_deallocate_nodes check whether each node is the last
in its bucket, and if so, set that bucket to nullptr. That would assign to only
the N occupied buckets instead of memsetting the entire array, where N is at
most the element count.
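A minimal sketch of that idea, using a hypothetical simplified model (each
bucket points at the head of its chain and every node records its bucket
index; the names are invented for illustration, the real code lives in
_M_deallocate_nodes and is structured differently):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model of a chained hash table node: every node knows which
// bucket it belongs to, and nodes of the same bucket are adjacent in the
// single linked list of all elements (as in libstdc++'s layout).
struct Node { std::size_t bucket; Node* next; };

// While freeing nodes, reset a bucket pointer only when we free the last
// node of that bucket's chain, instead of fill_n over the whole array.
void deallocate_nodes_clearing_buckets(Node* first, std::vector<Node*>& buckets)
{
    while (first)
    {
        Node* next = first->next;
        // If the next node is in a different bucket (or there is no next
        // node), this was the last node of its chain: clear that bucket.
        if (!next || next->bucket != first->bucket)
            buckets[first->bucket] = nullptr;
        delete first;
        first = next;
    }
}
```

For an almost-empty table this touches only a handful of bucket slots, at the
cost of one extra comparison per node freed.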

But maybe we should make it dynamic: when size() is much larger than
bucket_count(), just zero the entire array instead of paying an extra
conditional branch for every node in _M_deallocate_nodes. If the container is
not almost empty, or the bucket array is not sparsely populated, zeroing the
whole thing is more efficient.
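The dynamic choice could be sketched as below. This is an illustration only:
the function names and the crossover point (load factor 1.0) are assumptions,
not measurements, and the real code would find the occupied buckets by walking
the node list rather than taking them as a parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dynamic clear() over a bare bucket array: when the table is
// dense (size >= bucket count), one bulk fill_n is cheaper than a branch per
// node; when it is sparse, reset only the buckets that actually hold nodes.
void clear_buckets(std::vector<void*>& buckets,
                   const std::vector<std::size_t>& used_buckets,
                   std::size_t element_count)
{
    if (element_count == 0)
        return;                       // the early return from the patch above
    if (element_count >= buckets.size())
        std::fill_n(buckets.begin(), buckets.size(), nullptr);  // dense case
    else
        for (std::size_t b : used_buckets)    // sparse case: touch only
            buckets[b] = nullptr;             // the occupied buckets
}
```

The dense branch matches what clear() does today; the sparse branch is the
per-node clearing idea, taken only when it is expected to write fewer slots
than the whole array.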
