Hi
I eventually would like to propose the attached patch.
In tr1 I made sure we use a special past-the-end iterator that
makes usage of lower_bound result without check safe.
PR libstdc++/71181
* include/tr1/hashtable_policy.h
(_Prime_rehash_policy::_M_next_bkt): Make past-the-end iterator
dereferenceable to avoid check on lower_bound result.
(_Prime_rehash_policy::_M_bkt_for_elements): Call latter.
(_Prime_rehash_policy::_M_need_rehash): Likewise.
* src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy::_M_next_bkt):
Always return a value greater than input value. Set _M_next_resize to
max value when reaching highest prime number.
* src/shared/hashtable-aux.cc (__prime_list): Add comment that sentinel
is useless.
* testsuite/23_containers/unordered_set/hash_policy/71181.cc: New.
*
testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc: New.
* testsuite/23_containers/unordered_set/hash_policy/rehash.cc:
Fix indentation.
Tested under Linux x86_64.
François
On 25/05/2016 22:48, François Dumont wrote:
On 25/05/2016 16:01, Jonathan Wakely wrote:
On 22/05/16 17:16 +0200, François Dumont wrote:
Hi
To fix 71181 problem I propose to change how we deal with reserve
called with pivot values that is to say prime numbers. Now
_M_next_bkt always return a value higher than the input value. This
way when reverse(97) is called we end up with 199 buckets and so
enough space to store 97 values without rehashing.
I have integrated in this patch several other enhancements on the
same subject. Improvement of _M_next_resize management when reaching
highest bucket number. Remove sentinel value in __prime_list, just
need to limit range when calling lower_bound.
I don't think the change to __prime_list is safe. If you compile some
code with GCC 5 and then used a libstdc++.so with this change the old
code would still be looking for the sentinel in the array, and would
not find it.
I think it would be safe to leave the old __prime_list unchanged (and
then not need to change anything in tr1/hashtable_policy.h?) and add a
new array with a different name. Existing code compiled with older
versions of GCC would still find __prime_list, but the new code would
use a different array.
What about this version ? tr1 mode still limit search range as it
should to make sure it doesn't need to check lower_bound result. And
sentinel is only kept for backward compatibility and commented to make
that clear. Maybe there is a clearer way to express that sentinel can
be removed on a future version breaking abi ?
François
diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..24d1a59 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -420,8 +420,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_Prime_rehash_policy::
_M_next_bkt(std::size_t __n) const
{
- const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
- + _S_n_primes, __n);
+ // Past-the-end iterator is made dereferenceable to avoid check on
+ // lower_bound result.
+ const unsigned long* __p
+ = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n);
_M_next_resize =
static_cast<std::size_t>(__builtin_ceil(*__p * _M_max_load_factor));
return *__p;
@@ -434,11 +436,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_bkt_for_elements(std::size_t __n) const
{
const float __min_bkts = __n / _M_max_load_factor;
- const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
- + _S_n_primes, __min_bkts);
- _M_next_resize =
- static_cast<std::size_t>(__builtin_ceil(*__p * _M_max_load_factor));
- return *__p;
+ return _M_next_bkt(__builtin_ceil(__min_bkts));
}
// Finds the smallest prime p such that alpha p > __n_elt + __n_ins.
@@ -462,12 +460,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (__min_bkts > __n_bkt)
{
__min_bkts = std::max(__min_bkts, _M_growth_factor * __n_bkt);
- const unsigned long* __p =
- std::lower_bound(__prime_list, __prime_list + _S_n_primes,
- __min_bkts);
- _M_next_resize = static_cast<std::size_t>
- (__builtin_ceil(*__p * _M_max_load_factor));
- return std::make_pair(true, *__p);
+ return std::make_pair(true,
+ _M_next_bkt(__builtin_ceil(__min_bkts)));
}
else
{
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..7cbd364 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,22 +46,36 @@ namespace __detail
{
// Optimize lookups involving the first elements of __prime_list.
// (useful to speed-up, eg, constructors)
- static const unsigned char __fast_bkt[12]
- = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+ static const unsigned char __fast_bkt[13]
+ = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };
- if (__n <= 11)
+ if (__n <= 12)
{
_M_next_resize =
__builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
return __fast_bkt[__n];
}
+ // Number of primes without sentinel.
constexpr auto __n_primes
= sizeof(__prime_list) / sizeof(unsigned long) - 1;
+ // past-the-end iterator is made dereferenceable.
+ constexpr auto __prime_list_end = __prime_list + __n_primes - 1;
const unsigned long* __next_bkt =
- std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);
+ std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+ if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+ ++__next_bkt;
+
+ if (__next_bkt == __prime_list_end)
+ // Set next resize to the max value so that we never try to rehash again
+ // as we already reach the biggest possible bucket number.
+ // Note that it might result in max_load_factor not being respected.
+ _M_next_resize = std::size_t(-1);
+ else
_M_next_resize =
__builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+
return *__next_bkt;
}
diff --git a/libstdc++-v3/src/shared/hashtable-aux.cc b/libstdc++-v3/src/shared/hashtable-aux.cc
index 081bb12..ec9841e 100644
--- a/libstdc++-v3/src/shared/hashtable-aux.cc
+++ b/libstdc++-v3/src/shared/hashtable-aux.cc
@@ -25,6 +25,7 @@
namespace __detail
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
+ // The sentinel value is kept only for abi backward compatibility.
extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1
{
2ul, 3ul, 5ul, 7ul, 11ul, 13ul, 17ul, 19ul, 23ul, 29ul, 31ul,
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/71181.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/71181.cc
new file mode 100644
index 0000000..e0c0259
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/71181.cc
@@ -0,0 +1,63 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+//
+// { dg-options "-std=gnu++11" }
+
+#include <unordered_set>
+
+#include <testsuite_hooks.h>
+
+template<typename _USet>
+ void test(std::size_t max)
+ {
+ bool test __attribute__((unused)) = true;
+ _USet us;
+ auto nb_reserved = us.bucket_count();
+ us.reserve(nb_reserved);
+ auto bkts = us.bucket_count();
+ for (int i = 0; i != max; ++i)
+ {
+ if (i == nb_reserved)
+ {
+ nb_reserved = bkts;
+ us.reserve(nb_reserved);
+ bkts = us.bucket_count();
+ }
+
+ us.insert(i);
+
+ VERIFY( us.bucket_count() == bkts );
+ }
+ }
+
+template<typename _Value>
+ using unordered_set_power2_rehash =
+ std::_Hashtable<_Value, _Value, std::allocator<_Value>,
+ std::__detail::_Identity,
+ std::equal_to<_Value>,
+ std::hash<_Value>,
+ std::__detail::_Mask_range_hashing,
+ std::__detail::_Default_ranged_hash,
+ std::__detail::_Power2_rehash_policy,
+ std::__detail::_Hashtable_traits<false, true, true>>;
+
+int main()
+{
+ test<std::unordered_set<int>>(150);
+ test<unordered_set_power2_rehash<int>>(150);
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc
index 1d01d0a..5805f7a 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc
@@ -17,6 +17,7 @@
//
// { dg-options "-std=gnu++11" }
+#include <limits>
#include <unordered_set>
#include <testsuite_hooks.h>
@@ -35,8 +36,32 @@ void test01()
== (std::size_t(1) << (sizeof(std::size_t) * 8 - 1)) );
}
+void test02()
+{
+ bool test __attribute__((unused)) = true;
+
+ std::__detail::_Power2_rehash_policy policy;
+
+ for (std::size_t i = 1;;)
+ {
+ auto nxt = policy._M_next_bkt(i);
+
+ if (nxt == i)
+ {
+ // Equals only when reaching max.
+ constexpr auto mx = std::numeric_limits<std::size_t>::max();
+ VERIFY( nxt == policy._M_next_bkt(mx) );
+ break;
+ }
+
+ VERIFY( nxt > i );
+ i = nxt;
+ }
+}
+
int main()
{
test01();
+ test02();
return 0;
}
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc
new file mode 100644
index 0000000..046dd2b
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc
@@ -0,0 +1,52 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+//
+// { dg-options "-std=gnu++11" }
+
+#include <limits>
+#include <unordered_set>
+
+#include <testsuite_hooks.h>
+
+void test01()
+{
+ bool test __attribute__((unused)) = true;
+
+ std::__detail::_Prime_rehash_policy policy;
+
+ for (std::size_t i = 1;;)
+ {
+ auto nxt = policy._M_next_bkt(i);
+
+ if (nxt == i)
+ {
+ // Equals only when reaching max.
+ constexpr auto mx = std::numeric_limits<std::size_t>::max();
+ VERIFY( nxt == policy._M_next_bkt(mx) );
+ break;
+ }
+
+ VERIFY( nxt > i );
+ i = nxt;
+ }
+}
+
+int main()
+{
+ test01();
+ return 0;
+}
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/rehash.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/rehash.cc
index 2dac583..9658131 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/rehash.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/rehash.cc
@@ -34,8 +34,8 @@ void test()
us.insert(i);
if (bkt_count != us.bucket_count())
{
- // Container has been rehashed, lets check that it won't be rehash again
- // if we remove and restore the last 2 inserted elements:
+ // Container has been rehashed, lets check that it won't be rehash
+ // again if we remove and restore the last 2 inserted elements:
rehashed = true;
bkt_count = us.bucket_count();
VERIFY( us.erase(i) == 1 );