Re: PR 71181 Avoid rehash after reserve

2016-06-17 Thread Jonathan Wakely

On 16/06/16 21:29 +0200, François Dumont wrote:

Here is a new version incorporating all your feedback.

   PR libstdc++/71181
   * include/tr1/hashtable_policy.h
   (_Prime_rehash_policy::_M_next_bkt): Make past-the-end iterator
   dereferenceable to avoid check on lower_bound result.
   (_Prime_rehash_policy::_M_bkt_for_elements): Call latter.
   (_Prime_rehash_policy::_M_need_rehash): Likewise.
   * src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy::_M_next_bkt):
   Always return a value greater than input value. Set _M_next_resize to
   max value when reaching highest prime number.
   * src/shared/hashtable-aux.cc (__prime_list): Add comment about
   sentinel being now useless.
   * testsuite/23_containers/unordered_set/hash_policy/71181.cc: New.
   * testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc
   (test02): New.
   * testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc:
   New.

   * testsuite/23_containers/unordered_set/hash_policy/rehash.cc:
   Fix indentation.


Great - OK for trunk, thanks.



On 15/06/2016 10:29, Jonathan Wakely wrote:

On 14/06/16 22:34 +0200, François Dumont wrote:

  const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);

+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;


Can we avoid this check by searching for __n + 1 instead of __n with
the lower_bound call?


Yes, that's another option, I will give it a try.


I did some comparisons and this version seems to execute fewer
instructions in some simple tests, according to cachegrind.

The only drawback is that calling _M_next_bkt(size_t(-1)) doesn't give 
the right result. But such a value is unlikely to occur in real use 
cases, so it is acceptable.


Yes, good point. I don't think we can ever have that many buckets,
because each bucket requires more than one byte, so we can't fit that
many buckets in memory anyway.
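To make the agreed approach concrete, here is a minimal self-contained
sketch (a toy six-entry table standing in for __prime_list; all names
invented for illustration) of the lower_bound-on-(n + 1) idea and of the
size_t(-1) caveat discussed above:

  #include <algorithm>
  #include <cstddef>

  // Toy stand-in for __prime_list; the last entry plays the role of the
  // greatest prime, which is excluded from the searched range.
  const unsigned long primes[] = { 2, 3, 5, 7, 11, 13 };

  unsigned long next_bkt(std::size_t n)
  {
    const unsigned long* last = primes + 6 - 1; // exclude greatest prime
    // Searching for n + 1 makes the result strictly greater than n.
    const unsigned long* p = std::lower_bound(primes, last, n + 1);
    // When nothing below the greatest prime is large enough, lower_bound
    // returns 'last', which is dereferenceable and names the greatest prime.
    return *p;
  }

For example next_bkt(11) yields 13 rather than 11, which is what makes
reserve(n) followed by n insertions rehash-free. The caveat: with n equal
to size_t(-1), n + 1 wraps around to 0 and next_bkt returns 2 instead of
13 -- the harmless edge case accepted above.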



Re: PR 71181 Avoid rehash after reserve

2016-06-16 Thread François Dumont

Here is a new version incorporating all your feedback.

PR libstdc++/71181
* include/tr1/hashtable_policy.h
(_Prime_rehash_policy::_M_next_bkt): Make past-the-end iterator
dereferenceable to avoid check on lower_bound result.
(_Prime_rehash_policy::_M_bkt_for_elements): Call latter.
(_Prime_rehash_policy::_M_need_rehash): Likewise.
* src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy::_M_next_bkt):
Always return a value greater than input value. Set _M_next_resize to
max value when reaching highest prime number.
* src/shared/hashtable-aux.cc (__prime_list): Add comment about
sentinel being now useless.
* testsuite/23_containers/unordered_set/hash_policy/71181.cc: New.
* testsuite/23_containers/unordered_set/hash_policy/power2_rehash.cc
(test02): New.
* testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc: New.

* testsuite/23_containers/unordered_set/hash_policy/rehash.cc:
Fix indentation.


On 15/06/2016 10:29, Jonathan Wakely wrote:

On 14/06/16 22:34 +0200, François Dumont wrote:

   const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);

+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;


Can we avoid this check by searching for __n + 1 instead of __n with
the lower_bound call?


Yes, that's another option, I will give it a try.


I did some comparisons and this version seems to execute fewer
instructions in some simple tests, according to cachegrind.

The only drawback is that calling _M_next_bkt(size_t(-1)) doesn't give 
the right result. But such a value is unlikely to occur in real use 
cases, so it is acceptable.


Tested under Linux x86_64.

François

diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..c5cf866 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -420,8 +420,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Prime_rehash_policy::
   _M_next_bkt(std::size_t __n) const
   {
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-		+ _S_n_primes, __n);
+// Don't include the last prime in the search, so that anything
+// higher than the second-to-last prime returns a past-the-end
+// iterator that can be dereferenced to get the last prime.
+const unsigned long* __p
+  = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n);
 _M_next_resize = 
   static_cast<std::size_t>(__builtin_ceil(*__p * _M_max_load_factor));
 return *__p;
@@ -434,11 +437,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_bkt_for_elements(std::size_t __n) const
   {
 const float __min_bkts = __n / _M_max_load_factor;
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-		+ _S_n_primes, __min_bkts);
-_M_next_resize =
-  static_cast<std::size_t>(__builtin_ceil(*__p * _M_max_load_factor));
-return *__p;
+return _M_next_bkt(__builtin_ceil(__min_bkts));
   }
 
   // Finds the smallest prime p such that alpha p > __n_elt + __n_ins.
@@ -462,12 +461,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	if (__min_bkts > __n_bkt)
 	  {
 	__min_bkts = std::max(__min_bkts, _M_growth_factor * __n_bkt);
-	const unsigned long* __p =
-	  std::lower_bound(__prime_list, __prime_list + _S_n_primes,
-			   __min_bkts);
-	_M_next_resize = static_cast<std::size_t>
-	  (__builtin_ceil(*__p * _M_max_load_factor));
-	return std::make_pair(true, *__p);
+	return std::make_pair(true,
+  _M_next_bkt(__builtin_ceil(__min_bkts)));
 	  }
 	else 
 	  {
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..ce4961f 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,22 +46,38 @@ namespace __detail
   {
 // Optimize lookups involving the first elements of __prime_list.
 // (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };
 
-if (__n <= 11)
+if (__n <= 12)
   {
 	_M_next_resize =
 	  __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
 	return __fast_bkt[__n];
   }
 
+// Number of primes (without sentinel).
 constexpr auto __n_primes
   = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+
+// Don't include the last prime in the search, so that anything
+// higher than the second-to-last prime returns a past-the-end
+// iterator that can be dereferenced to get the last prime.
+constexpr auto __last_prime = __prime_list + __n_primes - 1;
+
+// Look for 'n + 1' to make sure returned value will be greater than n.
 co

Re: PR 71181 Avoid rehash after reserve

2016-06-15 Thread Jonathan Wakely

On 14/06/16 22:34 +0200, François Dumont wrote:

On 14/06/2016 13:22, Jonathan Wakely wrote:

On 13/06/16 21:49 +0200, François Dumont wrote:

Hi

  I eventually would like to propose the attached patch.

  In tr1 I made sure we use a special past-the-end iterator that 
makes usage of lower_bound result without check safe.


I'm confused ... isn't that already done?


Indeed but my intention was to make sentinel values useless so that we 
can remove them one day.


I don't like the current code because when you just look at the 
lower_bound call you wonder why the returned value is not tested. You 
need to consider how __prime_list has been defined. When you add '- 1' 
in the call to lower_bound you don't need to look far to understand it.




_S_n_primes is defined as:

  enum { _S_n_primes = sizeof(unsigned long) != 8 ? 256 : 256 + 48 };

The table of primes is:

extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1

Which means that _S_n_primes is already one less, so that the "end"
returned by lower_bound is already dereferenceable. That's what the
comment in the table suggests too:

  // Sentinel, so we don't have to test the result of lower_bound,
  // or, on 64-bit machines, rest of the table.
#if __SIZEOF_LONG__ != 8
  4294967291ul

So ...

diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h

index 4ee6d45..24d1a59 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -420,8 +420,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _Prime_rehash_policy::
 _M_next_bkt(std::size_t __n) const
 {
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-	+ _S_n_primes, __n);
+// Past-the-end iterator is made dereferenceable to avoid check on
+// lower_bound result.
+const unsigned long* __p
+  = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n);


Is this redundant? Unless I'm misunderstanding something, _S_n_primes
already handles this.


Yes, it does for now, but not if __prime_list is the pure list of 
prime numbers.


OK. And as I said below, lower_bound(primes, primes + nprimes - 1, n)
still works because anything greater than the second-to-last prime
should be treated as the last one anyway.

Would this comment make it clearer?

 // Don't include the last prime in the search, so that anything
 // higher than the second-to-last prime returns a past-the-end
 // iterator that can be dereferenced to get the last prime.
 const unsigned long* __p
   = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n)




The other changes in tr1/hashtable_policy.h are nice simplifications.

diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc

index a5e6520..7cbd364 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,22 +46,36 @@ namespace __detail
 {
   // Optimize lookups involving the first elements of __prime_list.
   // (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };

-if (__n <= 11)
+if (__n <= 12)
 {
   _M_next_resize =
 __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
   return __fast_bkt[__n];
 }

+// Number of primes without sentinel.
   constexpr auto __n_primes
 = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+// past-the-end iterator is made dereferenceable.
+constexpr auto __prime_list_end = __prime_list + __n_primes - 1;


I don't think this comment clarifies things very well.

Because of the sentinel and because __n_primes doesn't include the
sentinel, (__prime_list + __n_primes) is already dereferenceable
anyway, so the comment doesn't explain why there's *another* -1 here.


The comment is written as if there were no sentinel.


OK. I think a similar comment as suggested above could help, by being
more verbose about what's happening.

 // Don't include the last prime in the search, so that anything
 // higher than the second-to-last prime returns a past-the-end
 // iterator that can be dereferenced to get the last prime.





   const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);

+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;


Can we avoid this check by searching for __n + 1 instead of __n with
the lower_bound call?


Yes, that's another option, I will give it a try.


I did some comparisons and this version seems to execute fewer
instructions in some simple tests, according to cachegrind.



Re: PR 71181 Avoid rehash after reserve

2016-06-14 Thread François Dumont

On 14/06/2016 13:22, Jonathan Wakely wrote:

On 13/06/16 21:49 +0200, François Dumont wrote:

Hi

   I eventually would like to propose the attached patch.

   In tr1 I made sure we use a special past-the-end iterator that 
makes it safe to use the lower_bound result without a check.


I'm confused ... isn't that already done?


Indeed but my intention was to make sentinel values useless so that we 
can remove them one day.


I don't like the current code because when you just look at the 
lower_bound call you wonder why the returned value is not tested. You 
need to consider how __prime_list has been defined. When you add '- 1' 
in the call to lower_bound you don't need to look far to understand it.




_S_n_primes is defined as:

   enum { _S_n_primes = sizeof(unsigned long) != 8 ? 256 : 256 + 48 };

The table of primes is:

 extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1

Which means that _S_n_primes is already one less, so that the "end"
returned by lower_bound is already dereferenceable. That's what the
comment in the table suggests too:

   // Sentinel, so we don't have to test the result of lower_bound,
   // or, on 64-bit machines, rest of the table.
#if __SIZEOF_LONG__ != 8
   4294967291ul

So ...

diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h

index 4ee6d45..24d1a59 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -420,8 +420,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _Prime_rehash_policy::
  _M_next_bkt(std::size_t __n) const
  {
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-	+ _S_n_primes, __n);
+// Past-the-end iterator is made dereferenceable to avoid check on
+// lower_bound result.
+const unsigned long* __p
+  = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n);


Is this redundant? Unless I'm misunderstanding something, _S_n_primes
already handles this.


Yes, it does for now, but not if __prime_list is the pure list of prime 
numbers.


The other changes in tr1/hashtable_policy.h are nice simplifications.

diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc

index a5e6520..7cbd364 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,22 +46,36 @@ namespace __detail
  {
// Optimize lookups involving the first elements of __prime_list.
// (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };

-if (__n <= 11)
+if (__n <= 12)
  {
_M_next_resize =
  __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
return __fast_bkt[__n];
  }

+// Number of primes without sentinel.
constexpr auto __n_primes
  = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+// past-the-end iterator is made dereferenceable.
+constexpr auto __prime_list_end = __prime_list + __n_primes - 1;


I don't think this comment clarifies things very well.

Because of the sentinel and because __n_primes doesn't include the
sentinel, (__prime_list + __n_primes) is already dereferenceable
anyway, so the comment doesn't explain why there's *another* -1 here.


The comment is written as if there were no sentinel.





const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);

+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;


Can we avoid this check by searching for __n + 1 instead of __n with
the lower_bound call?


Yes, that's another option, I will give it a try.


If I understand the logic correctly we can do it like this:

   // Number of primes without sentinel:
   constexpr auto __n_primes
 = sizeof(__prime_list) / sizeof(unsigned long) - 1;
   // The greatest prime in the table:
   constexpr auto __prime_list_end = __prime_list + __n_primes - 1;
   const auto __next_bkt =
 std::lower_bound(__prime_list + 6, __prime_list_end, __n + 1);
   if (__next_bkt == __prime_list_end)
 _M_next_resize = size_t(-1); // Reached maximum bucket count.
   else
 _M_next_resize =
__builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
   return *__next_bkt;

i.e.

- Ignore the sentinel (keeping it only for backward compatibility).

- Search for __n + 1 so we find the *next* bucket count.

- Don't include the largest prime in the search, because even if __n >
 largest prime, we're going to use that largest value anyway, so:

- if __n >= second largest prime then lower_bound will return the end
 iterator, which points to the largest prime.

Does this behave correctly?

Yes, it should, will run te

Re: PR 71181 Avoid rehash after reserve

2016-06-14 Thread Jonathan Wakely

diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/71181.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/71181.cc
new file mode 100644
index 000..e0c0259
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/hash_policy/71181.cc
@@ -0,0 +1,63 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+//
+// { dg-options "-std=gnu++11" }
+
+#include <unordered_set>
+
+#include <testsuite_hooks.h>
+
+template
+  void test(std::size_t max)
+  {


Also it looks like this test could just use int here, not size_t,
because the loop variable is an int (and so it avoids a -Wsign-compare
warning when compiling the test outside the testsuite).
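For reference, a minimal sketch of the warning in question (the function
and its body are invented for illustration):

  #include <cstddef>

  void test(std::size_t max)
  {
    for (int i = 0; i != max; ++i) // -Wsign-compare: 'int' vs 'std::size_t'
      { /* insert i elements, etc. */ }
  }

Taking an int parameter instead matches the loop variable's type, so the
comparison is homogeneous and -Wall stays quiet.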



Re: PR 71181 Avoid rehash after reserve

2016-06-14 Thread Jonathan Wakely

On 13/06/16 21:49 +0200, François Dumont wrote:

Hi

   I eventually would like to propose the attached patch.

   In tr1 I made sure we use a special past-the-end iterator that 
makes it safe to use the lower_bound result without a check.


I'm confused ... isn't that already done?

_S_n_primes is defined as:

   enum { _S_n_primes = sizeof(unsigned long) != 8 ? 256 : 256 + 48 };

The table of primes is:

 extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1

Which means that _S_n_primes is already one less, so that the "end"
returned by lower_bound is already dereferenceable. That's what the
comment in the table suggests too:

   // Sentinel, so we don't have to test the result of lower_bound,
   // or, on 64-bit machines, rest of the table.
#if __SIZEOF_LONG__ != 8
   4294967291ul

So ...


diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..24d1a59 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -420,8 +420,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _Prime_rehash_policy::
  _M_next_bkt(std::size_t __n) const
  {
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-   + _S_n_primes, __n);
+// Past-the-end iterator is made dereferenceable to avoid check on
+// lower_bound result.
+const unsigned long* __p
+  = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n);


Is this redundant? Unless I'm misunderstanding something, _S_n_primes
already handles this.
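To make that concrete, a minimal sketch of the sentinel scheme (toy
values, not the real table): the searched range stops one short of the
array, so even the past-the-end result of lower_bound points at a real
element.

  #include <algorithm>

  const unsigned long prime_list[] = { 2, 3, 5, 7, 11,
                                       11 };           // sentinel
  const int n_primes = 5;                              // sentinel excluded

  unsigned long next_prime(unsigned long n)
  {
    // May return prime_list + n_primes, i.e. the sentinel: safe to
    // dereference, and equal to the largest real prime.
    return *std::lower_bound(prime_list, prime_list + n_primes, n);
  }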

The other changes in tr1/hashtable_policy.h are nice simplifications.


diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..7cbd364 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,22 +46,36 @@ namespace __detail
  {
// Optimize lookups involving the first elements of __prime_list.
// (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };

-if (__n <= 11)
+if (__n <= 12)
  {
_M_next_resize =
  __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
return __fast_bkt[__n];
  }

+// Number of primes without sentinel.
constexpr auto __n_primes
  = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+// past-the-end iterator is made dereferenceable.
+constexpr auto __prime_list_end = __prime_list + __n_primes - 1;


I don't think this comment clarifies things very well.

Because of the sentinel and because __n_primes doesn't include the
sentinel, (__prime_list + __n_primes) is already dereferenceable
anyway, so the comment doesn't explain why there's *another* -1 here.



const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);
+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;


Can we avoid this check by searching for __n + 1 instead of __n with
the lower_bound call?

If I understand the logic correctly we can do it like this:

   // Number of primes without sentinel:
   constexpr auto __n_primes
 = sizeof(__prime_list) / sizeof(unsigned long) - 1;
   // The greatest prime in the table:
   constexpr auto __prime_list_end = __prime_list + __n_primes - 1;
   const auto __next_bkt =
 std::lower_bound(__prime_list + 6, __prime_list_end, __n + 1);
   if (__next_bkt == __prime_list_end)
 _M_next_resize = size_t(-1); // Reached maximum bucket count.
   else
 _M_next_resize =
__builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
   return *__next_bkt;

i.e.

- Ignore the sentinel (keeping it only for backward compatibility).

- Search for __n + 1 so we find the *next* bucket count.

- Don't include the largest prime in the search, because even if __n >
 largest prime, we're going to use that largest value anyway, so:

- if __n >= second largest prime then lower_bound will return the end
 iterator, which points to the largest prime.

Does this behave correctly?

I'd really like to see it tested for the boundary conditions, i.e.
verify that _Prime_rehash_policy::_M_next_bkt behaves as expected when
passed prime numbers, and when passed N-1, N and N+1 where N is the
largest prime in the table.
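Something along these lines could cover those boundaries (a sketch only,
assuming a 32-bit size_t where the largest prime in the table is
N = 4294967291; the exact largest prime is platform-dependent, and the
expected values follow the behaviour described in this thread):

  #include <unordered_set> // declares std::__detail::_Prime_rehash_policy
  #include <cassert>

  void test_boundaries()
  {
    std::__detail::_Prime_rehash_policy policy;
    const std::size_t N = 4294967291ul;       // largest prime in the table
    assert( policy._M_next_bkt(N - 1) == N ); // next bucket count is N
    assert( policy._M_next_bkt(N) == N );     // N is the ceiling: reuse it
    assert( policy._M_next_bkt(N + 1) == N ); // beyond the table: still N
  }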


+if (__next_bkt == __prime_list_end)
+  // Set next resize to the max value so that we never try to rehash again
+  // as we already reach the biggest possible bucket number.
+  // Note that it might result in max_load_factor not being respected.
+  _M_next_resize = std::size_t(-1);
+else
  _M_next_resize =
__builtin_ceil(*__next_bkt * 

Re: PR 71181 Avoid rehash after reserve

2016-06-13 Thread François Dumont

Hi

I eventually would like to propose the attached patch.

In tr1 I made sure we use a special past-the-end iterator that 
makes it safe to use the lower_bound result without a check.


PR libstdc++/71181
* include/tr1/hashtable_policy.h
(_Prime_rehash_policy::_M_next_bkt): Make past-the-end iterator
dereferenceable to avoid check on lower_bound result.
(_Prime_rehash_policy::_M_bkt_for_elements): Call latter.
(_Prime_rehash_policy::_M_need_rehash): Likewise.
* src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy::_M_next_bkt):
Always return a value greater than input value. Set _M_next_resize to
max value when reaching highest prime number.
* src/shared/hashtable-aux.cc (__prime_list): Add comment that sentinel
is useless.
* testsuite/23_containers/unordered_set/hash_policy/71181.cc: New.
* testsuite/23_containers/unordered_set/hash_policy/prime_rehash.cc: New.

* testsuite/23_containers/unordered_set/hash_policy/rehash.cc:
Fix indentation.

Tested under Linux x86_64.

François


On 25/05/2016 22:48, François Dumont wrote:

On 25/05/2016 16:01, Jonathan Wakely wrote:

On 22/05/16 17:16 +0200, François Dumont wrote:

Hi

   To fix the 71181 problem I propose to change how we deal with reserve 
called with pivot values, that is to say prime numbers. Now 
_M_next_bkt always returns a value higher than the input value. This 
way when reserve(97) is called we end up with 199 buckets, and so 
enough space to store 97 values without rehashing.


   I have integrated several other enhancements on the same subject 
into this patch: improved _M_next_resize management when reaching the 
highest bucket number, and removal of the sentinel value in 
__prime_list; we just need to limit the range when calling lower_bound.


I don't think the change to __prime_list is safe. If you compile some
code with GCC 5 and then used a libstdc++.so with this change the old
code would still be looking for the sentinel in the array, and would
not find it.

I think it would be safe to leave the old __prime_list unchanged (and
then not need to change anything in tr1/hashtable_policy.h?) and add a
new array with a different name. Existing code compiled with older
versions of GCC would still find __prime_list, but the new code would
use a different array.




What about this version? tr1 mode still limits the search range as it 
should, to make sure it doesn't need to check the lower_bound result. 
And the sentinel is only kept for backward compatibility, with a 
comment to make that clear. Maybe there is a clearer way to express 
that the sentinel can be removed in a future ABI-breaking version?


François


diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..24d1a59 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -420,8 +420,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Prime_rehash_policy::
   _M_next_bkt(std::size_t __n) const
   {
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-		+ _S_n_primes, __n);
+// Past-the-end iterator is made dereferenceable to avoid check on
+// lower_bound result.
+const unsigned long* __p
+  = std::lower_bound(__prime_list, __prime_list + _S_n_primes - 1, __n);
 _M_next_resize = 
   static_cast<std::size_t>(__builtin_ceil(*__p * _M_max_load_factor));
 return *__p;
@@ -434,11 +436,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_bkt_for_elements(std::size_t __n) const
   {
 const float __min_bkts = __n / _M_max_load_factor;
-const unsigned long* __p = std::lower_bound(__prime_list, __prime_list
-		+ _S_n_primes, __min_bkts);
-_M_next_resize =
-  static_cast<std::size_t>(__builtin_ceil(*__p * _M_max_load_factor));
-return *__p;
+return _M_next_bkt(__builtin_ceil(__min_bkts));
   }
 
   // Finds the smallest prime p such that alpha p > __n_elt + __n_ins.
@@ -462,12 +460,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	if (__min_bkts > __n_bkt)
 	  {
 	__min_bkts = std::max(__min_bkts, _M_growth_factor * __n_bkt);
-	const unsigned long* __p =
-	  std::lower_bound(__prime_list, __prime_list + _S_n_primes,
-			   __min_bkts);
-	_M_next_resize = static_cast<std::size_t>
-	  (__builtin_ceil(*__p * _M_max_load_factor));
-	return std::make_pair(true, *__p);
+	return std::make_pair(true,
+  _M_next_bkt(__builtin_ceil(__min_bkts)));
 	  }
 	else 
 	  {
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..7cbd364 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,22 +46,36 @@ namespace __detail
   {
 // Optimize lookups involving the first elements of __prime_list.
 // (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7

Re: PR 71181 Avoid rehash after reserve

2016-05-25 Thread François Dumont

On 25/05/2016 16:01, Jonathan Wakely wrote:

On 22/05/16 17:16 +0200, François Dumont wrote:

Hi

   To fix the 71181 problem I propose to change how we deal with reserve 
called with pivot values, that is to say prime numbers. Now 
_M_next_bkt always returns a value higher than the input value. This 
way when reserve(97) is called we end up with 199 buckets, and so 
enough space to store 97 values without rehashing.


   I have integrated several other enhancements on the same subject 
into this patch: improved _M_next_resize management when reaching the 
highest bucket number, and removal of the sentinel value in 
__prime_list; we just need to limit the range when calling lower_bound.


I don't think the change to __prime_list is safe. If you compile some
code with GCC 5 and then used a libstdc++.so with this change the old
code would still be looking for the sentinel in the array, and would
not find it.

I think it would be safe to leave the old __prime_list unchanged (and
then not need to change anything in tr1/hashtable_policy.h?) and add a
new array with a different name. Existing code compiled with older
versions of GCC would still find __prime_list, but the new code would
use a different array.




What about this version? tr1 mode still limits the search range as it 
should, to make sure it doesn't need to check the lower_bound result. 
And the sentinel is only kept for backward compatibility, with a 
comment to make that clear. Maybe there is a clearer way to express 
that the sentinel can be removed in a future ABI-breaking version?


François
diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..2f70289 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -403,7 +403,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _M_need_rehash(std::size_t __n_bkt, std::size_t __n_elt,
 		   std::size_t __n_ins) const;
 
-enum { _S_n_primes = sizeof(unsigned long) != 8 ? 256 : 256 + 48 };
+// Number of primes minus 1 to avoid check on lower_bound result.
+enum { _S_n_primes = (sizeof(unsigned long) != 8 ? 256 : 256 + 48) - 1 };
 
 float		_M_max_load_factor;
 float		_M_growth_factor;
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..dc0dab5 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,10 +46,10 @@ namespace __detail
   {
 // Optimize lookups involving the first elements of __prime_list.
 // (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };
 
-if (__n <= 11)
+if (__n <= 12)
   {
 	_M_next_resize =
 	  __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
@@ -58,10 +58,22 @@ namespace __detail
 
 constexpr auto __n_primes
   = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+constexpr auto __prime_list_end = __prime_list + __n_primes;
 const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);
-_M_next_resize =
-  __builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;
+
+if (__next_bkt == __prime_list_end)
+  // Set next resize to the max value so that we never try to rehash again
+  // as we already reach the biggest possible bucket number.
+  // Note that it might result in max_load_factor not being respected.
+  _M_next_resize = std::size_t(-1);
+else
+  _M_next_resize =
+	__builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+
 return *__next_bkt;
   }
 
diff --git a/libstdc++-v3/src/shared/hashtable-aux.cc b/libstdc++-v3/src/shared/hashtable-aux.cc
index 081bb12..9cbca98 100644
--- a/libstdc++-v3/src/shared/hashtable-aux.cc
+++ b/libstdc++-v3/src/shared/hashtable-aux.cc
@@ -25,6 +25,7 @@
 namespace __detail
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
+  // The sentinel value is only kept for backward compatibility.
   extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1
   {
 2ul, 3ul, 5ul, 7ul, 11ul, 13ul, 17ul, 19ul, 23ul, 29ul, 31ul,


Re: PR 71181 Avoid rehash after reserve

2016-05-25 Thread Jonathan Wakely

On 22/05/16 17:16 +0200, François Dumont wrote:

Hi

   To fix the 71181 problem I propose to change how we deal with reserve 
called with pivot values, that is to say prime numbers. Now _M_next_bkt 
always returns a value higher than the input value. This way when 
reserve(97) is called we end up with 199 buckets, and so enough space 
to store 97 values without rehashing.


   I have integrated several other enhancements on the same subject 
into this patch: improved _M_next_resize management when reaching the 
highest bucket number, and removal of the sentinel value in 
__prime_list; we just need to limit the range when calling lower_bound.


I don't think the change to __prime_list is safe. If you compile some
code with GCC 5 and then used a libstdc++.so with this change the old
code would still be looking for the sentinel in the array, and would
not find it.

I think it would be safe to leave the old __prime_list unchanged (and
then not need to change anything in tr1/hashtable_policy.h?) and add a
new array with a different name. Existing code compiled with older
versions of GCC would still find __prime_list, but the new code would
use a different array.
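A sketch of that suggestion (the second array name is hypothetical; the
thread ultimately keeps the single __prime_list and merely documents the
sentinel):

  namespace __detail
  {
    // Old table, sentinel intact: code compiled against older headers
    // still finds the symbol and layout it expects at runtime.
    extern const unsigned long __prime_list[];

    // Hypothetical new table without the sentinel, referenced only by
    // newly compiled code.
    extern const unsigned long __prime_list2[];
  }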



PR 71181 Avoid rehash after reserve

2016-05-22 Thread François Dumont

Hi

To fix the 71181 problem I propose to change how we deal with reserve 
called with pivot values, that is to say prime numbers. Now _M_next_bkt 
always returns a value higher than the input value. This way when 
reserve(97) is called we end up with 199 buckets, and so enough space to 
store 97 values without rehashing.
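As a user-visible illustration of that intent (a sketch, not the
committed testcase; the concrete count of 199 is the value claimed above,
bucket counts being otherwise an implementation detail):

  #include <unordered_set>
  #include <cassert>

  int main()
  {
    std::unordered_set<int> us;
    us.reserve(97);
    const std::size_t bkts = us.bucket_count(); // 199 with this patch
    for (int i = 0; i != 97; ++i)
      us.insert(i);
    assert( us.bucket_count() == bkts );        // no rehash happened
  }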


I have integrated several other enhancements on the same subject into 
this patch: improved _M_next_resize management when reaching the highest 
bucket number, and removal of the sentinel value in __prime_list; we 
just need to limit the range when calling lower_bound.


PR libstdc++/71181
* include/tr1/hashtable_policy.h (_S_n_primes): Minus 1 to avoid check
on lower_bound call.
* src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy::_M_next_bkt):
Always return a value greater than input value. Set _M_next_resize to
max value when reaching highest prime number.
* src/shared/hashtable-aux.cc (__prime_list): Remove sentinel value.
* testsuite/23_containers/unordered_set/hash_policy/71181.cc: New.
* testsuite/23_containers/unordered_set/max_load_factor/robustness.cc:
Adapt.

Tested under Linux x86_64, OK to commit?

François

diff --git a/libstdc++-v3/include/tr1/hashtable_policy.h b/libstdc++-v3/include/tr1/hashtable_policy.h
index 4ee6d45..67a1339 100644
--- a/libstdc++-v3/include/tr1/hashtable_policy.h
+++ b/libstdc++-v3/include/tr1/hashtable_policy.h
@@ -403,7 +403,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _M_need_rehash(std::size_t __n_bkt, std::size_t __n_elt,
 		   std::size_t __n_ins) const;
 
-enum { _S_n_primes = sizeof(unsigned long) != 8 ? 256 : 256 + 48 };
+enum { _S_n_primes = (sizeof(unsigned long) != 8 ? 256 : 256 + 48) - 1 };
 
 float		_M_max_load_factor;
 float		_M_growth_factor;
diff --git a/libstdc++-v3/src/c++11/hashtable_c++0x.cc b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
index a5e6520..dc0dab5 100644
--- a/libstdc++-v3/src/c++11/hashtable_c++0x.cc
+++ b/libstdc++-v3/src/c++11/hashtable_c++0x.cc
@@ -46,10 +46,10 @@ namespace __detail
   {
 // Optimize lookups involving the first elements of __prime_list.
 // (useful to speed-up, eg, constructors)
-static const unsigned char __fast_bkt[12]
-  = { 2, 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11 };
+static const unsigned char __fast_bkt[13]
+  = { 2, 2, 3, 5, 5, 7, 7, 11, 11, 11, 11, 13, 13 };
 
-if (__n <= 11)
+if (__n <= 12)
   {
 	_M_next_resize =
 	  __builtin_ceil(__fast_bkt[__n] * (long double)_M_max_load_factor);
@@ -58,10 +58,22 @@ namespace __detail
 
 constexpr auto __n_primes
   = sizeof(__prime_list) / sizeof(unsigned long) - 1;
+constexpr auto __prime_list_end = __prime_list + __n_primes;
 const unsigned long* __next_bkt =
-  std::lower_bound(__prime_list + 5, __prime_list + __n_primes, __n);
-_M_next_resize =
-  __builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+  std::lower_bound(__prime_list + 6, __prime_list_end, __n);
+
+if (*__next_bkt == __n && __next_bkt != __prime_list_end)
+  ++__next_bkt;
+
+if (__next_bkt == __prime_list_end)
+  // Set next resize to the max value so that we never try to rehash again
+  // as we already reach the biggest possible bucket number.
+  // Note that it might result in max_load_factor not being respected.
+  _M_next_resize = std::size_t(-1);
+else
+  _M_next_resize =
+	__builtin_ceil(*__next_bkt * (long double)_M_max_load_factor);
+
 return *__next_bkt;
   }
 
diff --git a/libstdc++-v3/src/shared/hashtable-aux.cc b/libstdc++-v3/src/shared/hashtable-aux.cc
index 081bb12..7623b46 100644
--- a/libstdc++-v3/src/shared/hashtable-aux.cc
+++ b/libstdc++-v3/src/shared/hashtable-aux.cc
@@ -25,7 +25,7 @@
 namespace __detail
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
-  extern const unsigned long __prime_list[] = // 256 + 1 or 256 + 48 + 1
+  extern const unsigned long __prime_list[] = // 256 or 256 + 48
   {
 2ul, 3ul, 5ul, 7ul, 11ul, 13ul, 17ul, 19ul, 23ul, 29ul, 31ul,
 37ul, 41ul, 43ul, 47ul, 53ul, 59ul, 61ul, 67ul, 71ul, 73ul, 79ul,
@@ -66,13 +66,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 1259520799ul, 1362662261ul, 1474249943ul, 1594975441ul, 1725587117ul,
 1866894511ul, 2019773507ul, 2185171673ul, 2364114217ul, 2557710269ul,
 2767159799ul, 2993761039ul, 3238918481ul, 3504151727ul, 3791104843ul,
-4101556399ul, 4294967291ul,
-// Sentinel, so we don't have to test the result of lower_bound,
-// or, on 64-bit machines, rest of the table.
-#if __SIZEOF_LONG__ != 8
-4294967291ul
-#else
-6442450933ul, 8589934583ul, 12884901857ul, 17179869143ul,
+4101556399ul, 4294967291ul
+#if __SIZEOF_LONG__ >= 8
+,6442450933ul, 8589934583ul, 12884901857ul, 17179869143ul,
 25769803693ul, 34359738337ul, 51539607367ul, 68719476731ul,
 103079215087ul, 137438953447ul, 206158430123ul, 274877906899ul,
 412316860387ul, 549755813881ul, 824633720731ul, 1099511627689ul,
@@ -86,7 +82,7