[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-26 Thread plasmahh at gmx dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #12 from Dennis Lubert plasmahh at gmx dot net 2012-07-26 
12:30:00 UTC ---
I can confirm that now the reserve works as I would expect it (causing no
further rehashes). However the amount of rehashes done in the testcase is still
155 (needing 4.5s), while gcc 4.6 only needs 21 (needing 1.5s).

Monitoring the bucket counts on resizes it seems that gcc 4.8 is now much more
fine grained than gcc 4.6. I am unsure if this is expected, deliberate and/or a
good idea.


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-26 Thread fdumont at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #13 from François Dumont fdumont at gcc dot gnu.org 2012-07-26 
12:31:56 UTC ---
Author: fdumont
Date: Thu Jul 26 12:31:50 2012
New Revision: 189889

URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=189889
Log:
2012-07-26  François Dumont  fdum...@gcc.gnu.org

PR libstdc++/54075
* include/bits/hashtable.h
(_Hashtable::_Hashtable(_InputIterator, _InputIterator,
size_type, ...): Remove std::max usage to guarantee that hashtable
state is consistent with hash policy state.
(_Hashtable::rehash): Likewise. Set _M_prev_resize to 0 to avoid
the hashtable shrinking on next insertion.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_map/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New.

Added:
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
Modified:
branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
branches/gcc-4_7-branch/libstdc++-v3/include/bits/hashtable.h


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-26 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #14 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-26 
17:36:28 UTC ---
In any case, please add the testcase showing 4.5s vs 1.5s.


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-26 Thread likan_999.student at sina dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #15 from likan_999.student at sina dot com 2012-07-26 22:10:21 UTC 
---
Tried the patch and just as Dennis Lubert pointed out, the number of rehashes
is not decreased.  Is there any plan to fix this issue?


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-26 Thread chip at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #16 from Chip Salzenberg chip at pobox dot com 2012-07-26 
22:50:17 UTC ---
In my tests, with this patch, 4.7.1 is about 10% slower than 4.6 ... a vast
improvement but certainly not parity.

./bench46  1.75s user 0.82s system 99% cpu 2.577 total
./bench47  8.01s user 2.78s system 99% cpu 10.800 total
./bench47+patch  1.95s user 0.80s system 99% cpu 2.764 total


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-26 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #17 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-26 
22:55:15 UTC ---
Because of more rehashing, unrelated to reserve, I suppose?


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-25 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #10 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-25 
09:56:15 UTC ---
A patch is available here:

  http://gcc.gnu.org/ml/libstdc++/2012-07/msg00051.html

Submitter and interested people can give it a try before it goes in.


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-25 Thread fdumont at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #11 from François Dumont fdumont at gcc dot gnu.org 2012-07-25 
19:32:53 UTC ---
Author: fdumont
Date: Wed Jul 25 19:32:48 2012
New Revision: 189863

URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=189863
Log:
2012-07-25  François Dumont  fdum...@gcc.gnu.org

PR libstdc++/54075
* include/bits/hashtable.h
(_Hashtable::_Hashtable(_InputIterator, _InputIterator,
size_type, ...): Remove std::max usage to guarantee that hashtable
state is consistent with hash policy state.
(_Hashtable::rehash): Likewise. Set _M_prev_resize to 0 to avoid
the hashtable to be shrinking on next insertion.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_map/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New.

Added:
   
trunk/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc
   
trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc
   
trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc
   
trunk/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/hashtable.h


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-24 Thread plasmahh at gmx dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

Dennis Lubert plasmahh at gmx dot net changed:

   What|Removed |Added

 CC||plasmahh at gmx dot net

--- Comment #8 from Dennis Lubert plasmahh at gmx dot net 2012-07-24 10:41:54 
UTC ---
I just stumbled over this bug while searching for something related to the max
load factor (it seems that since 4.7 the hashtable also shrinks when you set
the load factor higher, which is at least for me unfortunate).

I did just change the testcase to count the number of rehashes (by checking
bucket count before and after insert) and found:

gcc4.6 without reserve: 21
gcc4.6 with reserve: 1

gcc4.7 without reserve: 155
gcc4.7 with reserve: 160

Then in callgrind I had a look and most time seems to be spend in rehashing. So
I would assume that when the 4.7 version gets the number of rehashing down to
where it was, then we also get the speed down to where it was.

I would say that the reserve behaviour though is definetly broken, as it even
increases the amount of rehashings. I personally would just not have the
hashtable ever shrink on insert and/or load factor setting, just only ever on
explicit rehash() calls...


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-24 Thread fdumont at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #9 from François Dumont fdumont at gcc dot gnu.org 2012-07-24 
20:15:10 UTC ---
I confirm that the reserve method is broken. I had correctly handle the size
hint that can be given through the hashtable constructor, I set
_M_rehash_policy._M_prev_resize to 0 just to avoid the container to be shrink
until it reaches the targetted size. The same thing should be done in the
rehash method that is called from reserve one. I will submit a patch soon.

Regarding the shrink on insert calls, well, it can be easily avoided by calling
(normally) reserve so I considered that it is better to offer a container with
a symetric behavior.

I will also I think reconsider the grow factor. During one of my propositions
of hashtable redesign the number of empty buckets was a problem for
performance. With the current implementation it is not a problem anymore so we
could grow a little bit faster.


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-24 Thread fdumont at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

François Dumont fdumont at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot   |fdumont at gcc dot gnu.org
   |gnu.org |


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread likan_999.student at sina dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #1 from likan_999.student at sina dot com 2012-07-23 23:08:07 UTC 
---
Created attachment 27861
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27861
Profiling using google-perftools


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread likan_999.student at sina dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #2 from likan_999.student at sina dot com 2012-07-23 23:09:43 UTC 
---
Created attachment 27862
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27862
Profiling of gcc-4.6.2 using google-perftools


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

Paolo Carlini paolo.carlini at oracle dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-07-23
 CC||fdumont at gcc dot gnu.org
 Ever Confirmed|0   |1
   Severity|critical|normal

--- Comment #3 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-23 
23:12:15 UTC ---
Francois, can you have a look? Thanks!


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #4 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-23 
23:23:27 UTC ---
I wonder, anyway, if the apparent slow down is just an artifact caused by a
different handling of the load factor in the reworked unordered containers
which we have been delivering since 4.7.0. I would suggest submitter to
experiment a bit with max_load_factor, and see if when 4.6.x seems faster at
insertion time actually the load factor is higher (too searching would be
slower).


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread likan_999.student at sina dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #5 from likan_999.student at sina dot com 2012-07-24 00:17:10 UTC 
---
@Paolo Carlini: can you talk more about how to experiment with max_load_factor?
 As long as I use the same max_load_factor for 4.6 and 4.7, I can still see the
performance difference (3x slower is the best result):

max_load_factor  gcc-4.6.2 gcc-4.7.1
0.2  1.600s  7.650s
0.5  1.125s  4.251s
1.0  1.004s  3.378s
2.0  0.914s  20.471s
5.0  0.917s  29.132s


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #6 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-24 
00:29:38 UTC ---
In some cases 4.6.x was handling max_load_factor incorrectly. Thus, the idea
isn't comparing 4.6.x to 4.7.x with the same max_load_factor (I don't think
it's useful), instead, actually measure load_factor(s), see if for some values
of max_load_factor, the actual load_factor(s) are different in 4.6.x vs 4.7.x.
In any case, measure the actual load_factor while the map grows.


[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2

2012-07-23 Thread likan_999.student at sina dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075

--- Comment #7 from likan_999.student at sina dot com 2012-07-24 00:42:57 UTC 
---
@Paolo Carlini: the problem is, with different max_load_factor in range
[0.2-5], the *best* result of 4.7.1 is still 2x slower than the *worst* of
4.6.2.  

I printed out the average load factor during the insertion.  Looks like 4.7.1's
load factor is very close to the max_load_factor, and the average for 4.6.2 is
about 1/4 of the max_load_factor.  But what conclusion can we get from this
result?