[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #12 from Dennis Lubert plasmahh at gmx dot net 2012-07-26 12:30:00 UTC --- I can confirm that now the reserve works as I would expect it (causing no further rehashes). However the amount of rehashes done in the testcase is still 155 (needing 4.5s), while gcc 4.6 only needs 21 (needing 1.5s). Monitoring the bucket counts on resizes it seems that gcc 4.8 is now much more fine grained than gcc 4.6. I am unsure if this is expected, deliberate and/or a good idea.
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #13 from François Dumont fdumont at gcc dot gnu.org 2012-07-26 12:31:56 UTC --- Author: fdumont Date: Thu Jul 26 12:31:50 2012 New Revision: 189889 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=189889 Log: 2012-07-26 François Dumont fdum...@gcc.gnu.org PR libstdc++/54075 * include/bits/hashtable.h (_Hashtable::_Hashtable(_InputIterator, _InputIterator, size_type, ...): Remove std::max usage to guarantee that hashtable state is consistent with hash policy state. (_Hashtable::rehash): Likewise. Set _M_prev_resize to 0 to avoid the hashtable shrinking on next insertion. * testsuite/23_containers/unordered_set/modifiers/reserve.cc: New. * testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New. * testsuite/23_containers/unordered_map/modifiers/reserve.cc: New. * testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New. Added: branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc Modified: branches/gcc-4_7-branch/libstdc++-v3/ChangeLog branches/gcc-4_7-branch/libstdc++-v3/include/bits/hashtable.h
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #14 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-26 17:36:28 UTC --- In any case, please add the testcase showing 4.5s vs 1.5s.
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #15 from likan_999.student at sina dot com 2012-07-26 22:10:21 UTC --- Tried the patch and just as Dennis Lubert pointed out, the number of rehashes is not decreased. Is there any plan to fix this issue?
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #16 from Chip Salzenberg chip at pobox dot com 2012-07-26 22:50:17 UTC --- In my tests, with this patch, 4.7.1 is about 10% slower than 4.6 ... a vast improvement but certainly not parity. ./bench46 1.75s user 0.82s system 99% cpu 2.577 total ./bench47 8.01s user 2.78s system 99% cpu 10.800 total ./bench47+patch 1.95s user 0.80s system 99% cpu 2.764 total
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #17 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-26 22:55:15 UTC --- Because of more rehashing, unrelated to reserve, I suppose?
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #10 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-25 09:56:15 UTC --- A patch is available here: http://gcc.gnu.org/ml/libstdc++/2012-07/msg00051.html Submitter and interested people can give it a try before it goes in.
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #11 from François Dumont fdumont at gcc dot gnu.org 2012-07-25 19:32:53 UTC --- Author: fdumont Date: Wed Jul 25 19:32:48 2012 New Revision: 189863 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=189863 Log: 2012-07-25 François Dumont fdum...@gcc.gnu.org PR libstdc++/54075 * include/bits/hashtable.h (_Hashtable::_Hashtable(_InputIterator, _InputIterator, size_type, ...): Remove std::max usage to guarantee that hashtable state is consistent with hash policy state. (_Hashtable::rehash): Likewise. Set _M_prev_resize to 0 to avoid the hashtable to be shrinking on next insertion. * testsuite/23_containers/unordered_set/modifiers/reserve.cc: New. * testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New. * testsuite/23_containers/unordered_map/modifiers/reserve.cc: New. * testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New. Added: trunk/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc trunk/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/include/bits/hashtable.h
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 Dennis Lubert plasmahh at gmx dot net changed: What|Removed |Added CC||plasmahh at gmx dot net --- Comment #8 from Dennis Lubert plasmahh at gmx dot net 2012-07-24 10:41:54 UTC --- I just stumbled over this bug while searching for something related to the max load factor (it seems that since 4.7 the hashtable also shrinks when you set the load factor higher, which is at least for me unfortunate). I did just change the testcase to count the number of rehashes (by checking bucket count before and after insert) and found: gcc4.6 without reserve: 21 gcc4.6 with reserve: 1 gcc4.7 without reserve: 155 gcc4.7 with reserve: 160 Then in callgrind I had a look and most time seems to be spend in rehashing. So I would assume that when the 4.7 version gets the number of rehashing down to where it was, then we also get the speed down to where it was. I would say that the reserve behaviour though is definetly broken, as it even increases the amount of rehashings. I personally would just not have the hashtable ever shrink on insert and/or load factor setting, just only ever on explicit rehash() calls...
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #9 from François Dumont fdumont at gcc dot gnu.org 2012-07-24 20:15:10 UTC --- I confirm that the reserve method is broken. I had correctly handle the size hint that can be given through the hashtable constructor, I set _M_rehash_policy._M_prev_resize to 0 just to avoid the container to be shrink until it reaches the targetted size. The same thing should be done in the rehash method that is called from reserve one. I will submit a patch soon. Regarding the shrink on insert calls, well, it can be easily avoided by calling (normally) reserve so I considered that it is better to offer a container with a symetric behavior. I will also I think reconsider the grow factor. During one of my propositions of hashtable redesign the number of empty buckets was a problem for performance. With the current implementation it is not a problem anymore so we could grow a little bit faster.
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 François Dumont fdumont at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |fdumont at gcc dot gnu.org |gnu.org |
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #1 from likan_999.student at sina dot com 2012-07-23 23:08:07 UTC --- Created attachment 27861 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27861 Profiling using google-perftools
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #2 from likan_999.student at sina dot com 2012-07-23 23:09:43 UTC --- Created attachment 27862 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27862 Profiling of gcc-4.6.2 using google-perftools
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 Paolo Carlini paolo.carlini at oracle dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2012-07-23 CC||fdumont at gcc dot gnu.org Ever Confirmed|0 |1 Severity|critical|normal --- Comment #3 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-23 23:12:15 UTC --- Francois, can you have a look? Thanks!
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #4 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-23 23:23:27 UTC --- I wonder, anyway, if the apparent slow down is just an artifact caused by a different handling of the load factor in the reworked unordered containers which we have been delivering since 4.7.0. I would suggest submitter to experiment a bit with max_load_factor, and see if when 4.6.x seems faster at insertion time actually the load factor is higher (too searching would be slower).
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #5 from likan_999.student at sina dot com 2012-07-24 00:17:10 UTC --- @Paolo Carlini: can you talk more about how to experiment with max_load_factor? As long as I use the same max_load_factor for 4.6 and 4.7, I can still see the performance difference (3x slower is the best result): max_load_factor gcc-4.6.2 gcc-4.7.1 0.2 1.600s 7.650s 0.5 1.125s 4.251s 1.0 1.004s 3.378s 2.0 0.914s 20.471s 5.0 0.917s 29.132s
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #6 from Paolo Carlini paolo.carlini at oracle dot com 2012-07-24 00:29:38 UTC --- In some cases 4.6.x was handling max_load_factor incorrectly. Thus, the idea isn't comparing 4.6.x to 4.7.x with the same max_load_factor (I don't think it's useful), instead, actually measure load_factor(s), see if for some values of max_load_factor, the actual load_factor(s) are different in 4.6.x vs 4.7.x. In any case, measure the actual load_factor while the map grows.
[Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075 --- Comment #7 from likan_999.student at sina dot com 2012-07-24 00:42:57 UTC --- @Paolo Carlini: the problem is, with different max_load_factor in range [0.2-5], the *best* result of 4.7.1 is still 2x slower than the *worst* of 4.6.2. I printed out the average load factor during the insertion. Looks like 4.7.1's load factor is very close to the max_load_factor, and the average for 4.6.2 is about 1/4 of the max_load_factor. But what conclusion can we get from this result?