Re: [PATCH] libstdc++/91223 Improve unordered containers == operator

François Dumont Thu, 16 Jan 2020 13:39:04 -0800

On 1/16/20 5:01 PM, Jonathan Wakely wrote:

On 16/01/20 13:25 +0000, Jonathan Wakely wrote:

On 16/01/20 07:42 +0100, François Dumont wrote:
On 1/15/20 10:52 PM, Jonathan Wakely wrote:
On 15/01/20 21:48 +0000, Jonathan Wakely wrote:
On 14/01/20 22:25 +0100, François Dumont wrote:
On 1/13/20 10:53 PM, Jonathan Wakely wrote:
On 13/01/20 22:41 +0100, François Dumont wrote:
For the multi-keys we could still avoid redundant comparisonswhen _Equal is just doing == on the key type. Onunordered_multiset we could just avoids the call tois_permuation and on the unordered_multimap we could check theis_permutation only on the associated value rather than on thestd::pair.
I don't think that's necessary, or helpful.
The idea ofhttps://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
that you shouldn't be using _Equal at all, and therefore it doesn't
matter whether it's std::equal_to or not.
And it was indeed possible.
Nice!
    PR libstdc++/91223
* include/bits/hashtable.h (_Hashtable<>): Make _Equality<>friend.
    * include/bits/hashtable_policy.h: Include <bits/stl_algo.h>.
    (_Equality_base): Remove.
(_Equality<>::_M_equal): Review implementation. Usestd::is_permutation.
    * testsuite/23_containers/unordered_multiset/operators/1.cc
    (Hash, Equal, test02, test03): New.
    * testsuite/23_containers/unordered_set/operators/1.cc
    (Hash, Equal, test02, test03): New.

Tested under Linux x86_64.

Ok to commit ?
Yes, OK for trunk (we're in stage4 but your patch was posted instage3
and fixes a pretty nasty performance bug, so is OK now).
N.B. GCC has moved to Git instead of Subversion. If you don't haveGit
access set up let me know and I can commit the patch for you.
I haven't done the move yet and won't be able to do it before theweek-end. So please proceed to the commit for me, thanks.
No problem, I can do that.


Your patch is now committed to trunk. Thanks for the major
improvement.

I had a look at std::is_permutation and I think we can make some
simplifications to the 4-argument overload, and we can share most of
the code between the 3-arg and 4-arg overloads (once they've confirmed
the lengths are the same they do exactly the same thing). See the
attached patch. This should probably wait for stage1 though.

I also wanted to add some comments to the _Equality::_M_equal
specialiation for unordered_multimap/multiset to explain what the code
was doing, and had some more ideas. See patch again.

It looks like this loop can potentially visit every element of
__other, instead of stopping at the end of the bucket:

  typename __hashtable::const_iterator __ity(__y_n);
  for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
    if (--__x_count == 0)
      break;

Consider a case like this:

unordered_multiset<int> a{1, 2, 3, 4};
for (int i = 0; i <10000; ++i)
  a.insert(1);
unordered_multiset<int> b{1, 2, 3, 4};
for (int i = 0; i <10000; ++i)
  b.insert(2);

When doing a == b we'll find 10000 elements in a with key '1',
and find one element in b with that key. But then we iterate through
every element in b after that one, even though they have different
keys and are probably in different buckets.

Instead of just iterating from __ity to __other.end(), can we use a
local iterator so we stop at the end of the bucket?

This seems to make the PR91263 example *very* slightly slower, but
makes the example above significantly faster.

What do you think?

The hashtable implementation is doing its best to provide goodperformances as long as the user does its part of the job. Mainlyprovide a good hash to distrubute elements smoothly throughout thebuckets. But also avoid this kind of unordered_multiset.

If you check if you don't move out of bucket you'll have to pay for thebucket computation (subject of PR 68303) or perform a redundant _Equalto check when we left the range of equivalent elements like it used tobe done. Current implementation leave it to the std::is_permutation todo that which in normal situation will be better I think.

Re: [PATCH] libstdc++/91223 Improve unordered containers == operator

Reply via email to