A small regression noticed while merging.

We shouldn't keep on using a moved-from key_type instance.

Ok to commit ? Feel free to do it if you prefer, I'll do so at end of Europe day otherwise.

François


On 6/17/19 12:28 PM, Jonathan Wakely wrote:
On 07/06/19 18:39 +0200, François Dumont wrote:
On 6/5/19 6:22 PM, Jonathan Wakely wrote:
On 04/06/19 19:19 +0200, François Dumont wrote:
Hi

    Here is a patch to enhance the _Hashtable extract node API and fix a FIXME request.

    The enhancement to the extract node Api is that extract(const key_type&) do not call extract(const_iterator) anymore. Doing so we had to loop again through bucket nodes to find the previous node to the one to extract. Even if a bucket shall not contain many nodes (in unique key mode) it's easy to avoid it.

Nice.

    To fix the FIXME I introduced a node smart pointer type managing the node lifetime. The node is extracted from this smart pointer only when there can't be any exception raised. In the context of the node extract api the node handle is considered as a smart pointer. So the node handle will remain owner of the node in case of exception when reinserting it, I hope it is the expected behavior.

Yes, that's right.

I was going to suggest just using the node handle type instead of
inventing a new smart pointer, but the handle type uses std::optional
so isn't available for C++11/14.

I considered it too, or even use shared_ptr but thought it was overkill.



    * include/bits/hashtable_policy.h
    (struct _NodeSmartPointer<_NodeAlloc>): New.
    (_Map_base<>::operator[](const key_type&)): Use latter, adapt.
    (_Map_base<>::operator[](key_type&&)): Likewise.
    * include/bits/hashtable.h
    (_Hashtable<>::__node_sp_t): New.
    (_Hashtable<>::_M_insert_unique_node(size_type, __hash_code,
    __node_type*, size_type)): Replace by...
(_Hashtable<>::_M_insert_unique_node<_NodeAccessor>(const key_type&,
    size_type, __hash_code, const _NodeAccessor&, size_type)): ...that.
    (_Hashtable<>::_M_insert_multi_node(__node_type*, __hash_code,
    __node_type*)): Replace by...
(_Hashtable<>::_M_insert_multi_node<_NodeAccessor>(__node_type*,
    __hash_code, const _NodeAccessor&)): ...that.
    (_Hashtable<>::_M_reinsert_node): Adapt.
    (_Hashtable<>::_M_reinsert_node_multi): Adapt.
    (_Hashtable<>::_M_extract_node(size_t, __node_base*)): New.
    (_Hashtable<>::extract(const_iterator)): Use latter.
    (_Hashtable<>::extract(const _Key&)): Likewise.
    (_Hashtable<>::_M_merge_unique): Adapt.
    (_Hashtable<>::_M_emplace<_Args>(true_type, _Args&&...)): Adapt.
(_Hashtable<>::_M_emplace<_Args>(const_iterator, false_type,
    _Args&&...)): Adapt.

Tested under Linux x86_64.

Ok to commit ?

François


diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index e2e3f016a35..307865b96bf 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -197,6 +197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      using __hash_cached = typename __traits_type::__hash_cached;
      using __node_type = __detail::_Hash_node<_Value, __hash_cached::value>;
      using __node_alloc_type = __alloc_rebind<_Alloc, __node_type>;
+      using __node_sp_t = __detail::_NodeSmartPointer<__node_alloc_type>;

      using __hashtable_alloc = __detail::_Hashtable_alloc<__node_alloc_type>;

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      __node_base*
      _M_get_previous_node(size_type __bkt, __node_base* __n);

-      // Insert node with hash code __code, in bucket bkt if no rehash (assumes -      // no element with its key already present). Take ownership of the node,
-      // deallocate it on exception.
+      // Insert node with key __k and hash code __code, in bucket __bkt if no
+      // rehash (assumes no element with its key already present).
+      template<typename _NodeAccessor>
    iterator
-      _M_insert_unique_node(size_type __bkt, __hash_code __code,
-                __node_type* __n, size_type __n_elt = 1);
+    _M_insert_unique_node(const key_type& __k, size_type __bkt,
+                  __hash_code __code, const _NodeAccessor&,
+                  size_type __n_elt = 1);

-      // Insert node with hash code __code. Take ownership of the node,
-      // deallocate it on exception.
+      // Insert node with hash code __code.
+      template<typename _NodeAccessor>
    iterator
-      _M_insert_multi_node(__node_type* __hint,
-               __hash_code __code, __node_type* __n);
+    _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+                 const _NodeAccessor& __node_accessor);

It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.


Maybe one the reasons we have a difference when running performance tests in -03. I don't know why I abused so much of lambdas when I see your proposal.



The attached patch still just passes in a node pointer (which the
function takes ownership of, unless it throws). Because the semantics
of _M_insert_multi_node change, we need the symbol name to change as
well (so old code linking to the old symbol doesn't find the new
definition with different semantics). Passing in the key makes it a
different symbol (in almost all cases the caller has the key available
already anyway).
Very nice.

An alternative design would be for _M_insert_(unique|multi)_node to
take a reference to the RAII type and set its _M_node member to null
after taking ownership of that pointer. That would be more explicit
about the ownership transfer, however it would require changes to the
_M_reinsert_node and _M_reinsert_node_multi functions which don't use
the RAII type (because the __node_type* is already owned by a node
handle).


      template<typename... _Args>
    std::pair<iterator, bool>
@@ -805,9 +807,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
          }
        else
          {
-        __ret.position
-          = _M_insert_unique_node(__bkt, __code, __nh._M_ptr);
-        __nh._M_ptr = nullptr;
+        __ret.position = _M_insert_unique_node(__k, __bkt, __code,
+                [&__nh]()
+                { return std::exchange(__nh._M_ptr, nullptr); });

The existing code here can just be changed to pass in __k, but still
set __nh._M_ptr to null after the call returns.

i.e.

        __ret.position
          = _M_insert_unique_node(__k, __bkt, __code, __nh._M_ptr);
        __nh._M_ptr = nullptr;


        __ret.inserted = true;
          }
      }
@@ -818,33 +820,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      iterator
      _M_reinsert_node_multi(const_iterator __hint, node_type&& __nh)
      {
-    iterator __ret;
    if (__nh.empty())
-      __ret = end();
-    else
-      {
+      return end();
+
    __glibcxx_assert(get_allocator() == __nh.get_allocator());

    auto __code = this->_M_hash_code(__nh._M_key());
-        auto __node = std::exchange(__nh._M_ptr, nullptr);
-        // FIXME: this deallocates the node on exception.
-        __ret = _M_insert_multi_node(__hint._M_cur, __code, __node);
-      }
-    return __ret;
+    return _M_insert_multi_node(__hint._M_cur, __code,
+                  [&__nh]()
+                  { return std::exchange(__nh._M_ptr, nullptr); });

Now that the call won't deallocate anything, we can fix the FIXME by
just delaying resetting __nh._M_ptr until the call has returned:

       auto __ret = _M_insert_multi_node(__hint._M_cur, __code, __node);
       __nh._M_ptr = nullptr;
       return __ret;


      }

+    private:
      /// Extract a node.

We don't want a Doxygen "///" comment on a private function, just "//"
is OK. In this case I don't think we need any comment, the
_M_extract_node name is clear enough. Saying "extract a node" adds
nothing.


      node_type
-      extract(const_iterator __pos)
+      _M_extract_node(size_t __bkt, __node_base* __prev_n)
      {
-    __node_type* __n = __pos._M_cur;
-    size_t __bkt = _M_bucket_index(__n);
-
-    // Look for previous node to unlink it from the erased one, this
-    // is why we need buckets to contain the before begin to make
-    // this search fast.
-    __node_base* __prev_n = _M_get_previous_node(__bkt, __n);
-
+    __node_type* __n = static_cast<__node_type*>(__prev_n->_M_nxt);
    if (__prev_n == _M_buckets[__bkt])
      _M_remove_bucket_begin(__bkt, __n->_M_next(),
         __n->_M_nxt ? _M_bucket_index(__n->_M_next()) : 0);
@@ -861,14 +853,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    return { __n, this->_M_node_allocator() };
      }

+    public:
+      /// Extract a node.
+      node_type
+      extract(const_iterator __pos)
+      {
+    size_t __bkt = _M_bucket_index(__pos._M_cur);
+    return _M_extract_node(__bkt, _M_get_previous_node(__bkt, __pos._M_cur));
+      }
+
      /// Extract a node.
      node_type
      extract(const _Key& __k)
      {
    node_type __nh;
-    auto __pos = find(__k);
-    if (__pos != end())
-      __nh = extract(const_iterator(__pos));
+    __hash_code __code = this->_M_hash_code(__k);
+    std::size_t __bkt = _M_bucket_index(__k, __code);
+    if (__node_base* __prev_node = _M_find_before_node(__bkt, __k, __code))
+      __nh = _M_extract_node(__bkt, __prev_node);
    return __nh;
      }

Looks good.

@@ -885,14 +887,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      for (auto __i = __src.begin(), __end = __src.end(); __i != __end;)
        {
          auto __pos = __i++;
-          const key_type& __k = this->_M_extract()(__pos._M_cur->_M_v());
+          const key_type& __k = this->_M_extract()(*__pos);
          __hash_code __code = this->_M_hash_code(__k);
          size_type __bkt = _M_bucket_index(__k, __code);
          if (_M_find_node(__bkt, __k, __code) == nullptr)
        {
          auto __nh = __src.extract(__pos);
-          _M_insert_unique_node(__bkt, __code, __nh._M_ptr, __n_elt);
-          __nh._M_ptr = nullptr;
+          _M_insert_unique_node(__k, __bkt, __code,
+                [&__nh]()
+                { return std::exchange(__nh._M_ptr, nullptr); },
+                __n_elt);

Again, the old code seems fine. Just pass in __k.

          __n_elt = 1;
        }
          else if (__n_elt != 1)
@@ -1634,31 +1638,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      -> pair<iterator, bool>
      {
    // First build the node to get access to the hash code
-    __node_type* __node
-      = this->_M_allocate_node(std::forward<_Args>(__args)...);
-    const key_type& __k = this->_M_extract()(__node->_M_v());
-    __hash_code __code;
-    __try
-      {
-        __code = this->_M_hash_code(__k);
-      }
-    __catch(...)
-      {
-        this->_M_deallocate_node(__node);

Aside: It's confusing that _M_allocate_node actually allocates a node
and constructs an element, and _M_deallocate_node destroys the element
and deallocates the node (and we have _M_deallocate_node_ptr which
just does the deallocation part). We should add comments explaining
that.


diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index 878154ae210..ddcff0ec9d0 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -170,6 +170,40 @@ namespace __detail
      __hashtable_alloc& _M_h;
    };

+  // Node smart pointer to benefit from RAII.
+  template<typename _NodeAlloc>
+    struct _NodeSmartPointer
+    {
+    private:
+      using __hashtable_alloc = _Hashtable_alloc<_NodeAlloc>;
+      using __node_type = typename __hashtable_alloc::__node_type;
+
+    public:
+      _NodeSmartPointer(__hashtable_alloc& __h, __node_type* __node) noexcept
+      : _M_h(__h), _M_node(__node) { }
+
+      ~_NodeSmartPointer()
+      {
+    if (_M_node)
+      _M_h._M_deallocate_node(_M_node);
+      }
+
+      __node_type*
+      _M_get() const { return _M_node; }
+
+      __node_type*
+      _M_release() noexcept
+      {
+    __node_type* __tmp = _M_node;
+    _M_node = nullptr;
+    return __tmp;
+      }
+
+    private:
+      __hashtable_alloc& _M_h;
+      __node_type* _M_node;
+    };

This can be defined as a member of _Hashtable, and can be much
simpler:

     // Simple RAII type for managing a node containing an element
     struct _Scoped_node
     {
    _Scoped_node(__hashtable_alloc* __h, __node_type* __n)
    : _M_h(__h), _M_node(__n) { }

    ~_Scoped_node() { if (_M_node) _M_h->_M_deallocate_node(_M_node); };

    _Scoped_node(const _Scoped_node&) = delete;
    _Scoped_node& operator=(const _Scoped_node&) = delete;

    __hashtable_alloc* _M_h;
    __node_type* _M_node;
     };

I've attached a patch implementing these suggested changes. I think
it's simpler this way, what do you think?

I think it is much better than what I proposed.



(This patch is generated with 'git diff -b' so whitespace changes
aren't shown, so this shouldn't be applied directly. I can send
another version with the whitespace changes that is suitable to
commit).

Mine was generated with 'git diff -w' for the same reason.

You can send the full patch to me if you prefer but feel free to commit it yourself.

Here's what I've committed. Compared to the previous patch this adds a
constructor to _Scoped_node that creates the new node, so the caller
doesn't have to create it:

      // Allocate a node and construct an element within it.
      template<typename... _Args>
        _Scoped_node(__hashtable_alloc* __h, _Args&&... __args)
        : _M_h(__h),
_M_node(__h->_M_allocate_node(std::forward<_Args>(__args)...))
        { }

Tested x86_64-linux, committed to trunk.



diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h
index f5809c7443a..7e89e1b44c4 100644
--- a/libstdc++-v3/include/bits/hashtable_policy.h
+++ b/libstdc++-v3/include/bits/hashtable_policy.h
@@ -743,7 +743,8 @@ namespace __detail
 	std::tuple<>()
       };
       auto __pos
-	= __h->_M_insert_unique_node(__k, __bkt, __code, __node._M_node);
+	= __h->_M_insert_unique_node(__h->_M_extract()(__node._M_node->_M_v()),
+				     __bkt, __code, __node._M_node);
       __node._M_node = nullptr;
       return __pos->second;
     }

Reply via email to