Ping ?
On 9/9/19 8:34 PM, François Dumont wrote:
Hi
This patch improves stl_algobase.h
copy/copy_backward/fill/fill_n/equal implementations. The improvements
are:
- activation of algo specialization for __gnu_debug::_Safe_iterator
(w/o _GLIBCXX_DEBUG mode)
- activation of algo specialization for _Deque_iterator even if mixed
with another kind of iterator.
- activation of algo specializations __copy_move_a2 for something else
than pointers. For example this code:
std::vector<char> v { 'a', 'b', .... };
ostreambuf_iterator out(std::cout);
std::copy(v.begin(), v.end(), out);
is not calling the specialization __copy_move_a2(const char*, const
char*, ostreambuf_iterator<>);
It also fix a _GLIBCXX_DEBUG issue where the __niter_base
specialization was wrongly removing the _Safe_iterator<> layer. The
testsuite/25_algorithms/copy/debug/1_neg.cc test case was failing on a
debug assertion because _after_ the copy we were trying to increment
the vector iterator after past-the-end. Of course the problem is the
_after_, Debug mode should detect this _before_ it takes place which
it does now.
Note that std::fill_n is now making use of std::fill for some
optimizations dealing with random access iterators.
Performances are very good:
Before:
copy_backward_deque_iterators.cc deque 2 deque 1084r 1084u
0s 0mem 0pf
copy_backward_deque_iterators.cc deque 2 vector 3373r 3372u
0s 0mem 0pf
copy_backward_deque_iterators.cc vector 2 deque 3316r 3316u
0s 0mem 0pf
copy_backward_deque_iterators.cc int deque 2 char vector 3610r
3609u 0s 0mem 0pf
copy_backward_deque_iterators.cc char vector 2 int deque 3552r
3552u 0s 0mem 0pf
copy_backward_deque_iterators.cc deque 2 list 10528r 10528u
0s 0mem 0pf
copy_backward_deque_iterators.cc list 2 deque 2161r 2162u
0s 0mem 0pf
copy_deque_iterators.cc deque 2 deque 752r
751u 0s 0mem 0pf
copy_deque_iterators.cc deque 2 vector 3300r
3299u 0s 0mem 0pf
copy_deque_iterators.cc vector 2 deque 3144r
3140u 0s 0mem 0pf
copy_deque_iterators.cc int deque 2 char vector 3340r
3338u 1s 0mem 0pf
copy_deque_iterators.cc char vector 2 int deque 3132r
3132u 0s 0mem 0pf
copy_deque_iterators.cc deque 2 list 10013r
10012u 0s 0mem 0pf
copy_deque_iterators.cc list 2 deque 2274r
2275u 0s 0mem 0pf
equal_deque_iterators.cc deque vs deque 8676r
8675u 0s 0mem 0pf
equal_deque_iterators.cc deque vs vector 5870r
5870u 0s 0mem 0pf
equal_deque_iterators.cc vector vs deque 3163r
3163u 0s 0mem 0pf
equal_deque_iterators.cc int deque vs char vector 5845r
5845u 0s 0mem 0pf
equal_deque_iterators.cc char vector vs int deque 3307r
3307u 0s 0mem 0pf
After:
copy_backward_deque_iterators.cc deque 2 deque 697r 697u
0s 0mem 0pf
copy_backward_deque_iterators.cc deque 2 vector 219r 218u
0s 0mem 0pf
copy_backward_deque_iterators.cc vector 2 deque 453r 453u
0s 0mem 0pf
copy_backward_deque_iterators.cc int deque 2 char vector 1914r
1915u 0s 0mem 0pf
copy_backward_deque_iterators.cc char vector 2 int deque 2112r
2111u 0s 0mem 0pf
copy_backward_deque_iterators.cc deque 2 list 7770r 7771u
0s 0mem 0pf
copy_backward_deque_iterators.cc list 2 deque 2194r 2193u
0s 0mem 0pf
copy_deque_iterators.cc deque 2 deque 505r
504u 0s 0mem 0pf
copy_deque_iterators.cc deque 2 vector 221r
221u 0s 0mem 0pf
copy_deque_iterators.cc vector 2 deque 398r
397u 0s 0mem 0pf
copy_deque_iterators.cc int deque 2 char vector 1770r
1767u 0s 0mem 0pf
copy_deque_iterators.cc char vector 2 int deque 1995r
1993u 0s 0mem 0pf
copy_deque_iterators.cc deque 2 list 7650r
7641u 2s 0mem 0pf
copy_deque_iterators.cc list 2 deque 2270r
2270u 0s 0mem 0pf
equal_deque_iterators.cc deque vs deque 769r
768u 0s 0mem 0pf
equal_deque_iterators.cc deque vs vector 231r
230u 0s 0mem 0pf
equal_deque_iterators.cc vector vs deque 397r
397u 0s 0mem 0pf
equal_deque_iterators.cc int deque vs char vector 1541r
1541u 0s 0mem 0pf
equal_deque_iterators.cc char vector vs int deque 1623r
1623u 0s 0mem 0pf
In Debug Mode it is of course even better. I haven't had the patience
to run the benches before the patch, it just takes hours to run. So
here is just the After part:
copy_backward_deque_iterators.cc deque 2 deque 1128r 1128u
0s 0mem 0pf
copy_backward_deque_iterators.cc deque 2 vector 616r 616u
0s 0mem 0pf
copy_backward_deque_iterators.cc vector 2 deque 856r 855u
0s 0mem 0pf
copy_backward_deque_iterators.cc int deque 2 char vector 2277r
2277u 0s 0mem 0pf
copy_backward_deque_iterators.cc char vector 2 int deque 2518r
2519u 0s 0mem 0pf
copy_backward_deque_iterators.cc deque 2 list 8029r 8028u
0s 0mem 0pf
copy_backward_deque_iterators.cc list 2 deque 10418r 10416u
0s 0mem 0pf
copy_deque_iterators.cc deque 2 deque 931r
930u 0s 0mem 0pf
copy_deque_iterators.cc deque 2 vector 613r
613u 0s 0mem 0pf
copy_deque_iterators.cc vector 2 deque 794r
795u 0s 0mem 0pf
copy_deque_iterators.cc int deque 2 char vector 2192r
2192u 0s 0mem 0pf
copy_deque_iterators.cc char vector 2 int deque 2365r
2364u 0s 0mem 0pf
copy_deque_iterators.cc deque 2 list 8009r
8010u 0s 0mem 0pf
copy_deque_iterators.cc list 2 deque 10979r
10970u 1s 0mem 0pf
equal_deque_iterators.cc deque vs deque 1034r
1032u 0s 0mem 0pf
equal_deque_iterators.cc deque vs vector 481r
482u 0s 0mem 0pf
equal_deque_iterators.cc vector vs deque 646r
646u 0s 0mem 0pf
equal_deque_iterators.cc int deque vs char vector 1802r
1802u 0s 0mem 0pf
equal_deque_iterators.cc char vector vs int deque 1867r
1865u 0s 0mem 0pf
Note that copy/copy_backward 'list 2 deque' is still much slower
because for the moment we can't remove the debug layer. I plan to do a
refinement to also cover this use case.
In Debug implementations I have duplicated the Debug checks because
those algo specialization will also be used when users are directly
using <debug/vector> for instance without defining _GLIBCXX_DEBUG.
All algos tests passed except the constexpr ones in Debug mode, I've
put this on my todo list.
Ok to commit once I completed all the other tests ?
François