On Thu, Jan 8, 2026 at 10:38 AM Tomasz Kaminski <[email protected]> wrote:
> I have performed some microbenchmarking, on following example: > * bench_empty: input is empty istreamstream, creates iterator and compares > against end > * bench_stream_create: only resetting stringstream with new content > * bench_preinc: for loop over using pre increment > * bench_postinc: for loop over using post increment > * bench_deref_posting: while with *it++ > > This are the result I have observed, for the changes with the patch: > -------------------------------------------------------------- > Benchmark Time CPU Iterations > -------------------------------------------------------------- > bench_empty 2.10 ns 2.09 ns 337921434 > bench_stream_create 10.6 ns 10.6 ns 65736840 > bench_preinc 419 ns 419 ns 1662732 > bench_postinc 436 ns 435 ns 1609828 > bench_deref_postinc 306 ns 305 ns 22866 > > And the result before the patch: > -------------------------------------------------------------- > Benchmark Time CPU Iterations > -------------------------------------------------------------- > bench_empty 2.27 ns 2.26 ns 314944156 > bench_stream_create 11.7 ns 11.6 ns 60679957 > bench_preinc 873 ns 871 ns 807352 > bench_postinc 920 ns 917 ns 751565 > bench_deref_postinc 706 ns 704 ns 1038942 > > For sanity, I have also check the result on g++ 15.2 that ships with my > system: > -------------------------------------------------------------- > Benchmark Time CPU Iterations > -------------------------------------------------------------- > bench_empty 2.30 ns 2.30 ns 303070664 > bench_stream_create 10.6 ns 10.6 ns 65778662 > bench_preinc 767 ns 765 ns 915085 > bench_postinc 767 ns 764 ns 913985 > bench_deref_postinc 675 ns 673 ns 1107772 > > So it looks like eliminating save to member from both equal and operator* > makes implementation > two times faster. I will look into that. > > Attaching also my benchmark file. > > Regards, > Tomasz > > On Wed, Jan 7, 2026 at 3:41 PM Tomasz Kamiński <[email protected]> > wrote: > >> The warning was produced by following seqeunce, given an >> istream_iterator<char> >> it, such that *it will result in hitting EoF in it->_M_get(), and thus >> clearing >> _M_sbuf, the subsequent call to ++it, will result in _M_sbuf->sbumpc() >> call >> on null pointer dereference. This is however an false-positive, as in >> such sitution >> it == istream_iteator() returns true, and the iterator should not be >> dereferenced >> in first place. >> >> This patch addresses the above by clearing the _M_sbuf in operator++, >> instead of >> _M_get(). This removes the need for making _M_sbuf mutable, and thus make >> the >> implementation conforming with regards to C++11 [res.on.data.races] p3. >> >> This change should not or positive have performance impact on the usual >> iteration >> patterns, in form: >> while (it != end) { process(*it); ++it; } >> In case when it is end-of-stream iterator, the it != end returns in one >> call of >> _M_sbuf->sgetc() both before and after the change. However we do not >> modify _M_sbuf >> in this case. For non-empty range, we replace call to _M_sbuf->sbumpc() >> with >> _M_sbuf->snextc() in pre-increment, and extract the check again from *it >> to >> ++it. However, as _M_sbuf is now cleared during increment, so last it != >> end >> check avoids _M_sbuf->sgetc() call to check against EoF. >> >> However, this change impact the behavior of the post-increment (*it++), as >> we now load both current character (for return value) and next character >> (to >> check against EoF). In consequence we call both sgetc() and snextc(), in >> contrast >> to previous single sbumpc() call. >> >> PR libstdc++/105580 >> >> libstdc++-v3/ChangeLog: >> >> * include/bits/streambuf_iterator.h >> (istreambuf_iterator::_M_sbuf): >> Remove mutable and adjust whitespace. >> (istreambuf_iterator::_M_c): Adjust whitespace. >> (istreambuf_iterator::operator++()): Clear _M_sbuf if next >> character >> is EoF. >> (istreambuf_iterator::operator++(int)): Use _M_sbuf->sgetc() to >> load current character, and define in terms of __*this. >> (istreambuf_iterator::_M_get()): Do not clear _M_sbuf in case of >> EoF. >> * testsuite/24_iterators/istreambuf_iterator/2.cc: Test for using >> multiple iterators to same rdbuf. >> --- >> This warning turned out to be false-positive (as explained in the commit >> decription), so we can just add add pragrams to address that. But this >> change seem to be beneficial anyway, as we are no longer causing >> data-races. >> >> Tested on x86_64-linux. Locally tested with -Wnull-dereference, >> and the istream_iterator warning in 24_iterators/istreambuf_iterator/2.cc >> disappears (there are some other warning present I haven't looked into >> that yet). >> >> .../include/bits/streambuf_iterator.h | 20 +++---- >> .../24_iterators/istreambuf_iterator/2.cc | 58 +++++++++++++++++++ >> 2 files changed, 66 insertions(+), 12 deletions(-) >> >> diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h >> b/libstdc++-v3/include/bits/streambuf_iterator.h >> index 93d3dd24f93..095928ca4d8 100644 >> --- a/libstdc++-v3/include/bits/streambuf_iterator.h >> +++ b/libstdc++-v3/include/bits/streambuf_iterator.h >> @@ -112,8 +112,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> // the "end of stream" iterator value. >> // NB: This implementation assumes the "end of stream" value >> // is EOF, or -1. >> - mutable streambuf_type* _M_sbuf; >> - int_type _M_c; >> + streambuf_type* _M_sbuf; >> + int_type _M_c; >> >> public: >> /// Construct end of input stream iterator. >> @@ -172,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> >> _M_message(__gnu_debug::__msg_inc_istreambuf) >> ._M_iterator(*this)); >> >> - _M_sbuf->sbumpc(); >> + if (_S_is_eof(_M_sbuf->snextc())) >> + _M_sbuf = 0; >> _M_c = traits_type::eof(); >> return *this; >> } >> @@ -181,14 +182,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> istreambuf_iterator >> operator++(int) >> { >> - __glibcxx_requires_cond(_M_sbuf && >> - (!_S_is_eof(_M_c) || >> !_S_is_eof(_M_sbuf->sgetc())), >> - >> _M_message(__gnu_debug::__msg_inc_istreambuf) >> - ._M_iterator(*this)); >> - >> istreambuf_iterator __old = *this; >> - __old._M_c = _M_sbuf->sbumpc(); >> - _M_c = traits_type::eof(); >> + __old._M_c = _M_sbuf->sgetc(); >> + ++*this; >> return __old; >> } >> >> @@ -206,8 +202,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> _M_get() const >> { >> int_type __ret = _M_c; >> - if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret = >> _M_sbuf->sgetc())) >> - _M_sbuf = 0; >> + if (_M_sbuf && _S_is_eof(__ret)) >> + __ret = _M_sbuf->sgetc(); >> > The benefit seem to be result of eliminating the if condition here, as we are now having one less eof check per loop, previously we got: O: 4: 2x2 in _M_get per operator* and equal N: 3: 2x1 in _M_get per operator* and equal, and one in operator++ > return __ret; >> } >> >> diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >> b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >> index 78701d71cee..0318a677f35 100644 >> --- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >> +++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >> @@ -109,8 +109,66 @@ void test02(void) >> } >> } >> >> +void >> +test_empty() >> +{ >> + std::istreambuf_iterator<char> null(0), end; >> + VERIFY( null == end ); >> + >> + std::istringstream ess; >> + // Thie specification hear seem to indicate, that such iterators >> + // are not end-of-stream iterators, as rdbuf pointer is nullptr, >> + // however we treat them as such, as otherwise we would produce >> + // a range containing single eof character. >> + std::istreambuf_iterator<char> it1(ess), it2(ess); >> + VERIFY( it1 == end ); >> + VERIFY( it2 == end ); >> +} >> + >> +void >> +test_multi() >> +{ >> + // C++98 (and later) define operator* in [istreambuf.iterator.ops] as: >> + // Returns: The character obtained via the streambuf member >> sbuf_->sgetc(). >> + // This nails down behavior of mulptiple iterators to same sequence. >> + std::istringstream iss("abcd"); >> + std::istreambuf_iterator<char> it1(iss), it2(iss), end; >> + >> + VERIFY( it1 != end ); >> + VERIFY( it2 != end ); >> + VERIFY( *it1 == 'a' ); >> + VERIFY( *it2 == 'a' ); >> + ++it1; >> + >> + VERIFY( it1 != end ); >> + VERIFY( it2 != end ); >> + VERIFY( *it1 == 'b' ); >> + VERIFY( *it2 == 'b' ); >> + ++it2; >> + >> + VERIFY( it1 != end ); >> + VERIFY( it2 != end ); >> + VERIFY( *it1 == 'c' ); >> + VERIFY( *it2 == 'c' ); >> + ++it2; >> + >> + VERIFY( it1 != end ); >> + VERIFY( it2 != end ); >> + VERIFY( *it1 == 'd' ); >> + VERIFY( *it2 == 'd' ); >> + // second dereference >> + VERIFY( *it1 == 'd' ); >> + VERIFY( *it2 == 'd' ); >> + ++it1; >> + >> + VERIFY( it1 == end ); >> + VERIFY( it2 == end ); >> +} >> + >> int main() >> { >> test02(); >> + test_empty(); >> + test_multi(); >> return 0; >> } >> -- >> 2.52.0 >> >>
