On Thu, Jan 8, 2026 at 10:38 AM Tomasz Kaminski <[email protected]> wrote:

> I have performed some microbenchmarking, on following example:
> * bench_empty: input is empty istreamstream, creates iterator and compares
> against end
> * bench_stream_create: only resetting stringstream with new content
> * bench_preinc: for loop over using pre increment
> * bench_postinc: for loop over using post increment
> * bench_deref_posting: while with *it++
>
> This are the result I have observed, for the changes with the patch:
> --------------------------------------------------------------
> Benchmark                    Time             CPU   Iterations
> --------------------------------------------------------------
> bench_empty               2.10 ns         2.09 ns    337921434
> bench_stream_create       10.6 ns         10.6 ns     65736840
> bench_preinc               419 ns          419 ns      1662732
> bench_postinc              436 ns          435 ns      1609828
> bench_deref_postinc        306 ns          305 ns      22866
>
> And the result before the patch:
> --------------------------------------------------------------
> Benchmark                    Time             CPU   Iterations
> --------------------------------------------------------------
> bench_empty               2.27 ns         2.26 ns    314944156
> bench_stream_create       11.7 ns         11.6 ns     60679957
> bench_preinc               873 ns          871 ns       807352
> bench_postinc              920 ns          917 ns       751565
> bench_deref_postinc        706 ns          704 ns      1038942
>
> For sanity, I have also check the result on g++ 15.2 that ships with my
> system:
> --------------------------------------------------------------
> Benchmark                    Time             CPU   Iterations
> --------------------------------------------------------------
> bench_empty               2.30 ns         2.30 ns    303070664
> bench_stream_create       10.6 ns         10.6 ns     65778662
> bench_preinc               767 ns          765 ns       915085
> bench_postinc              767 ns          764 ns       913985
> bench_deref_postinc        675 ns          673 ns      1107772
>
> So it looks like eliminating save to member from both equal and operator*
> makes implementation
> two times faster. I will look into that.
>
> Attaching also my benchmark file.
>
> Regards,
> Tomasz
>
> On Wed, Jan 7, 2026 at 3:41 PM Tomasz Kamiński <[email protected]>
> wrote:
>
>> The warning was produced by following seqeunce, given an
>> istream_iterator<char>
>> it, such that *it will result in hitting EoF in it->_M_get(), and thus
>> clearing
>> _M_sbuf, the subsequent call to ++it, will result in _M_sbuf->sbumpc()
>> call
>> on null pointer dereference. This is however an false-positive, as in
>> such sitution
>> it == istream_iteator() returns true, and the iterator should not be
>> dereferenced
>> in first place.
>>
>> This patch addresses the above by clearing the _M_sbuf in operator++,
>> instead of
>> _M_get(). This removes the need for making _M_sbuf mutable, and thus make
>> the
>> implementation conforming with regards to C++11 [res.on.data.races] p3.
>>
>> This change should not or positive have performance impact on the usual
>> iteration
>> patterns, in form:
>>  while (it != end) { process(*it); ++it; }
>> In case when it is end-of-stream iterator, the it != end returns in one
>> call of
>> _M_sbuf->sgetc() both before and after the change. However we do not
>> modify _M_sbuf
>> in this case. For non-empty range, we replace call to _M_sbuf->sbumpc()
>> with
>> _M_sbuf->snextc() in pre-increment, and extract the check again from *it
>> to
>> ++it. However, as _M_sbuf is now cleared during increment, so last it !=
>> end
>> check avoids _M_sbuf->sgetc() call to check against EoF.
>>
>> However, this change impact the behavior of the post-increment (*it++), as
>> we now load both current character (for return value) and next character
>> (to
>> check against EoF). In consequence we call both sgetc() and snextc(), in
>> contrast
>> to previous single sbumpc() call.
>>
>>         PR libstdc++/105580
>>
>> libstdc++-v3/ChangeLog:
>>
>>         * include/bits/streambuf_iterator.h
>> (istreambuf_iterator::_M_sbuf):
>>         Remove mutable and adjust whitespace.
>>         (istreambuf_iterator::_M_c): Adjust whitespace.
>>         (istreambuf_iterator::operator++()): Clear _M_sbuf if next
>> character
>>         is EoF.
>>         (istreambuf_iterator::operator++(int)): Use _M_sbuf->sgetc() to
>>         load current character, and define in terms of __*this.
>>         (istreambuf_iterator::_M_get()): Do not clear _M_sbuf in case of
>> EoF.
>>         * testsuite/24_iterators/istreambuf_iterator/2.cc: Test for using
>>         multiple iterators to same rdbuf.
>> ---
>> This warning turned out to be false-positive (as explained in the commit
>> decription), so we can just add add pragrams to address that. But this
>> change seem to be beneficial anyway, as we are no longer causing
>> data-races.
>>
>> Tested on x86_64-linux. Locally tested with -Wnull-dereference,
>> and the istream_iterator warning in 24_iterators/istreambuf_iterator/2.cc
>> disappears (there are some other warning present I haven't looked into
>> that yet).
>>
>>  .../include/bits/streambuf_iterator.h         | 20 +++----
>>  .../24_iterators/istreambuf_iterator/2.cc     | 58 +++++++++++++++++++
>>  2 files changed, 66 insertions(+), 12 deletions(-)
>>
>> diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h
>> b/libstdc++-v3/include/bits/streambuf_iterator.h
>> index 93d3dd24f93..095928ca4d8 100644
>> --- a/libstdc++-v3/include/bits/streambuf_iterator.h
>> +++ b/libstdc++-v3/include/bits/streambuf_iterator.h
>> @@ -112,8 +112,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>        // the "end of stream" iterator value.
>>        // NB: This implementation assumes the "end of stream" value
>>        // is EOF, or -1.
>> -      mutable streambuf_type*  _M_sbuf;
>> -      int_type                 _M_c;
>> +      streambuf_type* _M_sbuf;
>> +      int_type        _M_c;
>>
>>      public:
>>        ///  Construct end of input stream iterator.
>> @@ -172,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>
>> _M_message(__gnu_debug::__msg_inc_istreambuf)
>>                                 ._M_iterator(*this));
>>
>> -       _M_sbuf->sbumpc();
>> +       if (_S_is_eof(_M_sbuf->snextc()))
>> +         _M_sbuf = 0;
>>         _M_c = traits_type::eof();
>>         return *this;
>>        }
>> @@ -181,14 +182,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>        istreambuf_iterator
>>        operator++(int)
>>        {
>> -       __glibcxx_requires_cond(_M_sbuf &&
>> -                               (!_S_is_eof(_M_c) ||
>> !_S_is_eof(_M_sbuf->sgetc())),
>> -
>>  _M_message(__gnu_debug::__msg_inc_istreambuf)
>> -                               ._M_iterator(*this));
>> -
>>         istreambuf_iterator __old = *this;
>> -       __old._M_c = _M_sbuf->sbumpc();
>> -       _M_c = traits_type::eof();
>> +       __old._M_c = _M_sbuf->sgetc();
>> +       ++*this;
>>         return __old;
>>        }
>>
>> @@ -206,8 +202,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>        _M_get() const
>>        {
>>         int_type __ret = _M_c;
>> -       if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret =
>> _M_sbuf->sgetc()))
>> -         _M_sbuf = 0;
>> +       if (_M_sbuf && _S_is_eof(__ret))
>> +         __ret = _M_sbuf->sgetc();
>>
> The benefit seem to be result of eliminating the if condition here, as we
are now having
one less eof check per loop, previously we got:
O: 4: 2x2 in _M_get per operator* and equal
N: 3: 2x1 in _M_get per operator* and equal, and one in operator++

>         return __ret;
>>        }
>>
>> diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> index 78701d71cee..0318a677f35 100644
>> --- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> +++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>> @@ -109,8 +109,66 @@ void test02(void)
>>      }
>>  }
>>
>> +void
>> +test_empty()
>> +{
>> +  std::istreambuf_iterator<char> null(0), end;
>> +  VERIFY( null == end );
>> +
>> +  std::istringstream ess;
>> +  // Thie specification hear seem to indicate, that such iterators
>> +  // are not end-of-stream iterators, as rdbuf pointer is nullptr,
>> +  // however we treat them as such, as otherwise we would produce
>> +  // a range containing single eof character.
>> +  std::istreambuf_iterator<char> it1(ess), it2(ess);
>> +  VERIFY( it1 == end );
>> +  VERIFY( it2 == end );
>> +}
>> +
>> +void
>> +test_multi()
>> +{
>> +  // C++98 (and later) define operator* in [istreambuf.iterator.ops] as:
>> +  // Returns: The character obtained via the streambuf member
>> sbuf_->sgetc().
>> +  // This nails down behavior of mulptiple iterators to same sequence.
>> +  std::istringstream iss("abcd");
>> +  std::istreambuf_iterator<char> it1(iss), it2(iss), end;
>> +
>> +  VERIFY( it1 != end );
>> +  VERIFY( it2 != end );
>> +  VERIFY( *it1 == 'a' );
>> +  VERIFY( *it2 == 'a' );
>> +  ++it1;
>> +
>> +  VERIFY( it1 != end );
>> +  VERIFY( it2 != end );
>> +  VERIFY( *it1 == 'b' );
>> +  VERIFY( *it2 == 'b' );
>> +  ++it2;
>> +
>> +  VERIFY( it1 != end );
>> +  VERIFY( it2 != end );
>> +  VERIFY( *it1 == 'c' );
>> +  VERIFY( *it2 == 'c' );
>> +  ++it2;
>> +
>> +  VERIFY( it1 != end );
>> +  VERIFY( it2 != end );
>> +  VERIFY( *it1 == 'd' );
>> +  VERIFY( *it2 == 'd' );
>> +  // second dereference
>> +  VERIFY( *it1 == 'd' );
>> +  VERIFY( *it2 == 'd' );
>> +  ++it1;
>> +
>> +  VERIFY( it1 == end );
>> +  VERIFY( it2 == end );
>> +}
>> +
>>  int main()
>>  {
>>    test02();
>> +  test_empty();
>> +  test_multi();
>>    return 0;
>>  }
>> --
>> 2.52.0
>>
>>

Reply via email to