On Thu, Jan 8, 2026 at 11:52 AM Tomasz Kaminski <[email protected]> wrote:

>
>
> On Thu, Jan 8, 2026 at 10:55 AM Tomasz Kaminski <[email protected]>
> wrote:
>
>>
>>
>> On Thu, Jan 8, 2026 at 10:38 AM Tomasz Kaminski <[email protected]>
>> wrote:
>>
>>> I have performed some microbenchmarking, on following example:
>>> * bench_empty: input is empty istreamstream, creates iterator and
>>> compares against end
>>> * bench_stream_create: only resetting stringstream with new content
>>> * bench_preinc: for loop over using pre increment
>>> * bench_postinc: for loop over using post increment
>>> * bench_deref_posting: while with *it++
>>>
>>> This are the result I have observed, for the changes with the patch:
>>> --------------------------------------------------------------
>>> Benchmark                    Time             CPU   Iterations
>>> --------------------------------------------------------------
>>> bench_empty               2.10 ns         2.09 ns    337921434
>>> bench_stream_create       10.6 ns         10.6 ns     65736840
>>> bench_preinc               419 ns          419 ns      1662732
>>> bench_postinc              436 ns          435 ns      1609828
>>> bench_deref_postinc        306 ns          305 ns      22866
>>>
>>> And the result before the patch:
>>> --------------------------------------------------------------
>>> Benchmark                    Time             CPU   Iterations
>>> --------------------------------------------------------------
>>> bench_empty               2.27 ns         2.26 ns    314944156
>>> bench_stream_create       11.7 ns         11.6 ns     60679957
>>> bench_preinc               873 ns          871 ns       807352
>>> bench_postinc              920 ns          917 ns       751565
>>> bench_deref_postinc        706 ns          704 ns      1038942
>>>
>>
>>> For sanity, I have also check the result on g++ 15.2 that ships with my
>>> system:
>>> --------------------------------------------------------------
>>> Benchmark                    Time             CPU   Iterations
>>> --------------------------------------------------------------
>>> bench_empty               2.30 ns         2.30 ns    303070664
>>> bench_stream_create       10.6 ns         10.6 ns     65778662
>>> bench_preinc               767 ns          765 ns       915085
>>> bench_postinc              767 ns          764 ns       913985
>>> bench_deref_postinc        675 ns          673 ns      1107772
>>>
>> The times reported when using __proxy as return type for post-increment,
without providing conversion
to istream_iterator:
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
bench_empty               1.80 ns         1.79 ns    398182529
bench_stream_create       10.6 ns         10.6 ns     65450500
bench_preinc               314 ns          314 ns      2231714
bench_postinc              360 ns          359 ns      2029998
bench_deref_postinc        315 ns          313 ns      2235176


>
>>> So it looks like eliminating save to member from both equal and
>>> operator* makes implementation
>>> two times faster. I will look into that.
>>>
>>> Attaching also my benchmark file.
>>>
>>> Regards,
>>> Tomasz
>>>
>>> On Wed, Jan 7, 2026 at 3:41 PM Tomasz Kamiński <[email protected]>
>>> wrote:
>>>
>>>> The warning was produced by following seqeunce, given an
>>>> istream_iterator<char>
>>>> it, such that *it will result in hitting EoF in it->_M_get(), and thus
>>>> clearing
>>>> _M_sbuf, the subsequent call to ++it, will result in _M_sbuf->sbumpc()
>>>> call
>>>> on null pointer dereference. This is however an false-positive, as in
>>>> such sitution
>>>> it == istream_iteator() returns true, and the iterator should not be
>>>> dereferenced
>>>> in first place.
>>>>
>>>> This patch addresses the above by clearing the _M_sbuf in operator++,
>>>> instead of
>>>> _M_get(). This removes the need for making _M_sbuf mutable, and thus
>>>> make the
>>>> implementation conforming with regards to C++11 [res.on.data.races] p3.
>>>>
>>>> This change should not or positive have performance impact on the usual
>>>> iteration
>>>> patterns, in form:
>>>>  while (it != end) { process(*it); ++it; }
>>>> In case when it is end-of-stream iterator, the it != end returns in one
>>>> call of
>>>> _M_sbuf->sgetc() both before and after the change. However we do not
>>>> modify _M_sbuf
>>>> in this case. For non-empty range, we replace call to _M_sbuf->sbumpc()
>>>> with
>>>> _M_sbuf->snextc() in pre-increment, and extract the check again from
>>>> *it to
>>>> ++it. However, as _M_sbuf is now cleared during increment, so last it
>>>> != end
>>>> check avoids _M_sbuf->sgetc() call to check against EoF.
>>>>
>>>> However, this change impact the behavior of the post-increment (*it++),
>>>> as
>>>> we now load both current character (for return value) and next
>>>> character (to
>>>> check against EoF). In consequence we call both sgetc() and snextc(),
>>>> in contrast
>>>> to previous single sbumpc() call.
>>>>
>>>>         PR libstdc++/105580
>>>>
>>>> libstdc++-v3/ChangeLog:
>>>>
>>>>         * include/bits/streambuf_iterator.h
>>>> (istreambuf_iterator::_M_sbuf):
>>>>         Remove mutable and adjust whitespace.
>>>>         (istreambuf_iterator::_M_c): Adjust whitespace.
>>>>         (istreambuf_iterator::operator++()): Clear _M_sbuf if next
>>>> character
>>>>         is EoF.
>>>>         (istreambuf_iterator::operator++(int)): Use _M_sbuf->sgetc() to
>>>>         load current character, and define in terms of __*this.
>>>>         (istreambuf_iterator::_M_get()): Do not clear _M_sbuf in case
>>>> of EoF.
>>>>         * testsuite/24_iterators/istreambuf_iterator/2.cc: Test for
>>>> using
>>>>         multiple iterators to same rdbuf.
>>>> ---
>>>> This warning turned out to be false-positive (as explained in the commit
>>>> decription), so we can just add add pragrams to address that. But this
>>>> change seem to be beneficial anyway, as we are no longer causing
>>>> data-races.
>>>>
>>>> Tested on x86_64-linux. Locally tested with -Wnull-dereference,
>>>> and the istream_iterator warning in
>>>> 24_iterators/istreambuf_iterator/2.cc
>>>> disappears (there are some other warning present I haven't looked into
>>>> that yet).
>>>>
>>>>  .../include/bits/streambuf_iterator.h         | 20 +++----
>>>>  .../24_iterators/istreambuf_iterator/2.cc     | 58 +++++++++++++++++++
>>>>  2 files changed, 66 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h
>>>> b/libstdc++-v3/include/bits/streambuf_iterator.h
>>>> index 93d3dd24f93..095928ca4d8 100644
>>>> --- a/libstdc++-v3/include/bits/streambuf_iterator.h
>>>> +++ b/libstdc++-v3/include/bits/streambuf_iterator.h
>>>> @@ -112,8 +112,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>>>        // the "end of stream" iterator value.
>>>>        // NB: This implementation assumes the "end of stream" value
>>>>        // is EOF, or -1.
>>>> -      mutable streambuf_type*  _M_sbuf;
>>>> -      int_type                 _M_c;
>>>> +      streambuf_type* _M_sbuf;
>>>> +      int_type        _M_c;
>>>>
>>>>      public:
>>>>        ///  Construct end of input stream iterator.
>>>> @@ -172,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>>>
>>>> _M_message(__gnu_debug::__msg_inc_istreambuf)
>>>>                                 ._M_iterator(*this));
>>>>
>>>> -       _M_sbuf->sbumpc();
>>>> +       if (_S_is_eof(_M_sbuf->snextc()))
>>>> +         _M_sbuf = 0;
>>>>         _M_c = traits_type::eof();
>>>>         return *this;
>>>>        }
>>>> @@ -181,14 +182,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>>>        istreambuf_iterator
>>>>        operator++(int)
>>>>        {
>>>> -       __glibcxx_requires_cond(_M_sbuf &&
>>>> -                               (!_S_is_eof(_M_c) ||
>>>> !_S_is_eof(_M_sbuf->sgetc())),
>>>> -
>>>>  _M_message(__gnu_debug::__msg_inc_istreambuf)
>>>> -                               ._M_iterator(*this));
>>>> -
>>>>         istreambuf_iterator __old = *this;
>>>> -       __old._M_c = _M_sbuf->sbumpc();
>>>> -       _M_c = traits_type::eof();
>>>> +       __old._M_c = _M_sbuf->sgetc();
>>>> +       ++*this;
>>>>         return __old;
>>>>        }
>>>>
>>>> @@ -206,8 +202,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>>>        _M_get() const
>>>>        {
>>>>         int_type __ret = _M_c;
>>>> -       if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret =
>>>> _M_sbuf->sgetc()))
>>>> -         _M_sbuf = 0;
>>>> +       if (_M_sbuf && _S_is_eof(__ret))
>>>> +         __ret = _M_sbuf->sgetc();
>>>>
>>> The benefit seem to be result of eliminating the if condition here, as
>> we are now having
>> one less eof check per loop, previously we got:
>> O: 4: 2x2 in _M_get per operator* and equal
>> N: 3: 2x1 in _M_get per operator* and equal, and one in operator++
>>
> We could also eliminate the  _S_is_eof(__ret) check, by returning __proxy
> instead of istream_iterator
> for pre-increment, and eliminate uses of _M_c completely. This will
> however break already non-portable
> code that uses:
>   istream_iteartor i;
>   istream_iteartor i2 = i++;
> We cannot provide the conversion operator, as this would require
> reintroducing S_is_eof check.
>
> I have prototyped that, and there is a measurable benefit from doing it,
> but I think something that we should
> do in stage 1 for GCC 17.
>
>         return __ret;
>>>>        }
>>>>
>>>> diff --git
>>>> a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>>>> b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>>>> index 78701d71cee..0318a677f35 100644
>>>> --- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>>>> +++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
>>>> @@ -109,8 +109,66 @@ void test02(void)
>>>>      }
>>>>  }
>>>>
>>>> +void
>>>> +test_empty()
>>>> +{
>>>> +  std::istreambuf_iterator<char> null(0), end;
>>>> +  VERIFY( null == end );
>>>> +
>>>> +  std::istringstream ess;
>>>> +  // Thie specification hear seem to indicate, that such iterators
>>>> +  // are not end-of-stream iterators, as rdbuf pointer is nullptr,
>>>> +  // however we treat them as such, as otherwise we would produce
>>>> +  // a range containing single eof character.
>>>> +  std::istreambuf_iterator<char> it1(ess), it2(ess);
>>>> +  VERIFY( it1 == end );
>>>> +  VERIFY( it2 == end );
>>>> +}
>>>> +
>>>> +void
>>>> +test_multi()
>>>> +{
>>>> +  // C++98 (and later) define operator* in [istreambuf.iterator.ops]
>>>> as:
>>>> +  // Returns: The character obtained via the streambuf member
>>>> sbuf_->sgetc().
>>>> +  // This nails down behavior of mulptiple iterators to same sequence.
>>>> +  std::istringstream iss("abcd");
>>>> +  std::istreambuf_iterator<char> it1(iss), it2(iss), end;
>>>> +
>>>> +  VERIFY( it1 != end );
>>>> +  VERIFY( it2 != end );
>>>> +  VERIFY( *it1 == 'a' );
>>>> +  VERIFY( *it2 == 'a' );
>>>> +  ++it1;
>>>> +
>>>> +  VERIFY( it1 != end );
>>>> +  VERIFY( it2 != end );
>>>> +  VERIFY( *it1 == 'b' );
>>>> +  VERIFY( *it2 == 'b' );
>>>> +  ++it2;
>>>> +
>>>> +  VERIFY( it1 != end );
>>>> +  VERIFY( it2 != end );
>>>> +  VERIFY( *it1 == 'c' );
>>>> +  VERIFY( *it2 == 'c' );
>>>> +  ++it2;
>>>> +
>>>> +  VERIFY( it1 != end );
>>>> +  VERIFY( it2 != end );
>>>> +  VERIFY( *it1 == 'd' );
>>>> +  VERIFY( *it2 == 'd' );
>>>> +  // second dereference
>>>> +  VERIFY( *it1 == 'd' );
>>>> +  VERIFY( *it2 == 'd' );
>>>> +  ++it1;
>>>> +
>>>> +  VERIFY( it1 == end );
>>>> +  VERIFY( it2 == end );
>>>> +}
>>>> +
>>>>  int main()
>>>>  {
>>>>    test02();
>>>> +  test_empty();
>>>> +  test_multi();
>>>>    return 0;
>>>>  }
>>>> --
>>>> 2.52.0
>>>>
>>>>

Reply via email to