On 06/08/20 14:14 +0100, Jonathan Wakely wrote:
On 05/08/20 16:31 -0600, Martin Sebor via Libstdc++ wrote:
On 8/5/20 3:25 PM, Jonathan Wakely wrote:
P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads
that have high risk of buffer overflows. They were replaced by
equivalents that only accept a reference to an array, and so can
guarantee not to write past the end of the array.

In order to support both the old and new functionality, this patch
introduces a new overloaded __istream_extract function which takes a
maximum length. The new operator>> overloads use the array size as the
maximum length. The old overloads now use __builtin_object_size to
determine the available buffer size if available (which requires -O2) or
use numeric_limits<streamsize>::max()/sizeof(char_type) otherwise. This
is a change in behaviour, as the old overloads previously always used
numeric_limits<streamsize>::max(), without considering sizeof(char_type)
and without attempting to prevent overflows.

Because they now do little more than call __istream_extract, the old
operator>> overloads are very small inline functions. This means there
is no advantage to explicitly instantiating them in the library (in fact
that would prevent the __builtin_object_size checks from ever working).
As a result, the explicit instantiation declarations can be removed from
the header. The explicit instantiation definitions are still needed, for
backwards compatibility with existing code that expects to link to the
definitions in the library.

While working on this change I noticed that src/c++11/istream-inst.cc
has the following explicit instantiation definition:
 template istream& operator>>(istream&, char*);
This had no effect (and so should not have been present in that file),
because there was an explicit specialization declared in <istream> and
defined in src/++98/istream.cc. However, this change removes the
explicit specialization, and now the explicit instantiation definition
is necessary to ensure the symbol gets defined in the library.

libstdc++-v3/ChangeLog:

        * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols.
        * include/bits/istream.tcc (__istream_extract): New function
        template implementing both of operator>>(istream&, char*) and
        operator>>(istream&, char(&)[N]). Add explicit instantiation
        declaration for it. Remove explicit instantiation declarations
        for old function templates.
        * include/std/istream (__istream_extract): Declare.
        (operator>>(basic_istream<C,T>&, C*)): Define inline and simply
        call __istream_extract.
        (operator>>(basic_istream<char,T>&, signed char*)): Likewise.
        (operator>>(basic_istream<char,T>&, unsigned char*)): Likewise.
        (operator>>(basic_istream<C,T>&, C(7)[N])): Define for LWG 2499.
        (operator>>(basic_istream<char,T>&, signed char(&)[N])):
        Likewise.
        (operator>>(basic_istream<char,T>&, unsigned char(&)[N])):
        Likewise.
        * include/std/streambuf (basic_streambuf): Declare char overload
        of __istream_extract as a friend.
        * src/c++11/istream-inst.cc: Add explicit instantiation
        definition for wchar_t overload of __istream_extract. Remove
        explicit instantiation definitions of old operator>> overloads
        for versioned-namespace build.
        * src/c++98/istream.cc (operator>>(istream&, char*)): Replace
        with __istream_extract(istream&, char*, streamsize).
        * testsuite/27_io/basic_istream/extractors_character/char/3.cc:
        Do not use variable-length array.
        * testsuite/27_io/basic_istream/extractors_character/char/4.cc:
        Do not run test for C++20.
        * testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc:
        Do not test writing to pointers for C++20.
        * testsuite/27_io/basic_istream/extractors_character/char/9826.cc:
        Use array instead of pointer.
        * testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc:
        Do not use variable-length array.
        * testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc:
        Do not run test for C++20.
        * testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc:
        Do not test writing to pointers for C++20.
        * testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc:
        New test.
        * 
testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
        New test.
        * testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
        New test.
        * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
        New test.
        * 
testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
        New test.

Tested powerpc64le-linux. Committed to trunk.

Martin, Jakub, could you please double-check the usage of
__builtin_object_size? (line 805 in libstdc++-v3/include/std/istream)
Do you see any problems with using it here? If it can't tell the size
then we just assume it's larger than the string to be extracted, which
is what the old code did anyway. I do have an idea for stronger
checking in debug mode, which I'll post in a minute.

I've always found the second argument to __builtin_object_size
confusing for types above 1.  I don't see anything wrong in
the diff but I believe the most useful results are with type 1
for string functions and type 0 for raw memory functions like
memcpy (that's what _FORTIFY_SOURCE uses for the two sets of
functions).  In type 2 when the result is zero it means one of
two things: either the size of the array couldn't be determined
or it really is zero.  That's less than helpful in cases like:

char a[8];
strcpy (a + 8, s);

where it prevents detecting the buffer overflow.

I would suggest to use type 1 in the iostream functions instead.

Adding attribute access to the __istream_extract overloads should
also let GCC check for out-of-bounds accesses by the function and
issue warnings.  This goes beyond what __builtin_object_size makes
possible (it considers also variable sizes in known ranges).

I tried the attached patch with a testcase like:

#include <istream>

void
test01(std::istream& in)
{
 char pc[1];
 in >> (pc + 1); // { dg-warning "writing 1 byte into a region of size 0" }
}

I get a -Wstringop-overflow warning with -O0, but not at -O2. Is that
because the call to operator>> gets inlined, and __builtin_object_size
returns 0, and we call __istream_extract(in, s, 0) which says we won't
write anything, so it's now considered safe?

That's unfortunate, because actually __istream_extract will always
write at least one byte as a null terminator.

So I tried various ways toi try and get the warning back at -O2 and
nothing worked e.g. when __builtin_object_size(__s, 0) returns 0 I
tried calling __istream_extract(__in, __s, 1) which should say I'm
going to write a character, and so should warn if the size is known to
be zero. But no wanring.

So now I'm considering this:

 template<typename _CharT, typename _Traits>
   __attribute__((__nonnull__(2), __access__(__write_only__, 2)))
   inline basic_istream<_CharT, _Traits>&
   operator>>(basic_istream<_CharT, _Traits>& __in, _CharT* __s)
   {
     size_t __n = __builtin_object_size(__s, 0);
     if (__builtin_expect(__n < sizeof(_CharT), false))
        {
          // not even space for null terminator
          __glibcxx_assert(__n >= sizeof(_CharT));
          __in.width(0);
          __in.setstate(ios_base::failbit);
        }
     else
        {
          if (__n == (size_t)-1)
            __n = __gnu_cxx::__numeric_traits<streamsize>::__max;
          std::__istream_extract(__in, __s, __n / sizeof(_CharT));
        }
     return __in;
   }

This will give a -Wstringop-overflow warning at -O0 and then overflow
the buffer, with undefined behaviour. And it will give no warning but
avoid the overflow when optimising. This isn't my preferred outcome,
I'd prefer to always get a warning, *and* be able to avoid the
overflow when optimising and the size is known.

The version above is what I've pushed (also attached).

Tested powerpc64le-linux.



commit 6251ea15f55ec57d6325c2e37e88b22315aba658
Author: Jonathan Wakely <jwak...@redhat.com>
Date:   Thu Aug 6 16:16:33 2020

    libstdc++: Adjust overflow prevention to operator>>
    
    This adjusts the overflow prevention added to operator>> so that we can
    distinguish "unknown size" from "zero size", and avoid writing anything
    at all in to zero sized buffers.
    
    This also removes the incorrect comment saying extraction stops at a
    null byte.
    
    libstdc++-v3/ChangeLog:
    
            * include/std/istream (operator>>(istream&, char*)): Add
            attributes to get warnings for pointers that are null or known
            to point to the end of a buffer. Request upper bound from
            __builtin_object_size check and handle zero-sized buffer case.
            (operator>>(istream&, signed char))
            (operator>>(istream&, unsigned char*)): Add attributes.
            * testsuite/27_io/basic_istream/extractors_character/char/overflow.cc:
            Check extracting into the middle of a buffer.
            * testsuite/27_io/basic_istream/extractors_character/wchar_t/overflow.cc: New test.

diff --git a/libstdc++-v3/include/std/istream b/libstdc++-v3/include/std/istream
index cb8e9f87c90..20a455a0ef1 100644
--- a/libstdc++-v3/include/std/istream
+++ b/libstdc++-v3/include/std/istream
@@ -790,7 +790,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    *  - `n - 1` characters are stored
    *  - EOF is reached
    *  - the next character is whitespace according to the current locale
-   *  - the next character is a null byte (i.e., `charT()`)
    *
    *  `width(0)` is then called for the input stream.
    *
@@ -799,25 +798,38 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus <= 201703L
   template<typename _CharT, typename _Traits>
+    __attribute__((__nonnull__(2), __access__(__write_only__, 2)))
     inline basic_istream<_CharT, _Traits>&
     operator>>(basic_istream<_CharT, _Traits>& __in, _CharT* __s)
     {
-      streamsize __n = __builtin_object_size(__s, 2) / sizeof(_CharT);
-      if (__n == 0)
-	__n = __gnu_cxx::__numeric_traits<streamsize>::__max / sizeof(_CharT);
-      std::__istream_extract(__in, __s, __n);
+      size_t __n = __builtin_object_size(__s, 0);
+      if (__builtin_expect(__n < sizeof(_CharT), false))
+	{
+	  // There is not even space for the required null terminator.
+	  __glibcxx_assert(__n >= sizeof(_CharT));
+	  __in.width(0);
+	  __in.setstate(ios_base::failbit);
+	}
+      else
+	{
+	  if (__n == (size_t)-1)
+	    __n = __gnu_cxx::__numeric_traits<streamsize>::__max;
+	  std::__istream_extract(__in, __s, __n / sizeof(_CharT));
+	}
       return __in;
     }
 
   template<class _Traits>
+    __attribute__((__nonnull__(2), __access__(__write_only__, 2)))
     inline basic_istream<char, _Traits>&
     operator>>(basic_istream<char, _Traits>& __in, unsigned char* __s)
-    { return (__in >> reinterpret_cast<char*>(__s)); }
+    { return __in >> reinterpret_cast<char*>(__s); }
 
   template<class _Traits>
+    __attribute__((__nonnull__(2), __access__(__write_only__, 2)))
     inline basic_istream<char, _Traits>&
     operator>>(basic_istream<char, _Traits>& __in, signed char* __s)
-    { return (__in >> reinterpret_cast<char*>(__s)); }
+    { return __in >> reinterpret_cast<char*>(__s); }
 #else
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2499. operator>>(istream&, char*) makes it hard to avoid buffer overflows
diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/char/overflow.cc b/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/char/overflow.cc
index 1141a41b208..abbba8bb09b 100644
--- a/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/char/overflow.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/char/overflow.cc
@@ -15,14 +15,14 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-options "-O2 -std=gnu++98" }
+// { dg-options "-O2" }
 // { dg-do run }
 
 // This test checks that operator>> will avoid a buffer overflow when
 // reading into a buffer with a size that is known at compile time.
 
 // Since C++20 this is guaranteed (see LWG 2499), for previous standards
-// we try to check the buffer size as an extension (which depends on -O2).
+// checking the buffer size is an extension and depends on optimisation.
 
 #include <sstream>
 #include <testsuite_hooks.h>
@@ -30,11 +30,24 @@
 void
 test01()
 {
-  std::istringstream in("foolish child");
+  std::istringstream in("foolishly");
   char pc[5];
   in >> pc;
   VERIFY( in.good() );
   VERIFY( std::string(pc) == "fool" );
+
+#if __cplusplus <= 201703L
+  char* p = pc + 1;
+  in >> p;
+  VERIFY( in.good() );
+  VERIFY( std::string(pc) == "fish" );
+
+  p = pc + 4;
+  *p = '#';
+  in >> p;
+  VERIFY( in.fail() ); // if no characters are extracted, failbit is set
+  VERIFY( *p == '\0' );
+#endif
 }
 
 void
@@ -61,4 +74,6 @@ int
 main()
 {
   test01();
+  test02();
+  test03();
 }
diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/wchar_t/overflow.cc b/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/wchar_t/overflow.cc
new file mode 100644
index 00000000000..6a23f1305a3
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/extractors_character/wchar_t/overflow.cc
@@ -0,0 +1,57 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-O2" }
+// { dg-do run }
+
+// This test checks that operator>> will avoid a buffer overflow when
+// reading into a buffer with a size that is known at compile time.
+
+// Since C++20 this is guaranteed (see LWG 2499), for previous standards
+// checking the buffer size is an extension and depends on optimisation.
+
+#include <sstream>
+#include <testsuite_hooks.h>
+
+void
+test01()
+{
+  std::wistringstream in(L"foolishly");
+  wchar_t pc[5];
+  in >> pc;
+  VERIFY( in.good() );
+  VERIFY( std::wstring(pc) == L"fool" );
+
+#if __cplusplus <= 201703L
+  wchar_t* p = pc + 1;
+  in >> p;
+  VERIFY( in.good() );
+  VERIFY( std::wstring(pc) == L"fish" );
+
+  p = pc + 4;
+  *p = L'#';
+  in >> p;
+  VERIFY( in.fail() ); // if no characters are extracted, failbit is set
+  VERIFY( *p == L'\0' );
+#endif
+}
+
+int
+main()
+{
+  test01();
+}

Reply via email to