[ https://issues.apache.org/jira/browse/STDCXX-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Sebor updated STDCXX-239:
--------------------------------
    Severity: Usability
    Affects Version/s: 4.2.0
                       4.2.1
    Remaining Estimate: 16h
    Original Estimate: 16h

Added 4.2 to the list of affected versions and estimated effort.

> std::num_get::do_get() cannot parse nan, infinity
> -------------------------------------------------
>
>                 Key: STDCXX-239
>                 URL: https://issues.apache.org/jira/browse/STDCXX-239
>             Project: C++ Standard Library
>          Issue Type: New Feature
>          Components: 22. Localization
>    Affects Versions: 4.1.2, 4.1.3, 4.1.4, 4.2.0, 4.2.1
>         Environment: all
>            Reporter: Martin Sebor
>             Fix For: 4.3
>
>   Original Estimate: 16h
>  Remaining Estimate: 16h
>
> Moved from the Rogue Wave bug tracking database:
> ****Created By: sebor @ Apr 04, 2000 07:13:59 PM****
> The num_get<> facet's do_get() members fail to take the special strings
> [-]inf[inity] and [-]nan into account. The facet reports an error when it
> encounters such strings. See 7.19.6.1 and 7.19.6.2 of C99 for a list of
> allowed strings.
> The fix for this will not be trivial due to the messy implementation of the
> facets. It might be easier just to rewrite them from scratch.
> The testcase below demonstrates the incorrect behavior. Modified test case
> added as tests/regress/src/test_issue22564.cpp - see p4 describe 22408.
> $ g++ ... test.cpp
> $ a.out 0 1 inf infinity nan INF INFINITY NAN
> sscanf("0", "%lf") --> 0.000000
> num_get<>::do_get("0", ...) --> 0.000000
> sscanf("1", "%lf") --> 1.000000
> num_get<>::do_get("1", ...) --> 1.000000
> sscanf("inf", "%lf") --> inf
> num_get<>::do_get("inf", ...) --> error
> sscanf("infinity", "%lf") --> inf
> num_get<>::do_get("infinity", ...) --> error
> sscanf("nan", "%lf") --> nan
> num_get<>::do_get("nan", ...) --> error
> sscanf("INF", "%lf") --> inf
> num_get<>::do_get("INF", ...) --> error
> sscanf("INFINITY", "%lf") --> inf
> num_get<>::do_get("INFINITY", ...) --> error
> sscanf("NAN", "%lf") --> nan
> num_get<>::do_get("NAN", ...) --> error
> $ cat test.cpp
> #include <iostream>
> #include <locale>
> #include <stdio.h>
> #include <string.h>
> using namespace std;
> int main (int argc, const char *argv[])
> {
>     num_get<char, const char*> nget;
>     for (int i = 1; i != argc; ++i) {
>         double x = 0, y = 0;
>         ios::iostate err = ios::goodbit;
>         nget.get (argv [i], argv [i] + strlen (argv [i]), cin, err, x);
>         if (1 != sscanf (argv [i], "%lf", &y))
>             printf ("sscanf(\"%s\", \"%%lf\") --> error\n", argv [i]);
>         else
>             printf ("sscanf(\"%s\", \"%%lf\") --> %f\n", argv [i], y);
>         if ((ios::failbit | ios::badbit) & err)
>             printf ("num_get<>::do_get(\"%s\", ...) --> error\n", argv [i]);
>         else
>             printf ("num_get<>::do_get(\"%s\", ...) --> %f\n", argv [i], x);
>     }
> }
> ****Modified By: sebor @ Apr 09, 2000 09:31:49 PM****
> Fixed with p4 describe 22544. Test case fixed with p4 describe 22545. Closed.
> ****Modified By: leroy @ Mar 30, 2001 03:09:11 PM****
> Change 22544 by [EMAIL PROTECTED] on 2000/04/09 20:30:50
>         Added support for inf[inity] and nan[(n-char-sequence)] as described
>         in 7.19.6.1, p8 of C99.
>         nan(n-char-sequence) currently treated the same as nan due to poor
>         implementation of std::num_get<> and supporting classes - fix requires
>         at least a partial rewrite of the facet.
>         Resolves Onyx #22564 (and the duplicate #22601).
> Affected files ...
> ... //stdlib2/dev/source/src/include/rw/numbrw#17 edit
> ... //stdlib2/dev/source/src/include/rw/numbrw.cc#12 edit
> ... //stdlib2/dev/source/vendor.cpp#17 edit
> ****Modified By: sebor @ Apr 03, 2001 08:46:50 PM****
> It looks like this is actually not a bug and the fix is wrong (even as an
> extension). Here's some background...
> Subject: Is this a permissible extension?
> Date: Thu, 8 Feb 2001 18:16:18 -0500 (EST)
> From: Andrew Koenig <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> To: C++ libraries mailing list
> Message c++std-lib-8281
> Suppose we execute
>         double x;
>         std::cin >> x;
> at a point where the input stream contains
>         NaN
> followed perhaps by other characters.
> One might plausibly expect an implementation to set x to NaN
> on an implementation that supports IEEE floating-point.
> Surely the standard cannot mandate such behavior, because not
> every implementation knows what NaN is. However, on an implementation
> that does support NaN, is such behavior a permitted extension?
> My first attempt at an answer is no, because if I track through the
> standard, I find that the behavior of this statement is defined
> as being identical to the behavior of strtod in c89, and that behavior
> requires at least one digit in the input in order for the input to
> be valid. However, I might have missed something. Have I?
> ****Modified By: sebor @ Apr 03, 2001 08:48:03 PM****
> Subject: Re: Is this a permissible extension?
> Date: Fri, 09 Feb 2001 09:28:25 -0800
> From: Matt Austern <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> Organization: AT&T Labs - Research
> References: 1 , 2
> To: C++ libraries mailing list
> Message c++std-lib-8284
> Andrew Koenig wrote:
> > Fred> In "C" locale, only decimal floating-point constants are valid.
> > Fred> So, no NaN nor Infinity is allowed.
> >
> > Yes -- I was talking about the default locale.
> Actually, I think that strtod isn't the important part, at least for
> discussing C++. I think that this is an illegal extension in all
> named locales.
> First, let me explain why I said *named* locales. If you construct
> a locale with locale("foo"), the way it works is that the locale is
> built up out of _byname facets instead of base class facets. Except
> that not all facets have _byname derived classes, so in some cases
> you've still got the default behavior from the facet base class.
> One of the facets that has no _byname variant is num_get<>. So if I
> can construct an argument that the documented behavior of num_get<>
> precludes this extension, I have also proved that this extension is
> impossible in any named locale. This argument does not apply to
> arbitrary locales, since an arbitrary locale may replace any base
> class facet with a facet that inherits from it.
> OK, now the argument I promised, saying that num_get<> can't recognize
> the character string "NaN".
> 22.2.2.1.2, paragraph 2: num_get's overloaded conversion function,
> num_get::do_get(), works in three stages.
> (1) It determines conversion specifiers. We're OK so far.
> (2) It accumulates characters from a provided input iterator.
> (3) It uses the conversion specifiers and the characters it has
>     accumulated to produce a number.
> Stage 2 is the crucial one. It's described in 22.2.2.1.2/8-10, in
> great detail.
> For each character,
> (a) We get it from a supplied input iterator.
> (b) We look it up in a lookup table whose contents are prescribed
>     by the standard. (This has to do with wide characters, but there
>     is no exception for the special case where you're reading narrow
>     characters.)
> (c) If a character is found in the lookup table, or if it's a decimal
>     point or a thousands sep, then it's checked to see if it can
>     legally appear in the number at that point. If so, we keep
>     accumulating characters.
> The characters in the lookup table are "0123456789abcdefABCDEF+-".
> Library issue 221 would amend that to "0123456789abcdefxABCDEFX+-".
> "N" isn't present in the lookup table, so stage 2 of num_get<>::do_get()
> is not permitted to read the character sequence "NaN".
> If you want to argue that num_get<>::do_get() is overspecified, I
> wouldn't disagree too violently.
> --Matt
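
The stage 2 argument in the quoted message can be illustrated with a small sketch. It is only an approximation, assuming C++11: the helper name `stage2_accumulated` is hypothetical, thousands separators are ignored, and the position check from step (c) is omitted, so the function only answers "how many leading characters could stage 2 even look at?". The atoms string is the one quoted in the message.

```cpp
#include <cstddef>
#include <string>

// Lookup table ("atoms") for stage 2, as quoted in the message above.
const std::string stage2_atoms = "0123456789abcdefABCDEF+-";

// Hypothetical helper, not part of any library: counts the leading
// characters of `input` that are either in the lookup table or equal to
// the decimal point. It deliberately skips the per-position validity
// check (step (c)) and thousands separators.
std::size_t stage2_accumulated(const std::string& input,
                               char decimal_point = '.') {
    std::size_t count = 0;
    for (char c : input) {
        const bool known =
            stage2_atoms.find(c) != std::string::npos || c == decimal_point;
        if (!known)
            break;          // stage 2 stops at the first unknown character
        ++count;
    }
    return count;
}
```

Under these assumptions `stage2_accumulated("NaN")`, `stage2_accumulated("nan")` and `stage2_accumulated("inf")` all return 0, because the first character of each is missing from the table, while `stage2_accumulated("1.5e3")` returns 5.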
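
The message notes that the argument does not cover arbitrary locales, because a locale may replace the base class facet with one derived from it. A minimal sketch of that escape hatch follows, assuming C++11. The class name `nan_inf_num_get` is hypothetical, only the `double` overload is handled, the matching is case-insensitive, and `nan(n-char-sequence)` is not supported. This is not the change described in the issue history, just one way a user-supplied facet could accept these strings.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <limits>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration only; not part of any library.
class nan_inf_num_get : public std::num_get<char> {
public:
    explicit nan_inf_num_get(std::size_t refs = 0)
        : std::num_get<char>(refs) {}

protected:
    using std::num_get<char>::do_get;   // keep the other overloads visible

    iter_type do_get(iter_type in, iter_type end, std::ios_base& io,
                     std::ios_base::iostate& err, double& val) const override {
        // Consume an optional sign ourselves so that "-inf" can be handled.
        bool negative = false;
        if (in != end && (*in == '+' || *in == '-')) {
            negative = (*in == '-');
            ++in;
            // Reject a second sign so that "--5" is not accepted.
            if (in != end && (*in == '+' || *in == '-')) {
                val = 0;
                err |= std::ios_base::failbit;
                return in;
            }
        }
        if (in != end && std::isalpha(static_cast<unsigned char>(*in))) {
            // Collect the alphabetic word and compare it in lower case.
            std::string word;
            while (in != end && std::isalpha(static_cast<unsigned char>(*in))) {
                word += static_cast<char>(
                    std::tolower(static_cast<unsigned char>(*in)));
                ++in;
            }
            if (word == "nan") {
                val = std::numeric_limits<double>::quiet_NaN();
            } else if (word == "inf" || word == "infinity") {
                val = std::numeric_limits<double>::infinity();
            } else {
                val = 0;
                err |= std::ios_base::failbit;
            }
            if (in == end)
                err |= std::ios_base::eofbit;
        } else {
            // Ordinary numbers: defer to the standard three-stage parser.
            in = std::num_get<char>::do_get(in, end, io, err, val);
        }
        if (negative)
            val = -val;
        return in;
    }
};
```

To try it, imbue a stream with `std::locale(std::locale::classic(), new nan_inf_num_get)` and extract a `double` with `operator>>`. Whether a library may do this by default is exactly what the thread disputes; an explicitly installed facet sidesteps that question.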